class URI
Overview
This class represents a URI reference as defined by RFC 3986: Uniform Resource Identifier (URI): Generic Syntax.
This class provides constructors for creating URI instances from their components or by parsing their string forms and methods for accessing the various components of an instance.
Basic example:
require "uri"
uri = URI.parse "http://foo.com/posts?id=30&limit=5#time=1305298413"
# => #<URI:0x1003f1e40 @scheme="http", @host="foo.com", @port=nil, @path="/posts", @query="id=30&limit=5", ... >
uri.scheme # => "http"
uri.host # => "foo.com"
uri.query # => "id=30&limit=5"
uri.to_s # => "http://foo.com/posts?id=30&limit=5#time=1305298413"
Resolution and Relativization
Resolution is the process of resolving one URI against another, the base URI.
The resulting URI is constructed from components of both URIs in the manner specified by RFC 3986 section 5.2, taking components from the base URI for those not specified in the original.
For hierarchical URIs, the path of the original is resolved against the path of the base and then normalized. See #resolve for examples.
Relativization is the inverse of resolution in that it produces a URI that resolves to the original when resolved against the base.
For normalized URIs, the following is true:
a.relativize(a.resolve(b)) # => b
a.resolve(a.relativize(b)) # => b
This operation is often useful when constructing a document containing URIs that must be made relative to the base URI of the document wherever possible.
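The round trip described above can be sketched with the same values that the #resolve and #relativize examples use. The variable names are illustrative only.

```crystal
require "uri"

base = URI.parse("http://foo.com/bar/baz")
relative = URI.parse("../quux")

# Resolve the relative reference against the base ...
resolved = base.resolve(relative)
resolved.to_s # => "http://foo.com/quux"

# ... then relativize the result to get the relative reference back.
round_trip = base.relativize(resolved)
round_trip.to_s # => "../quux"
```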
URL Encoding
This class provides a number of methods for encoding and decoding strings using URL Encoding (also known as Percent Encoding) as defined in RFC 3986 as well as x-www-form-urlencoded.
Each method has two variants: one returns a string, the other writes directly to an IO.
- .decode(string : String, *, plus_to_space : Bool = false) : String: Decodes a URL-encoded string.
- .decode(string : String, io : IO, *, plus_to_space : Bool = false) : Nil: Decodes a URL-encoded string to an IO.
- .encode(string : String, *, space_to_plus : Bool = false) : String: URL-encodes a string.
- .encode(string : String, io : IO, *, space_to_plus : Bool = false) : Nil: URL-encodes a string to an IO.
- .decode_www_form(string : String, *, plus_to_space : Bool = true) : String: Decodes an x-www-form-urlencoded string component.
- .decode_www_form(string : String, io : IO, *, plus_to_space : Bool = true) : Nil: Decodes an x-www-form-urlencoded string component to an IO.
- .encode_www_form(string : String, *, space_to_plus : Bool = true) : String: Encodes a string as an x-www-form-urlencoded component.
- .encode_www_form(string : String, io : IO, *, space_to_plus : Bool = true) : Nil: Encodes a string as an x-www-form-urlencoded component to an IO.
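As a minimal sketch of the two variants, the following uses an IO::Memory as the target IO. Any other IO would do; the variable names are illustrative.

```crystal
require "uri"

# String variant: returns a new String.
encoded = URI.encode("hello world!") # => "hello%20world!"

# IO variant: writes to the given IO and returns nil.
io = IO::Memory.new
URI.encode("hello world!", io)
written = io.to_s # => "hello%20world!"
```

The IO variant is useful when the encoded value is only one part of a larger output, since it avoids allocating an intermediate string.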
The main difference is that .encode_www_form encodes reserved characters (see .reserved?), while .encode does not. The decode methods are identical except for the handling of + characters.
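To make the difference concrete, this sketch runs the same input through both encoders. The input string is the one used in the method examples further down; the variable names are illustrative.

```crystal
require "uri"

input = "put: it+й"

# .encode leaves reserved characters such as ':' and '+' untouched.
plain = URI.encode(input) # => "put:%20it+%D0%B9"

# .encode_www_form escapes them and writes the space as '+'.
form = URI.encode_www_form(input) # => "put%3A+it%2B%D0%B9"

# Each decoder reverses its matching encoder.
URI.decode(plain)         # => "put: it+й"
URI.decode_www_form(form) # => "put: it+й"
```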
NOTE HTTP::Params provides a higher-level API for handling x-www-form-urlencoded serialized data.
Defined in:
uri/punycode.cr
uri/uri_parser.cr
uri/encoding.cr
uri.cr
Constructors
- .new(scheme = nil, host = nil, port = nil, path = "", query = nil, user = nil, password = nil, fragment = nil)
- .parse(raw_url : String) : URI
  Parses the given raw_url into a URI.
Class Method Summary
- .decode(string : String, io : IO, *, plus_to_space : Bool = false) : Nil
  URL-decodes a string and writes the result to io.
- .decode(string : String, *, plus_to_space : Bool = false) : String
  URL-decodes string.
- .decode(string : String, io : IO, *, plus_to_space : Bool = false, &) : Nil
  URL-decodes string and writes the result to io.
- .decode_www_form(string : String, io : IO, *, plus_to_space : Bool = true) : Nil
  URL-decodes string as x-www-form-urlencoded and writes the result to io.
- .decode_www_form(string : String, *, plus_to_space : Bool = true) : String
  URL-decodes string as x-www-form-urlencoded.
- .default_port(scheme : String) : Int32?
  Returns the default port for the given scheme if known, otherwise returns nil.
- .encode(string : String, io : IO, *, space_to_plus : Bool = false) : Nil
  URL-encodes string and writes the result to io.
- .encode(string : String, *, space_to_plus : Bool = false) : String
  URL-encodes string.
- .encode(string : String, io : IO, space_to_plus : Bool = false, &) : Nil
  URL-encodes string and writes the result to an IO.
- .encode_www_form(string : String, *, space_to_plus : Bool = true) : String
  URL-encodes string as x-www-form-urlencoded.
- .encode_www_form(string : String, io : IO, *, space_to_plus : Bool = true) : Nil
  URL-encodes string as x-www-form-urlencoded and writes the result to io.
- .escape(string : String, io : IO, space_to_plus = false)
  DEPRECATED Use .encode or .encode_www_form instead
- .escape(string : String, space_to_plus = false)
  DEPRECATED Use .encode or .encode_www_form instead
- .reserved?(byte) : Bool
  Returns whether the given byte is a reserved character as defined in RFC 3986.
- .set_default_port(scheme : String, port : Int32?) : Nil
  Registers the default port for the given scheme.
- .unescape(string : String, plus_to_space = false)
  DEPRECATED Use .decode or .decode_www_form instead
- .unescape(string : String, io : IO, plus_to_space = false)
  DEPRECATED Use .decode or .decode_www_form instead
- .unreserved?(byte) : Bool
  Returns whether the given byte is an unreserved character as defined in RFC 3986.
Instance Method Summary
- #==(other : self)
  Returns true if this reference is the same as other.
- #absolute? : Bool
  Returns true if URI has a scheme specified.
- #fragment : String?
  Returns the fragment component of the URI.
- #fragment=(fragment : String?)
  Sets the fragment component of the URI.
- #full_path : String
  Returns the full path of this URI.
- #hash(hasher)
- #host : String?
  Returns the host component of the URI.
- #host=(host : String?)
  Sets the host component of the URI.
- #hostname
  Returns the host part of the URI and unwraps brackets for IPv6 addresses.
- #normalize : URI
  Returns a normalized copy of this URI.
- #normalize! : URI
  Normalizes this URI instance.
- #opaque? : Bool
  Returns true if this URI is opaque.
- #password : String?
  Returns the password component of the URI.
- #password=(password : String?)
  Sets the password component of the URI.
- #path : String
  Returns the path component of the URI.
- #path=(path : String)
  Sets the path component of the URI.
- #port : Int32?
  Returns the port component of the URI.
- #port=(port : Int32?)
  Sets the port component of the URI.
- #query : String?
  Returns the query component of the URI.
- #query=(query : String?)
  Sets the query component of the URI.
- #query_params : HTTP::Params
  Returns a HTTP::Params of the URI#query.
- #relative? : Bool
  Returns true if URI does not have a scheme specified.
- #relativize(uri : URI | String) : URI
  Relativizes uri against this URI.
- #resolve(uri : URI | String) : URI
  Resolves uri against this URI.
- #scheme : String?
  Returns the scheme component of the URI.
- #scheme=(scheme : String?)
  Sets the scheme component of the URI.
- #to_s(io : IO) : Nil
  Appends the string representation of this URI to io.
- #user : String?
  Returns the user component of the URI.
- #user=(user : String?)
  Sets the user component of the URI.
- #userinfo
  Returns the user-information component containing the provided username and password.
Instance methods inherited from class Reference
==(other : self)==(other : JSON::Any)
==(other : YAML::Any)
==(other) ==, dup dup, hash(hasher) hash, inspect(io : IO) : Nil inspect, object_id : UInt64 object_id, pretty_print(pp) : Nil pretty_print, same?(other : Reference)
same?(other : Nil) same?, to_s(io : IO) : Nil to_s
Constructor methods inherited from class Reference
new
new
Instance methods inherited from class Object
! : Bool
!,
!=(other)
!=,
!~(other)
!~,
==(other)
==,
===(other : JSON::Any)===(other : YAML::Any)
===(other) ===, =~(other) =~, as(type : Class) as, as?(type : Class) as?, class class, dup dup, hash
hash(hasher) hash, inspect(io : IO) : Nil
inspect : String inspect, is_a?(type : Class) : Bool is_a?, itself itself, nil? : Bool nil?, not_nil! not_nil!, pretty_inspect(width = 79, newline = "\n", indent = 0) : String pretty_inspect, pretty_print(pp : PrettyPrint) : Nil pretty_print, responds_to?(name : Symbol) : Bool responds_to?, tap(&) tap, to_json(io : IO)
to_json to_json, to_pretty_json(indent : String = " ")
to_pretty_json(io : IO, indent : String = " ") to_pretty_json, to_s : String
to_s(io : IO) : Nil to_s, to_yaml(io : IO)
to_yaml to_yaml, try(&) try, unsafe_as(type : T.class) forall T unsafe_as
Class methods inherited from class Object
from_json(string_or_io, root : String)from_json(string_or_io) from_json, from_yaml(string_or_io : String | IO) from_yaml
Constructor Detail
Parses the given raw_url into a URI. The raw_url may be relative or absolute.
require "uri"
uri = URI.parse("http://crystal-lang.org") # => #<URI:0x1068a7e40 @scheme="http", @host="crystal-lang.org", ... >
uri.scheme # => "http"
uri.host # => "crystal-lang.org"
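Besides parsing, a URI can be built from its components with the .new constructor listed under Constructors. The following is a small sketch that passes the components as named arguments; the component values are made up for illustration.

```crystal
require "uri"

# Components not given here (port, user, password, fragment) stay nil.
uri = URI.new(scheme: "http", host: "foo.com", path: "/posts", query: "id=30")

uri.scheme # => "http"
uri.path   # => "/posts"
uri.to_s   # => "http://foo.com/posts?id=30"
```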
Class Method Detail
URL-decodes a string and writes the result to io.
See .decode(string : String, *, plus_to_space : Bool = false) : String for details.
URL-decodes string.
require "uri"
URI.decode("hello%20world!") # => "hello world!"
URI.decode("put:%20it+%D0%B9") # => "put: it+й"
URI.decode("http://example.com/Crystal%20is%20awesome%20=)") # => "http://example.com/Crystal is awesome =)"
By default, + is decoded literally. If plus_to_space is true, + is decoded as space character (0x20). Percent-encoded values such as %20 and %2B are always decoded as characters with the respective codepoint.
require "uri"
URI.decode("peter+%2B+paul") # => "peter+++paul"
URI.decode("peter+%2B+paul", plus_to_space: true) # => "peter + paul"
.encode is the reverse operation.
.decode_www_form decodes + as a space by default.
URL-decodes string and writes the result to io.
The block is called for each percent-encoded ASCII character and determines whether the value is to be decoded. When the return value is falsey, the character is decoded. Non-ASCII characters are always decoded.
By default, + is decoded literally. If plus_to_space is true, + is decoded as space character (0x20).
This method enables some customization, but typical use cases can be implemented by either .decode(string : String, *, plus_to_space : Bool = false) : String or .decode_www_form(string : String, *, plus_to_space : Bool = true) : String.
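A possible use of the block form is to keep selected characters encoded. The sketch below assumes, as described above, that a truthy block result leaves the percent-encoded byte untouched and a falsey result decodes it. The input string and the IO::Memory target are illustrative choices.

```crystal
require "uri"

io = IO::Memory.new

# Keep an encoded slash (%2F) as it is and decode every other
# percent-encoded byte. The block receives the decoded byte value.
URI.decode("a%20b%2Fc", io) { |byte| byte == '/'.ord }

result = io.to_s
```

Under the rule stated above, result should read "a b%2Fc": the space is decoded, the slash stays percent-encoded.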
URL-decodes string as x-www-form-urlencoded and writes the result to io.
See .decode_www_form(string : String, *, plus_to_space : Bool = true) : String for details.
URL-decodes string as x-www-form-urlencoded.
require "uri"
URI.decode_www_form("hello%20world!") # => "hello world!"
URI.decode_www_form("put:%20it+%D0%B9") # => "put: it й"
URI.decode_www_form("http://example.com/Crystal+is+awesome+=)") # => "http://example.com/Crystal is awesome =)"
By default, + is decoded as space character (0x20). If plus_to_space is false, + is decoded literally as +. Percent-encoded values such as %20 and %2B are always decoded as characters with the respective codepoint.
require "uri"
URI.decode_www_form("peter+%2B+paul") # => "peter + paul"
URI.decode_www_form("peter+%2B+paul", plus_to_space: false) # => "peter+++paul"
.encode_www_form is the reverse operation.
.decode decodes + literally by default.
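That .encode_www_form and .decode_www_form reverse each other can be checked with a quick round trip. The sketch reuses the "peter + paul" value from the examples; the variable names are illustrative.

```crystal
require "uri"

original = "peter + paul"

encoded = URI.encode_www_form(original) # => "peter+%2B+paul"
decoded = URI.decode_www_form(encoded)  # => "peter + paul"

decoded == original # => true
```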
Returns the default port for the given scheme if known, otherwise returns nil.
require "uri"
URI.default_port "http" # => 80
URI.default_port "ponzi" # => nil
URL-encodes string and writes the result to io.
See .encode(string : String, *, space_to_plus : Bool = false) : String for details.
URL-encodes string.
Reserved and unreserved characters are not escaped, so this only modifies some special characters as well as non-ASCII characters. .reserved? and .unreserved? provide more details on these character classes.
require "uri"
URI.encode("hello world!") # => "hello%20world!"
URI.encode("put: it+й") # => "put:%20it+%D0%B9"
URI.encode("http://example.com/Crystal is awesome =)") # => "http://example.com/Crystal%20is%20awesome%20=)"
By default, the space character (0x20) is encoded as %20 and + is encoded literally. If space_to_plus is true, the space character is encoded as + and + is encoded as %2B:
require "uri"
URI.encode("peter + paul") # => "peter%20+%20paul"
URI.encode("peter + paul", space_to_plus: true) # => "peter+%2B+paul"
.decode is the reverse operation.
.encode_www_form also escapes reserved characters.
URL-encodes string and writes the result to an IO.
The block is called for each ASCII character (codepoint less than 0x80) and determines whether the value is to be encoded. When the return value is falsey, the character is encoded. Non-ASCII characters are always encoded.
By default, the space character (0x20) is encoded as %20 and + is encoded literally. If space_to_plus is true, the space character is encoded as + and + is encoded as %2B.
This method enables some customization, but typical use cases can be implemented by either .encode(string : String, *, space_to_plus : Bool = false) : String or .encode_www_form(string : String, *, space_to_plus : Bool = true) : String.
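As an illustration of the block form, this sketch writes only unreserved characters literally and percent-encodes everything else. The choice of .unreserved? as the predicate and the input string are assumptions made for the example.

```crystal
require "uri"

io = IO::Memory.new

# A truthy block result keeps the ASCII character as it is;
# a falsey result percent-encodes it.
URI.encode("a b/c", io) { |byte| URI.unreserved?(byte) }

result = io.to_s # => "a%20b%2Fc"
```

With this predicate the space and the reserved slash are both escaped, which matches what .encode_www_form produces when space_to_plus is false.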
URL-encodes string as x-www-form-urlencoded.
Reserved characters are escaped, unreserved characters are not. .reserved? and .unreserved? provide more details on these character classes.
require "uri"
URI.encode_www_form("hello world!") # => "hello+world%21"
URI.encode_www_form("put: it+й") # => "put%3A+it%2B%D0%B9"
URI.encode_www_form("http://example.com/Crystal is awesome =)") # => "http%3A%2F%2Fexample.com%2FCrystal+is+awesome+%3D%29"
The encoded string returned from this method can be used as a name or value component for an application/x-www-form-urlencoded format serialization. HTTP::Params provides a higher-level API for this use case.
By default, the space character (0x20) is encoded as + and + is encoded as %2B. If space_to_plus is false, the space character is encoded as %20 and + is encoded literally.
require "uri"
URI.encode_www_form("peter + paul") # => "peter+%2B+paul"
URI.encode_www_form("peter + paul", space_to_plus: false) # => "peter%20%2B%20paul"
.decode_www_form is the reverse operation.
.encode does not escape reserved characters.
URL-encodes string as x-www-form-urlencoded and writes the result to io.
See .encode_www_form(string : String, *, space_to_plus : Bool = true) for details.
DEPRECATED Use .encode or .encode_www_form instead
DEPRECATED Use .encode or .encode_www_form instead
Returns whether the given byte is a reserved character as defined in RFC 3986.
Reserved characters are ':', '/', '?', '#', '[', ']', '@', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';' and '='.
Registers the default port for the given scheme.
If port is nil, the existing default port for the scheme, if any, will be unregistered.
require "uri"
URI.set_default_port "ponzi", 9999
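Registering and unregistering a default port can be observed through .default_port. The sketch below reuses the made-up "ponzi" scheme from the examples.

```crystal
require "uri"

# Register a default port for a custom scheme ...
URI.set_default_port("ponzi", 9999)
registered = URI.default_port("ponzi") # => 9999

# ... and unregister it again by passing nil.
URI.set_default_port("ponzi", nil)
unregistered = URI.default_port("ponzi") # => nil
```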
DEPRECATED Use .decode or .decode_www_form instead
DEPRECATED Use .decode or .decode_www_form instead
Returns whether the given byte is an unreserved character as defined in RFC 3986.
Unreserved characters are ASCII letters, ASCII digits, _, ., - and ~.
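Both predicates take a byte rather than a Char. A minimal sketch, assuming the byte is passed as a UInt8 obtained from a character's codepoint:

```crystal
require "uri"

slash = '/'.ord.to_u8
tilde = '~'.ord.to_u8
letter = 'a'.ord.to_u8

URI.reserved?(slash)    # => true
URI.unreserved?(slash)  # => false
URI.unreserved?(tilde)  # => true
URI.unreserved?(letter) # => true
URI.reserved?(letter)   # => false
```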
Instance Method Detail
Returns true if this reference is the same as other. Invokes same?.
Returns the fragment component of the URI.
require "uri"
URI.parse("http://foo.com/bar#section1").fragment # => "section1"
Returns the full path of this URI.
require "uri"
uri = URI.parse "http://foo.com/posts?id=30&limit=5#time=1305298413"
uri.full_path # => "/posts?id=30&limit=5"
Returns the host component of the URI.
require "uri"
URI.parse("http://foo.com").host # => "foo.com"
Returns the host part of the URI and unwraps brackets for IPv6 addresses.
require "uri"
URI.parse("http://[::1]/bar").hostname # => "::1"
URI.parse("http://[::1]/bar").host # => "[::1]"
Returns a normalized copy of this URI.
See #normalize! for details.
Normalizes this URI instance.
The following normalizations are applied to the individual components (if available):
- #scheme is lowercased.
- #host is lowercased.
- #port is removed if it is the .default_port of the scheme.
- #path is resolved to a minimal, semantically equivalent representation by removing dot segments (/. and /..).
require "uri"
uri = URI.parse("HTTP://example.COM:80/./foo/../bar/")
uri.normalize!
uri.to_s # => "http://example.com/bar/"
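The non-mutating #normalize can be sketched in the same way. It returns a new instance, so the receiver is a different object from the result; the variable names are illustrative.

```crystal
require "uri"

uri = URI.parse("HTTP://example.COM:80/./foo/../bar/")
normalized = uri.normalize

normalized.to_s       # => "http://example.com/bar/"
normalized.same?(uri) # => false, because #normalize returned a copy
```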
Returns true if this URI is opaque.
A URI is considered opaque if it has a #scheme but no hierarchical part, i.e. no #host and the first character of #path is not a slash (/).
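A short sketch of the definition above, using the mailto URI from the #scheme example as the opaque case:

```crystal
require "uri"

# Scheme present, no host, path does not start with a slash.
mail = URI.parse("mailto:alice@example.com")
mail.opaque? # => true

# Hierarchical URI with a host.
web = URI.parse("http://foo.com/bar")
web.opaque? # => false

# No scheme at all, so not opaque either.
path_only = URI.parse("/bar")
path_only.opaque? # => false
```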
Returns the password component of the URI.
require "uri"
URI.parse("http://admin:password@foo.com").password # => "password"
Returns the path component of the URI.
require "uri"
URI.parse("http://foo.com/bar").path # => "/bar"
Returns the port component of the URI.
require "uri"
URI.parse("http://foo.com:5432").port # => 5432
Returns the query component of the URI.
require "uri"
URI.parse("http://foo.com/bar?q=1").query # => "q=1"
Returns a HTTP::Params of the URI#query.
require "uri"
uri = URI.parse "http://foo.com?id=30&limit=5#time=1305298413"
uri.query_params # => HTTP::Params(@raw_params={"id" => ["30"], "limit" => ["5"]})
Relativizes uri against this URI.
An exact copy of uri is returned if
- this URI or uri are #opaque?, or
- the scheme and authority (#host, #port, #user, #password) components are not identical.
Otherwise a new relative hierarchical URI is constructed with #query and #fragment components from uri and with a path component that describes a minimum-difference relative path from #path to uri's path.
URI.parse("http://foo.com/bar/baz").relativize("http://foo.com/quux") # => "../quux"
URI.parse("http://foo.com/bar/baz").relativize("http://foo.com/bar/quux") # => "quux"
URI.parse("http://foo.com/bar/baz").relativize("http://quux.com") # => "http://quux.com"
URI.parse("http://foo.com/bar/baz").relativize("http://foo.com/bar/baz#quux") # => "#quux"
This method is the inverse operation to #resolve (see Resolution and Relativization).
Resolves uri against this URI.
If uri is #absolute?, or if this URI is #opaque?, then an exact copy of uri is returned.
Otherwise the URI is resolved according to the specifications in RFC 3986 section 5.2.
URI.parse("http://foo.com/bar/baz").resolve("../quux") # => "http://foo.com/quux"
URI.parse("http://foo.com/bar/baz").resolve("/quux") # => "http://foo.com/quux"
URI.parse("http://foo.com/bar/baz").resolve("http://quux.com") # => "http://quux.com"
URI.parse("http://foo.com/bar/baz").resolve("#quux") # => "http://foo.com/bar/baz#quux"
This method is the inverse operation to #relativize (see Resolution and Relativization).
Returns the scheme component of the URI.
require "uri"
URI.parse("http://foo.com").scheme # => "http"
URI.parse("mailto:alice@example.com").scheme # => "mailto"
Appends the string representation of this URI to io.
require "uri"
uri = URI.parse "http://foo.com/posts?id=30&limit=5#time=1305298413"
uri.to_s # => "http://foo.com/posts?id=30&limit=5#time=1305298413"
Returns the user component of the URI.
require "uri"
URI.parse("http://admin:password@foo.com").user # => "admin"
Returns the user-information component containing the provided username and password.
require "uri"
uri = URI.parse "http://admin:password@foo.com"
uri.userinfo # => "admin:password"
The return value is URL encoded (see .encode_www_form).