class StringScanner

Overview

StringScanner provides for lexical scanning operations on a String.

NOTE To use StringScanner, you must explicitly import it with require "string_scanner".

Example

require "string_scanner"

s = StringScanner.new("This is an example string")
s.eos? # => false

s.scan(/\w+/) # => "This"
s.scan(/\w+/) # => nil
s.scan(/\s+/) # => " "
s.scan(/\s+/) # => nil
s.scan(/\w+/) # => "is"
s.eos?        # => false

s.scan(/\s+/) # => " "
s.scan(/\w+/) # => "an"
s.scan(/\s+/) # => " "
s.scan(/\w+/) # => "example"
s.scan(/\s+/) # => " "
s.scan(/\w+/) # => "string"
s.eos?        # => true

s.scan(/\s+/) # => nil
s.scan(/\w+/) # => nil

Scanning a string means remembering the position of a scan offset, which is just an index. Scanning moves the offset forward, and matches are sought after the offset; usually immediately after it.
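
As a small illustration of how the scan offset moves, the sketch below inspects #offset between calls. The input string and the variable name scanner are made up for this example.

```crystal
require "string_scanner"

scanner = StringScanner.new("ab cd")
scanner.offset          # => 0
scanner.scan(/\w+/)     # => "ab"
scanner.offset          # => 2
scanner.scan(/\w+/)     # => nil (a space sits at the offset, so nothing matches there)
scanner.offset          # => 2 (a failed scan leaves the offset untouched)
scanner.scan_until(/d/) # => " cd" (the match is sought anywhere after the offset)
scanner.offset          # => 5
scanner.eos?            # => true
```

Methods such as #scan only match immediately at the offset, while the *_until variants search forward from it.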

Method Categories

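Methods that advance the scan offset: #scan, #scan_until, #skip, #skip_until

Methods that look ahead or behind: #peek, #peek_behind, #check, #check_until, #current_byte, #current_byte?, #current_char, #current_char?, #previous_byte, #previous_byte?, #previous_char, #previous_char?

Methods that deal with the position of the offset: #offset, #offset=, #byte_offset, #beginning_of_line?, #eos?, #rewind, #reset, #terminate

Methods that deal with the last match: #[], #[]?, #matched?

Miscellaneous methods: #inspect, #string, #rest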

Defined in:

string_scanner.cr

Constructor Detail

def self.new(str : String)

Instance Method Detail

def [](n) : String

Returns the n-th subgroup in the most recent match.

Raises an exception if there was no last match or if there is no subgroup.

require "string_scanner"

s = StringScanner.new("Fri Dec 12 1975 14:39")
regex = /(?<wday>\w+) (?<month>\w+) (?<day>\d+)/
s.scan(regex) # => "Fri Dec 12"
s[0]          # => "Fri Dec 12"
s[1]          # => "Fri"
s[2]          # => "Dec"
s[3]          # => "12"
s["wday"]     # => "Fri"
s["month"]    # => "Dec"
s["day"]      # => "12"

def []?(n) : String | Nil

Returns the nilable n-th subgroup in the most recent match.

Returns nil if there was no last match or if there is no subgroup.

require "string_scanner"

s = StringScanner.new("Fri Dec 12 1975 14:39")
regex = /(?<wday>\w+) (?<month>\w+) (?<day>\d+)/
s.scan(regex)  # => "Fri Dec 12"
s[0]?          # => "Fri Dec 12"
s[1]?          # => "Fri"
s[2]?          # => "Dec"
s[3]?          # => "12"
s[4]?          # => nil
s["wday"]?     # => "Fri"
s["month"]?    # => "Dec"
s["day"]?      # => "12"
s["year"]?     # => nil
s.scan(/more/) # => nil
s[0]?          # => nil

def beginning_of_line? : Bool

Returns true if the stream is at the beginning of a line and not at EOS.


def byte_offset : Int32

The byte offset of the scan head. This is distinct from #offset in that it counts raw bytes instead of characters.
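
The difference from #offset only shows up with multi-byte characters. A minimal sketch, assuming a UTF-8 string in which each character occupies three bytes (the input is made up for this example):

```crystal
require "string_scanner"

s = StringScanner.new("あいう")
s.scan(/../)  # => "あい"
s.offset      # => 2 (characters consumed)
s.byte_offset # => 6 (bytes consumed, three per character here)
```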


def check(pattern : String) : String | Nil

Returns the value that #scan would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.check(/\w+/) # => "is"
s.check(/\w+/) # => "is"

def check(pattern : Char) : String | Nil

Returns the value that #scan would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.check(/\w+/) # => "is"
s.check(/\w+/) # => "is"

def check(len : Int) : String | Nil

Returns the value that #scan would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.check(/\w+/) # => "is"
s.check(/\w+/) # => "is"

def check(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : String | Nil

Returns the value that #scan would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.check(/\w+/) # => "is"
s.check(/\w+/) # => "is"
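
To illustrate that the last match is saved even though the offset stays put, the following sketch reads a capture group after a #check. The capturing pattern is chosen for this example only.

```crystal
require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.check(/(\w)(\w)/) # => "is"
s[1]                # => "i" (groups of the checked match are available)
s[2]                # => "s"
s.offset            # => 5 (the scan offset has not moved)
```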

def check_until(pattern : String) : String | Nil

Returns the value that #scan_until would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("test string")
s.check_until(/tr/) # => "test str"
s.check_until(/g/)  # => "test string"

def check_until(pattern : Char) : String | Nil

Returns the value that #scan_until would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("test string")
s.check_until(/tr/) # => "test str"
s.check_until(/g/)  # => "test string"

def check_until(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : String | Nil

Returns the value that #scan_until would return, without advancing the scan offset. The last match is still saved, however.

require "string_scanner"

s = StringScanner.new("test string")
s.check_until(/tr/) # => "test str"
s.check_until(/g/)  # => "test string"

def current_byte : UInt8

Returns the current byte at the scan head, and raises an error if at the end. Does not move the scan head. Does no multi-byte character checking, and may return part of a multi-byte character. See #current_char.


def current_byte? : UInt8 | Nil

Returns the current byte at the scan head, or nil if at the end. Does no multi-byte character checking, and may return part of a multi-byte character. See #current_char?.


def current_char : Char

Returns the character at the scan head, and raises an error if at the end. Does not move the scan head. This will properly decode the next character from the string, and may return a multi-byte character.


def current_char? : Char | Nil

Returns the character at the scan head, or nil if at the end. Does not move the scan head. This will properly decode the next character from the string, and may return a multi-byte character.


def eos? : Bool

Returns true if the scan offset is at the end of the string.

require "string_scanner"

s = StringScanner.new("this is a string")
s.eos?                # => false
s.scan(/(\w+\s?){4}/) # => "this is a string"
s.eos?                # => true

def inspect(io : IO) : Nil

Writes a representation of the scanner.

Includes the current position of the offset, the total size of the string, and five characters near the current position.


def matched? : Bool

Returns true if the last #scan resulted in a match.


def offset : Int32

Returns the current position of the scan offset.


def offset=(position : Int)

Sets the position of the scan offset.

NOTE Moving the scan head to a non-zero index with this method can cause performance issues in multibyte strings. For a more performant way to move the head, see #skip(Int) or #rewind.
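
A brief sketch of setting the offset directly and observing its effect on #rest; the input string is made up for this example.

```crystal
require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 10
s.offset # => 10
s.rest   # => "string"
```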


def peek(len) : String

Extracts a string by looking ahead len characters, without advancing the scan offset. The return value has at most len characters, but may have fewer if the scan head is close to the end of the string.
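
For instance, the following sketch peeks from the middle of a string, first within bounds and then with a length that exceeds what is left. The input is made up for this example.

```crystal
require "string_scanner"

s = StringScanner.new("this is a string")
s.offset = 5
s.peek(4)   # => "is a"
s.offset    # => 5 (peeking does not advance the scan offset)
s.peek(100) # => "is a string" (fewer characters than requested remain)
```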


def peek_behind(len) : String

Extracts a string by looking behind len characters, without moving the scan offset. The return value has at most len characters, but may have fewer if the scan head is close to the beginning of the string.


def previous_byte : UInt8

Returns the byte before the scan head, and raises an error if at the beginning. Does not move the scan head. This performs no multi-byte checking and may return part of a multi-byte character. See #previous_char.


def previous_byte? : UInt8 | Nil

Returns the byte before the scan head, or nil if at the beginning. Does not move the scan head. This performs no multi-byte checking and may return part of a multi-byte character. See #previous_char?.


def previous_char : Char

Returns the character before the scan head, and raises an error if at the beginning. Does not move the scan head. This will properly decode the previous character from the string, and may return a multi-byte character.


def previous_char? : Char | Nil

Returns the character before the scan head, or nil if at the beginning. Does not move the scan head. This will properly decode the previous character from the string, and may return a multi-byte character.


def reset

Resets the scan offset to the beginning and clears the last match.
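
A short sketch of both effects, using #[]? to observe that the last match is gone after the reset; the input is made up for this example.

```crystal
require "string_scanner"

s = StringScanner.new("test string")
s.scan(/\w+/) # => "test"
s.offset      # => 4
s[0]?         # => "test"
s.reset
s.offset      # => 0
s[0]?         # => nil (the last match was cleared)
```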


def rest : String

Returns the remainder of the string after the scan offset.

require "string_scanner"

s = StringScanner.new("this is a string")
s.scan(/(\w+\s?){2}/) # => "this is "
s.rest                # => "a string"

def rewind(len : Int) : Nil

Rewinds the scan head by len characters.

Raises IndexError if this would go off the beginning of the stream.


def scan(pattern : String) : String | Nil

Tries to match with pattern at the current position. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the matched string. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan(/\w+/)  # => "test"
s.scan(/\w+/)  # => nil
s.scan(/\s\w/) # => " s"
s.scan('t')    # => "t"
s.scan("ring") # => "ring"
s.scan(/.*/)   # => ""

def scan(pattern : Char) : String | Nil

Tries to match with pattern at the current position. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the matched string. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan(/\w+/)  # => "test"
s.scan(/\w+/)  # => nil
s.scan(/\s\w/) # => " s"
s.scan('t')    # => "t"
s.scan("ring") # => "ring"
s.scan(/.*/)   # => ""

def scan(len : Int) : String | Nil

Advances the offset by len chars, and returns a string of that length.

NOTE If there are fewer than the requested number of characters remaining in the string, this method will return nil and not advance the scan head. To obtain the entire rest of the input string, use #rest.

require "string_scanner"

s = StringScanner.new("あいうえお")
s.scan(3)   # => "あいう"
s.scan(100) # => nil
s.scan(2)   # => "えお"
s.scan(0)   # => ""

def scan(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : String | Nil

Tries to match with pattern at the current position. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the matched string. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan(/\w+/)  # => "test"
s.scan(/\w+/)  # => nil
s.scan(/\s\w/) # => " s"
s.scan('t')    # => "t"
s.scan("ring") # => "ring"
s.scan(/.*/)   # => ""

def scan_until(pattern : String) : String | Nil

Scans the string until the pattern is matched. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the substring up to and including the end of the match. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan_until(/ s/) # => "test s"
s.scan_until(/ s/) # => nil
s.scan_until('r')  # => "tr"
s.scan_until("ng") # => "ing"

def scan_until(pattern : Char) : String | Nil

Scans the string until the pattern is matched. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the substring up to and including the end of the match. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan_until(/ s/) # => "test s"
s.scan_until(/ s/) # => nil
s.scan_until('r')  # => "tr"
s.scan_until("ng") # => "ing"

def scan_until(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : String | Nil

Scans the string until the pattern is matched. If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the substring up to and including the end of the match. Otherwise, the scanner returns nil.

require "string_scanner"

s = StringScanner.new("test string")
s.scan_until(/ s/) # => "test s"
s.scan_until(/ s/) # => nil
s.scan_until('r')  # => "tr"
s.scan_until("ng") # => "ing"

def skip(pattern : String) : Int32 | Nil

Attempts to skip over the given pattern beginning with the scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skipped match. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan, but without returning the matched string.
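
The sketch below mirrors the #scan example but reads the returned sizes instead of the matched strings. It mixes the Regex, Char, and String overloads, and the input is made up for this example.

```crystal
require "string_scanner"

s = StringScanner.new("test string")
s.skip(/\w+/)  # => 4
s.skip(/\w+/)  # => nil (no word character at the offset)
s.skip(/\s\w/) # => 2
s.skip('t')    # => 1
s.skip("ring") # => 4
s.eos?         # => true
```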


def skip(pattern : Char) : Int32 | Nil

Attempts to skip over the given pattern beginning with the scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skipped match. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan, but without returning the matched string.


def skip(len : Int) : Int32 | Nil

Advances the offset by len chars.

Prefer this to scanner.offset += len, since that can cause a full scan of the string in the case of multibyte characters.

NOTE If there are fewer than the requested number of characters remaining in the string, this method will return nil and not advance the scan head. To move the scan head to the very end, use #terminate.


def skip(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : Int32 | Nil

Attempts to skip over the given pattern beginning with the scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skipped match. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan, but without returning the matched string.


def skip_until(pattern : String) : Int32 | Nil

Attempts to skip until the given pattern is found after the scan offset. In other words, the pattern is not anchored to the current scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skip. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan_until, but without returning the matched string.
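
A minimal sketch of the unanchored behavior, using an ASCII input made up for this example. The returned size covers everything from the old offset through the end of the match.

```crystal
require "string_scanner"

s = StringScanner.new("this is a string")
s.skip_until(/str/) # => 13
s.rest              # => "ing"
s.skip_until(/xyz/) # => nil
s.rest              # => "ing" (a failed skip leaves the offset where it was)
```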


def skip_until(pattern : Char) : Int32 | Nil

Attempts to skip until the given pattern is found after the scan offset. In other words, the pattern is not anchored to the current scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skip. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan_until, but without returning the matched string.


def skip_until(pattern : Regex, *, options : Regex::MatchOptions = Regex::MatchOptions::None) : Int32 | Nil

Attempts to skip until the given pattern is found after the scan offset. In other words, the pattern is not anchored to the current scan offset.

If there's a match, the scanner advances the scan offset, the last match is saved, and it returns the size of the skip. Otherwise it returns nil and does not advance the offset.

This method is the same as #scan_until, but without returning the matched string.


def string : String

Returns the string being scanned.


def terminate

Moves the scan offset to the end of the string and clears the last match.
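
A short sketch of the state after terminating, with an input made up for this example:

```crystal
require "string_scanner"

s = StringScanner.new("test string")
s.scan(/\w+/) # => "test"
s.terminate
s.eos?        # => true
s.rest        # => ""
s[0]?         # => nil (the last match was cleared)
```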

