module HTML

Overview

Provides HTML escaping and unescaping methods.

For HTML parsing see module XML, especially XML.parse_html.

Defined in:

html/entities.cr
html.cr

Constant Summary

CHARACTER_REPLACEMENTS = {'€', '\u{81}', '‚', 'ƒ', '„', '…', '†', '‡', 'ˆ', '‰', 'Š', '‹', 'Œ', '\u{8d}', 'Ž', '\u{8f}', '\u{90}', '‘', '’', '“', '”', '•', '–', '—', '˜', '™', 'š', '›', 'œ', '\u{9d}', 'ž', 'Ÿ'}

These replacements provide compatibility with old numeric entities that assumed Windows-1252 encoding: numeric character references in the range 0x80 to 0x9F are mapped to the characters listed above. See http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference
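
As a rough illustration of the mapping described above, the following sketch decodes two numeric references from the 0x80 to 0x9F range. It assumes that `HTML.unescape` applies `CHARACTER_REPLACEMENTS` by offset from 0x80, which is what the constant's ordering suggests.

```crystal
require "html"

# 0x80 (decimal 128) is the first entry of CHARACTER_REPLACEMENTS.
HTML.unescape("&#128;") # => "€"

# 0x99 is the 26th entry (offset 25), the trademark sign.
HTML.unescape("&#x99;") # => "™"
```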

SUBSTITUTIONS = {'&' => "&amp;", '<' => "&lt;", '>' => "&gt;", '"' => "&quot;", '\'' => "&#39;"}

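Class Method Summary

.escape(string : String, io : IO) : Nil
Same as .escape(string) but outputs the result to the given io.

.escape(string : String) : String
Escapes special characters in HTML, namely &, <, >, " and '.

.unescape(string : String) : String
Returns a string where named and numeric character references in string are replaced with the corresponding Unicode characters.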

Class Method Detail

def self.escape(string : String, io : IO) : Nil

Same as .escape(string) but outputs the result to the given io.

require "html"

io = IO::Memory.new
HTML.escape("Crystal & You", io) # => nil
io.to_s                          # => "Crystal &amp; You"
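
The io variant is convenient when the escaped text is only one part of a larger output. A minimal sketch, assuming a hypothetical `comment` string that holds untrusted input and using `String.build` (whose builder is an IO) to assemble the fragment:

```crystal
require "html"

# `comment` is a hypothetical piece of untrusted input.
comment = "1 < 2 && 3 > 2"

html = String.build do |io|
  io << "<p>"
  HTML.escape(comment, io) # writes the escaped text straight into the builder
  io << "</p>"
end

html # => "<p>1 &lt; 2 &amp;&amp; 3 &gt; 2</p>"
```

Writing into the builder avoids allocating an intermediate escaped string.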

def self.escape(string : String) : String

Escapes special characters in HTML, namely &, <, >, " and '.

require "html"

HTML.escape("Crystal & You") # => "Crystal &amp; You"
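
For a fuller picture, here is a small sketch that exercises all five characters listed in SUBSTITUTIONS at once. The `user_input` value is a made-up example, not taken from the library documentation.

```crystal
require "html"

# Hypothetical input containing &, <, >, " and '.
user_input = %(<a href="/?q=1&r=2">Tom's</a>)

HTML.escape(user_input)
# => "&lt;a href=&quot;/?q=1&amp;r=2&quot;&gt;Tom&#39;s&lt;/a&gt;"
```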

def self.unescape(string : String) : String

Returns a string where named and numeric character references (e.g. &gt;, &#62;, &#x3e;) in string are replaced with the corresponding Unicode characters. This method decodes all HTML5 entities, including those without a trailing semicolon (such as &copy).

HTML.unescape("Crystal &amp; You") # => "Crystal & You"
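
The different reference forms can be compared side by side. The sketch below uses illustrative inputs: a named, a decimal and a hexadecimal reference for the same character, a named entity without its trailing semicolon, and a round trip through `HTML.escape`.

```crystal
require "html"

# Named, decimal and hexadecimal references for ">".
HTML.unescape("&gt; &#62; &#x3e;") # => "> > >"

# A named entity without the trailing semicolon is still decoded.
HTML.unescape("&copy 2024") # => "© 2024"

# Unescaping reverses the substitutions made by HTML.escape.
HTML.unescape(HTML.escape("<b>")) # => "<b>"
```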
