A class that parses HTML into a class structure for easy manipulation or processing.
The HTMLParser class parses HTML in a single pass, using a simple algorithm, into a simple class structure. It handles erroneous HTML documents relatively well. It does not know about or check HTML tag validity; it handles all tags in a generic way.
A String containing HTML data can be parsed using its parse() method. The resulting class structure can then be accessed via its root member variable.
[code]var p : HTMLParser = HTMLParser.new()
p.parse(data)
var article_tag : HTMLParserTag = p.root.get_first("article")
save_data(article_tag.to_string())
var n_link_tag : HTMLParserTag = p.root.get_first("a", "rel", "next")
var next_link : String = n_link_tag.get_attribute_value("href")[/code]
Converts the parsed HTML document back into well-formatted HTML.
The resulting root [HTMLParserTag].
Parses the given data as HTML.
Equivalent to [code]print(convert_to_string())[/code].