Installation
On Ubuntu, you first need to install two libraries, libxml2 and libxslt:
$ apt-get install libxml2 libxslt
Then you can install the Nokogiri gem itself, typically with gem install nokogiri.
Available options
Nokogiri provides a number of options that control parsing. The most commonly used are:
- NOBLANKS: Remove blank nodes
- NOENT: Substitute entities
- NOERROR: Suppress error reports
- STRICT: Strict parsing; raises an error when the document is malformed
- NONET: Disable any network connections during parsing
Options can be set by passing a block to the parse call:
doc = Nokogiri::XML(File.open("Blossom.xml")) do |config|
  config.strict.nonet
end
Or
doc = Nokogiri::XML(File.open("Blossom.xml")) do |config|
  config.options = Nokogiri::XML::ParseOptions::STRICT | Nokogiri::XML::ParseOptions::NONET
end
Parsing
Documents can be parsed from files, strings, URLs, and so on, using one of two methods, Nokogiri::HTML and Nokogiri::XML:
Read string:
html_doc = Nokogiri::HTML("...")
Read file:
f = File.open("Blossom.xml")
doc = Nokogiri::XML(f)
f.close
Read URL:
require 'open-uri'
# URI.open is needed on Ruby 3.0 and later, where a bare open no longer accepts URLs
doc = Nokogiri::HTML(URI.open("http://www.threescompany.com/"))
Finding nodes
You can search with XPath expressions or CSS selectors. For example, given this XML:
<books>
<book>
<title>Stars</title>
</book>
<book>
<title>Moon</title>
</book>
</books>
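XPath: pass an expression to the xpath method, for example @doc.xpath("//title").
CSS: pass a selector to the css method, for example @doc.css("book title").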
Modifying node contents
title = @doc.css("book title").first
title.content = 'new title'
puts @doc.to_html
# =>
...
<title>new title</title>
...
Modifying the structure of a node
first_title = @doc.at_css('title')
second_book = @doc.css('book').last
# Move the first title into the second book
first_title.parent = second_book
# A node can also be placed elsewhere, for example as the next sibling of another node
second_book.add_next_sibling(first_title)
# You can also change the node's name and its attributes, such as class
first_title.name = 'h2'
first_title['class'] = 'red_color'
puts @doc.to_html # =>