Tutorial on using the Nokogiri gem to work with XML data in Ruby
Install
On Ubuntu, install the libxml2 and libxslt development packages first:
$ apt-get install libxml2-dev libxslt1-dev
Then install the gem:
$ gem install nokogiri
Options
Nokogiri provides a number of options that control how a document is parsed. Common options include:
- NOBLANKS: remove blank nodes
- NOENT: substitute entities
- NOERROR: suppress error reports
- STRICT: strict parsing. An error is raised when the document is malformed.
- NONET: forbid any network access during parsing.
Example of setting options (passed in a block):
doc = Nokogiri::XML(File.open("blossom.xml")) do |config|
  config.strict.nonet
end
Or
doc = Nokogiri::XML(File.open("blossom.xml")) do |config|
  config.options = Nokogiri::XML::ParseOptions::STRICT | Nokogiri::XML::ParseOptions::NONET
end
Parsing
A document can be parsed from a file, a string, a URL, and so on. In each case you call one of two methods: Nokogiri::HTML or Nokogiri::XML.
Read string:
html_doc = Nokogiri::HTML("
Read files:
f = File.open("blossom.xml")
doc = Nokogiri::XML(f)
f.close
Read URL:
require 'open-uri'
# On Ruby 3.0 and later, Kernel#open no longer opens URLs; use URI.open instead
doc = Nokogiri::HTML(URI.open("http://www.threescompany.com/"))
Search for nodes
You can search with XPath expressions or CSS selectors. For example, given this XML:
<books>
  <book>
    <title>Stars</title>
  </book>
  <book>
    <title>Moon</title>
  </book>
</books>
XPath:
@doc.xpath("//title")
CSS:
@doc.css("book title")
Modify node content
title = @doc.css("book title").first
title.content = 'new title'
puts @doc.to_html
# => ... <title>new title</title> ...
Modify node structure
first_title = @doc.at_css('title')
second_book = @doc.css('book').last
# You can move the first title into the second book
first_title.parent = second_book
# Or make it the next sibling of the second book
second_book.add_next_sibling(first_title)
# You can also change the node's name and its class attribute
first_title.name = 'h2'
first_title['class'] = 'red_color'
puts @doc.to_html