Ruby can parse XML data with the REXML library.
REXML is an XML processor written entirely in Ruby. It offers several APIs, the two main ones being DOM-like and SAX-like. The DOM-like API reads the entire file into memory and stores it as a hierarchy (that is, a tree). The SAX-like API parses as it goes, which is suitable when the files are large and memory is limited.
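The "parse as you go" style can be sketched as follows. This is a minimal illustration, not the only way to do it: the TitleListener class and the inline XML string are made up for the example, while REXML::StreamListener and REXML::Document.parse_stream come from REXML itself. The parser calls the listener as each tag is reached instead of building a tree.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener for this sketch: it remembers the title
# attribute of every <movie> element the parser passes over.
class TitleListener
  include REXML::StreamListener   # supplies empty defaults for all callbacks

  attr_reader :titles

  def initialize
    @titles = []
  end

  # Called once for each opening tag, in document order.
  def tag_start(name, attrs)
    # attrs holds the attribute name/value pairs; to_h allows lookup by name
    @titles << attrs.to_h["title"] if name == "movie"
  end
end

# Illustrative input; a real program would usually pass an open File.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format></movie>
    <movie title="Ishtar"><format>VHS</format></movie>
  </collection>
XML

listener = TitleListener.new
REXML::Document.parse_stream(xml, listener)
puts listener.titles
```

Only the collected titles are kept in memory here, which is the point of the stream style for large inputs.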
REXML has the following features:
- Written 100% in Ruby
- Supports both SAX-style and DOM-style parsing
- Lightweight, fewer than 2000 lines of code
- Provides a complete API
- Ships with Ruby (as a bundled gem since Ruby 3.0)
Let's take a look at how to use it. Suppose we have the following XML file, saved as movies.xml:
<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <format>DVD</format>
    <year>1989</year>
    <rating>R</rating>
    <stars>8</stars>
    <description>A schientific fiction</description>
  </movie>
  <movie title="Trigun">
    <type>Anime, Action</type>
    <format>DVD</format>
    <episodes>4</episodes>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Vash the Stampede!</description>
  </movie>
  <movie title="Ishtar">
    <type>Comedy</type>
    <format>VHS</format>
    <rating>PG</rating>
    <stars>2</stars>
    <description>Viewable boredom</description>
  </movie>
</collection>
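Parsing with the DOM-style API:
require 'rexml/document'
include REXML

# Load the whole file into an in-memory tree
xmlfile = File.new("movies.xml")
xmldoc = Document.new(xmlfile)

# The root element is <collection>; print its shelf attribute
root = xmldoc.root
puts "Root element : " + root.attributes["shelf"]

# Print the title attribute of every movie
xmldoc.elements.each("collection/movie") { |e| puts "Movie Title : " + e.attributes["title"] }

# Print the text of every <type> element
xmldoc.elements.each("collection/movie/type") { |e| puts "Movie Type : " + e.text }

# Print the text of every <description> element
xmldoc.elements.each("collection/movie/description") { |e| puts "Movie Description : " + e.text }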
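The code above walks the tree once per field. As a complementary sketch, the version below visits each movie once and reads its children together. It assumes a cut-down inline copy of the sample data instead of movies.xml, and the hash layout is invented for the illustration. Note that not every movie has a <year> element, so that lookup has to tolerate nil.

```ruby
require 'rexml/document'

# Cut-down copy of the sample data, inlined so the sketch is self-contained.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind">
      <type>War, Thriller</type>
      <year>2003</year>
    </movie>
    <movie title="Trigun">
      <type>Anime, Action</type>
      <episodes>4</episodes>
    </movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Build one hash per <movie> element.
movies = doc.elements.to_a("collection/movie").map do |movie|
  year = movie.elements["year"]   # nil when the child element is absent
  {
    title: movie.attributes["title"],
    type:  movie.elements["type"].text,
    year:  year ? year.text.to_i : nil
  }
end

movies.each do |m|
  puts "#{m[:title]} (#{m[:year] || 'year unknown'}): #{m[:type]}"
end
```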
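Using XPath:
require 'rexml/document'
include REXML

xmlfile = File.new("movies.xml")
xmldoc = Document.new(xmlfile)

# First node that matches the expression
movie = XPath.first(xmldoc, "//movie")
p movie

# Print the text of every <type> element
XPath.each(xmldoc, "//type") { |e| puts e.text }

# Collect the text of every <format> element into an array
names = XPath.match(xmldoc, "//format").map { |x| x.text }
p names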
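XPath expressions can also filter on attributes. The following is a small sketch under the assumption of an inline, shortened copy of the sample data. It selects a single movie by its title attribute and then runs a second, relative query from that element.

```ruby
require 'rexml/document'

# Shortened copy of the sample data, inlined so the sketch is self-contained.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format><stars>10</stars></movie>
    <movie title="Ishtar"><format>VHS</format><stars>2</stars></movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Attribute predicate: the movie whose title attribute equals "Ishtar"
ishtar = REXML::XPath.first(doc, "//movie[@title='Ishtar']")

# Relative paths are evaluated from the element passed as context
format = REXML::XPath.first(ishtar, "format").text
stars  = REXML::XPath.first(ishtar, "stars").text

puts "#{ishtar.attributes['title']}: #{format}, #{stars} stars"
```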
PS: REXML security issues
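The official Ruby website published a security notice on August 23, 2008 about a denial-of-service vulnerability in REXML: a crafted XML document with recursively nested entities can make the parser consume excessive resources.
XML parsing in Rails applications relies on REXML, so it has the same defect and needs to be fixed. The solution in Rails is as follows: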
1. Rails 2.0.2 and earlier versions
Download the fix, copy it to the RAILS_ROOT/lib directory, and add the following statement to environment.rb:
require 'rexml-expansion-fix'
2. Rails 2.1.0 and later versions
Download the fix and copy it to the RAILS_ROOT/config/initializers directory.