REXML is a library written by Sean Russell. It is not Ruby's only XML library, but it is a popular one, and it is written in pure Ruby (NQXML is also written in Ruby, whereas XMLParser wraps the Jade library, which is written in C). In his REXML overview, Russell commented:
I have this problem: I don't like confusing APIs. There are several XML parser APIs for Java. Most of them follow DOM or SAX, and are very similar in philosophy to the many Java APIs that keep appearing. That is, they look like they were designed by theorists who have never used their own APIs. In general, the existing XML APIs are annoying. They take a markup language that was clearly designed to be very simple, elegant, and powerful, and then wrap it in annoying, bloated, and large APIs. Even for the most basic XML tree operations, I always have to refer to the API documentation; nothing is intuitive, and almost every operation is complicated.
Although I would not go so far as to call them annoying, I agree with Russell that XML APIs demand a lot of work from most people who use them.
Example
Look at the following books.xml:
<library shelf="Recent Acquisitions">
  <section name="Ruby">
    <book isbn="0672328844">
      <title>The Ruby Way</title>
      <author>Hal Fulton</author>
      <description>Second Edition.
        The book you are now reading.
        Ain't recursion grand?
      </description>
    </book>
  </section>
  <section name="Space">
    <book isbn="0684835509">
      <title>The Case for Mars</title>
      <author>Robert Zubrin</author>
      <description>Pushing toward a second home for the human race.
      </description>
    </book>
    <book isbn="074325631X">
      <title>First Man: The Life of Neil A. Armstrong</title>
      <author>James R. Hansen</author>
      <description>Definitive biography of the first man on the moon.
      </description>
    </book>
  </section>
</library>
1 Tree parsing (i.e., DOM-like)
We need to require the rexml/document library and include the REXML module:
require 'rexml/document'
include REXML

input = File.new("books.xml")
doc = Document.new(input)

root = doc.root
puts root.attributes["shelf"]   # Recent Acquisitions

doc.elements.each("library/section") { |e| puts e.attributes["name"] }
# Output:
#  Ruby
#  Space

doc.elements.each("*/section/book") { |e| puts e.attributes["isbn"] }
# Output:
#  0672328844
#  0684835509
#  074325631X

sec2 = root.elements[2]
author = sec2.elements[1].elements["author"].text   # Robert Zubrin
Note that an element's attributes are represented as a hash, so we can retrieve the value we need with attributes[]. Child elements can be accessed either with a path-like string or with an integer. Integer indexes are 1-based rather than 0-based.
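The hash-style attribute lookup and the 1-based element indexing can be tried in isolation with a small sketch like the one below. It parses an inline string instead of reading books.xml; the trimmed sample document and the variable names are assumptions made for illustration only.

```ruby
require 'rexml/document'

# Hypothetical inline sample, trimmed from books.xml so the sketch is self-contained.
xml = <<~XML
  <library shelf="Recent Acquisitions">
    <section name="Ruby">
      <book isbn="0672328844"><title>The Ruby Way</title></book>
    </section>
    <section name="Space">
      <book isbn="0684835509"><title>The Case for Mars</title></book>
      <book isbn="074325631X"><title>First Man: The Life of Neil A. Armstrong</title></book>
    </section>
  </library>
XML

doc  = REXML::Document.new(xml)
root = doc.root

# Attributes behave like a hash keyed by attribute name.
shelf = root.attributes["shelf"]

# Integer indexes count child elements starting at 1, so [1] is the first section.
first_section  = root.elements[1].attributes["name"]
second_section = root.elements[2].attributes["name"]

# A path-like string returns the first matching descendant of the element.
first_space_title = root.elements[2].elements["book/title"].text

puts shelf, first_section, second_section, first_space_title
```

Whitespace between tags does not affect the integer indexes, because only child elements are counted.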
2 Stream parsing (i.e., SAX-like parsing)
The trick here is to define a listener class whose methods are called back during parsing:
require 'rexml/document'
require 'rexml/streamlistener'
include REXML

class MyListener
  include REXML::StreamListener

  # Called when the parser enters a tag; args are the tag name and its attributes.
  def tag_start(*args)
    puts "tag_start: #{args.map { |x| x.inspect }.join(', ')}"
  end

  # Called when character data is read.
  def text(data)
    return if data =~ /\A\s*\z/   # whitespace only
    abbrev = data[0..40] + (data.length > 40 ? "..." : "")
    puts "text: #{abbrev.inspect}"
  end
end

list = MyListener.new
source = File.new "books.xml"
Document.parse_stream(source, list)
A word about the StreamListener module: it provides a set of empty callback methods, so you can override just the ones you need to implement your own behavior. When the parser enters a tag, it calls the tag_start method. The text method is similar, except that it is called back whenever character data is read. The output begins like this:
tag_start: "library", {"shelf" => "Recent Acquisitions"}
tag_start: "section", {"name" => "Ruby"}
tag_start: "book", {"isbn" => "0672328844"}
tag_start: "title", {}
text: "The Ruby Way"
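Since a listener only overrides the callbacks it cares about, the same mechanism can collect data instead of printing it. The following is a minimal sketch under stated assumptions: the TitleCollector class, its titles reader, and the inline sample document are hypothetical names introduced here, and the sketch additionally overrides tag_end, another StreamListener callback, to notice when a title element closes.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener (not from the original text): gathers the text of
# every <title> element seen during a stream parse.
class TitleCollector
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles   = []
    @in_title = false
  end

  # Start a new, empty title buffer whenever a <title> tag opens.
  def tag_start(name, _attrs)
    return unless name == "title"
    @in_title = true
    @titles << +""
  end

  # Append character data only while inside a <title> element.
  def text(data)
    @titles.last << data if @in_title
  end

  # Stop collecting when the <title> tag closes.
  def tag_end(name)
    @in_title = false if name == "title"
  end
end

# Hypothetical inline sample so the sketch does not depend on books.xml.
xml = <<~XML
  <library>
    <book><title>The Ruby Way</title></book>
    <book><title>The Case for Mars</title></book>
  </library>
XML

collector = TitleCollector.new
REXML::Document.parse_stream(xml, collector)
p collector.titles
```

Because no tree is built, the listener has to track its own state (here, the @in_title flag) to know which element a piece of text belongs to.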
3 XPath
REXML provides XPath support through the XPath class, in addition to the DOM-like and SAX-like interfaces above. Using the same XML file as before, we can do the following with XPath:
book1 = XPath.first(doc, "//book")   # Info for the first book found
p book1

# Print out all titles
XPath.each(doc, "//title") { |e| puts e.text }

# Get an array of all of the "author" elements in the document.
names = XPath.match(doc, "//author").map { |x| x.text }
p names
The output is similar to the following:
<book isbn='0672328844'> ... </>
The Ruby Way
The Case for Mars
First Man: The Life of Neil A. Armstrong
["Hal Fulton", "Robert Zubrin", "James R. Hansen"]
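XPath expressions can also filter on attribute values, which is often more convenient than walking the tree by index. The sketch below is only an illustration: it uses an inline sample trimmed from books.xml, and the variable names are assumptions rather than part of the original example.

```ruby
require 'rexml/document'

# Hypothetical inline sample, trimmed from books.xml so the sketch is self-contained.
xml = <<~XML
  <library shelf="Recent Acquisitions">
    <section name="Ruby">
      <book isbn="0672328844"><title>The Ruby Way</title></book>
    </section>
    <section name="Space">
      <book isbn="0684835509"><title>The Case for Mars</title></book>
      <book isbn="074325631X"><title>First Man: The Life of Neil A. Armstrong</title></book>
    </section>
  </library>
XML

doc = REXML::Document.new(xml)

# All titles in the section whose name attribute is "Space".
space_titles = REXML::XPath.match(doc, "//section[@name='Space']/book/title").map(&:text)

# The title of the book with a given ISBN.
mars_title = REXML::XPath.first(doc, "//book[@isbn='0684835509']/title").text

# XPath.first returns nil when nothing matches, so check before calling methods on it.
missing = REXML::XPath.first(doc, "//book[@isbn='no-such-isbn']")

p space_titles
puts mars_title
p missing
```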