Creating and parsing XML files in Ruby programs

Source: Internet
Author: User

Use Builder to create XML

Builder installation method:

gem install builder
require 'builder'

x = Builder::XmlMarkup.new(:target => $stdout, :indent => 1)
# ":target => $stdout" writes the output to standard output (the console)
# ":indent => 1" indents the XML output by one space per nesting level
x.instruct! :xml, :version => '1.1', :encoding => 'gb2312'
x.comment! "Book information"
x.library("shelf" => "Recent Acquisitions") {
  x.section("name" => "ruby") {
    x.book("isbn" => "0672310001") {
      x.title "Programming Ruby"
      x.author "Yukihiro"
      x.description "Programming Ruby-The Pragmatic Programmer's Guide"
    }
  }
}

# The XML has already been written to $stdout at this point. The builder
# treats unknown method names as tags, so this call is what adds the
# trailing <inspect/> line to the output.
p x

The resulting XML output:

<?xml version="1.1" encoding="gb2312"?>
<!-- Book information -->
<library shelf="Recent Acquisitions">
 <section name="ruby">
  <book isbn="0672310001">
   <title>Programming Ruby</title>
   <author>Yukihiro</author>
   <description>Programming Ruby-The Pragmatic Programmer's Guide</description>
  </book>
 </section>
</library>
<inspect/>
#<IO:0x2a06ae8>

Use REXML to parse XML

REXML is an XML processor written entirely in Ruby. It offers several APIs, the two main ones being DOM-like and SAX-like. The first reads the entire file into memory and stores it in a hierarchical form (that is, a tree). The second parses as it goes, which is suitable when your files are large and memory is limited.

Consider the following books.xml:

<library shelf="Recent Acquisitions">
  <section name="Ruby">
    <book isbn="0672328844">
      <title>The Ruby Way</title>
      <author>Hal Fulton</author>
      <description>
        Second edition. The book you are now reading.
        Ain't recursion grand?
      </description>
    </book>
  </section>
  <section name="Space">
    <book isbn="0684835509">
      <title>The Case for Mars</title>
      <author>Robert Zubrin</author>
      <description>Pushing toward a second home for the human
        race.
      </description>
    </book>
    <book isbn="074325631X">
      <title>First Man: The Life of Neil A. Armstrong</title>
      <author>James R. Hansen</author>
      <description>Definitive biography of the first man on
        the moon.
      </description>
    </book>
  </section>
</library>


1 Tree Parsing (that is, DOM-like)

We need to require the rexml/document library and include the REXML module:

require 'rexml/document'
include REXML

input = File.new("books.xml")
doc = Document.new(input)

root = doc.root
puts root.attributes["shelf"]   # Recent Acquisitions

doc.elements.each("library/section") { |e| puts e.attributes["name"] }
# Output:
#  Ruby
#  Space

doc.elements.each("*/section/book") { |e| puts e.attributes["isbn"] }
# Output (one line per book in the books.xml shown above):
#  0672328844
#  0684835509
#  074325631X

sec2 = root.elements[2]
author = sec2.elements[1].elements["author"].text    # Robert Zubrin


Note that an element's attributes and their values are exposed like a hash, so we can fetch the value we need through attributes. Child elements can be reached with a path-like string or with an integer. If an integer is used, the index is 1-based rather than 0-based.
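
The access styles just described can be tried without a file on disk. The following is a minimal sketch: the helper names sample_xml and lookup_examples are hypothetical, and the inline XML is a shortened stand-in for books.xml, not the full file.

```ruby
require 'rexml/document'

# Shortened, hypothetical stand-in for books.xml, kept inline so the
# example runs without any file on disk.
def sample_xml
  <<~XML
    <library shelf="Recent Acquisitions">
      <section name="Ruby">
        <book isbn="0672328844"><title>The Ruby Way</title></book>
      </section>
      <section name="Space">
        <book isbn="0684835509"><title>The Case for Mars</title></book>
      </section>
    </library>
  XML
end

# Collects a few values using the three access styles from the text.
def lookup_examples(xml)
  root = REXML::Document.new(xml).root
  first_section = root.elements[1]   # integer index is 1-based: first <section>
  {
    shelf:   root.attributes["shelf"],                  # hash-style attribute lookup
    section: first_section.attributes["name"],
    title:   first_section.elements["book/title"].text  # path-like string
  }
end

result = lookup_examples(sample_xml)
puts result[:shelf]
puts result[:section]
puts result[:title]
```

Here root.elements[1] returns the "Ruby" section, which is the point of the 1-based rule: an index of 0 would not select it.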

2 Stream Parsing (that is, SAX-like Parsing)

Here we use a small trick: we define a listener class whose methods are called back during parsing:

require 'rexml/document'
require 'rexml/streamlistener'
include REXML

class MyListener
  include REXML::StreamListener

  # Called when the parser enters a tag; receives the tag name and its attributes
  def tag_start(*args)
    puts "tag_start: #{args.map {|x| x.inspect}.join(', ')}"
  end

  # Called when character data is read
  def text(data)
    return if data =~ /\A\s*\z/   # whitespace only (\s, not \w)
    abbrev = data[0..40] + (data.length > 40 ? "..." : "")
    puts " text  :  #{abbrev.inspect}"
  end
end

list = MyListener.new
source = File.new "books.xml"
Document.parse_stream(source, list)


The StreamListener module provides several empty callback methods, so you override only the ones you need to implement your own behavior. When the parser enters a tag, the tag_start method is called. The text method is similar, but it is called back when character data is read. The output is as follows:

tag_start: "library", {"shelf"=>"Recent Acquisitions"}
tag_start: "section", {"name"=>"Ruby"}
tag_start: "book", {"isbn"=>"0672328844"}
tag_start: "title", {}
 text  :  "The Ruby Way"
.........................................
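
A listener can also keep state between callbacks instead of printing. The sketch below is illustrative only: the TitleCollector class and the collect_titles helper are hypothetical names, and the XML is a small inline string rather than books.xml. It combines tag_start, text, and the tag_end callback (also provided by StreamListener) to gather the text of every title element.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener (not from the original article) that collects
# the text found inside <title> elements.
class TitleCollector
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles = []
    @in_title = false
  end

  # Called when the parser enters a tag.
  def tag_start(name, attrs)
    return unless name == "title"
    @in_title = true
    @titles << "".dup   # start a new, mutable buffer for this title
  end

  # Called when the parser leaves a tag.
  def tag_end(name)
    @in_title = false if name == "title"
  end

  # Called for character data; appended only while inside <title>.
  def text(data)
    @titles.last << data if @in_title
  end
end

# Streams the given XML string through the listener and returns the titles.
def collect_titles(xml)
  listener = TitleCollector.new
  REXML::Document.parse_stream(xml, listener)
  listener.titles
end

xml = '<section name="Space"><book><title>The Case for Mars</title>' \
      '<author>Robert Zubrin</author></book></section>'
puts collect_titles(xml)
```

Because nothing is stored except the collected strings, memory use stays small even when the input is large, which is the reason for choosing stream parsing in the first place.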


3 XPath

REXML supports XPath through the XPath class, which queries a parsed document tree (the DOM-like model). Using the same XML file and the doc object from above, we can do the following with XPath:

book1 = XPath.first(doc, "//book")  # Info for first book found
p book1

# Print out all titles
XPath.each(doc, "//title") { |e| puts e.text }

# Get an array of all of the "author" elements in the document.
names = XPath.match(doc, "//author").map {|x| x.text }
p names


The output is similar to the following:

<book isbn='0672328844'> ... </>
The Ruby Way
The Case for Mars
First Man: The Life of Neil A. Armstrong
["Hal Fulton", "Robert Zubrin", "James R. Hansen"]
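
XPath expressions can also filter on attributes. As a rough sketch, the hypothetical helper titles_in_section below uses an attribute predicate to return only the titles in one section. The inline XML is a shortened stand-in for books.xml, and the helper assumes the section name contains no quote characters, since it is interpolated directly into the path.

```ruby
require 'rexml/document'

# Hypothetical helper: returns the titles of all books in the section
# with the given name. Assumes section_name contains no quote characters.
def titles_in_section(doc, section_name)
  path = "//section[@name='#{section_name}']/book/title"
  REXML::XPath.match(doc, path).map { |e| e.text }
end

# Shortened stand-in for books.xml, kept inline so the example is runnable.
xml = <<~XML
  <library shelf="Recent Acquisitions">
    <section name="Ruby">
      <book isbn="0672328844"><title>The Ruby Way</title></book>
    </section>
    <section name="Space">
      <book isbn="0684835509"><title>The Case for Mars</title></book>
      <book isbn="074325631X"><title>First Man: The Life of Neil A. Armstrong</title></book>
    </section>
  </library>
XML

doc = REXML::Document.new(xml)
puts titles_in_section(doc, "Space")
```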
