Ruby XML 外掛程式的效能大比拼: Nokogiri vs LibXML vs Hpricot vs REXML

來源:互聯網
上載者:User

Disclaimer: Every time we've run a piece about benchmarking or performance numbers on Ruby Inside, a retraction or significant correction has come out shortly thereafter. Benchmarking is hard, ugly, and quite often wrong or biased. It is not useless, however, but if you depend on the results in any way, you should certainly try to do your own benchmarking to confirm.

Last week, libxml-ruby 1 was released - a significant achievement since it had been under development for seven years. I suspectedthat it might just pip Nokogiri to the "fastest way to parse XML in Ruby" post and invited people to benchmark them. Turns out.. it ain't so. Nokogiri is the fastest.

Aaron Peterson, the developer behind Nokogiri, decided to run some tests and he's published the results in a dossier called xml_truth. The benchmarking environment was Ruby 1.8.6 on OS X 10.5 with libxml2 2.7.3 installed. Hpricot 0.6.170 competes against Nokogiri 1.2.2, LibXML-Ruby 1.1.2, and the standard library's REXML.

Aaron put together a whole suite of benchmarks, but if you just want an overview, here's a chart showing the results for in memory parsing of a 14 megabyte XML document. Note that the parsing time is in seconds and the Y-axis is logarithmic. Yes, Hpricot took over a minute, and REXML took over two minutes, while Nokogiri and libxml-ruby came in at a few seconds each:

Want all the actual numbers? Want to see the actual tests? Want to run them for yourself? Head over to Aaron's project here. Have fun! I'll putting on my flame-retardant jacket in preparation for all of the fallout about how inaccurate these tests are in 3.. 2.. 1..

Update: Hpricot 0.7 has been released and Patrick Tulskie has run some extra benchmarks.These results show that Hpricot beats libxml-ruby and Nokogiri under certain circumstances (quite significantly under an XPath test).

Update 2 (March 22, 2009): libxml-ruby's developer Charles Savage has found why libxml-ruby lagged behind Nokogiri and has resolved it.. :)

Support from: Brightbox; - Europe's leading provider of Ruby on Rails hosting. Now with Phusion Passenger support, each Brightbox server comes with access to a managed MySQL cluster and redundant SAN storage. Brightbox also provides managed services for large scale applications and dedicated clusters.

相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.