- Translation: fayland
- Source: Chinese Perl Association FPC - perlchina.org
- Original Name: Perl XML Quickstart: the Perl XML Interfaces
- Author: Kip Hampton
- Original article: http://www.xml.com/pub/a/2001/04/18/perlxmlqstart1.html
- PerlChina reminds you: please respect the author's copyright and the fruits of the author's work.
Contents
- 1. Getting Started
- 2. The Tasks
- 3. Task 1: Extract Data
- 4. Task 2: Create an XML Document
- 5. Perl-Specific XML Interface Examples
- 5.1 XML::Simple
- 5.2 XML::SimpleObject
- 5.3 XML::TreeBuilder
- 5.4 XML::Twig
- 6. Resources
Getting Started
A question that has come up frequently on the Perl-XML mailing list lately is how to give unfamiliar users a quick overview of the large number of Perl XML modules. Over the next few months I will write several columns on this topic. The XML modules on CPAN fall into three broad categories: modules that provide a unique interface to XML data (usually concerned with converting between XML instances and Perl data), modules that implement one of the standard XML APIs, and special-purpose modules that simplify some specific XML-related task. This month we focus on the first category, the Perl-specific XML interfaces.
use Disclaimer qw(:standard);
This document is not intended to benchmark the modules' performance, nor to imply that one module is more useful than another. Choosing the right XML module for your project depends more on the project itself and on your accumulated experience. Different interfaces suit different tasks and different people. My only goal is to define two simple tasks and then show, with working examples for each interface, how to reach the same end result.
The Tasks
Although XML is used in a wide variety of ways, most XML-related tasks fall into two groups: extracting data from existing XML documents, and creating new XML documents from data held elsewhere. Accordingly, the examples used to introduce the different modules consist of "extracting a specific data set from an XML file" and "converting a Perl data structure into a specific XML format".
Task 1: Extract Data
First, assume we have the following XML fragment:
    <?xml version="1.0"?>
    <camelids>
      <species name="Camelus dromedarius">
        <common-name>Dromedary, or Arabian Camel</common-name>
        <physical-characteristics>
          <mass>300 to 690 kg.</mass>
          <appearance>
            The dromedary camel is characterized by a long-curved
            neck, deep-narrow chest, and a single hump.
            ...
          </appearance>
        </physical-characteristics>
        <natural-history>
          <food-habits>
            The dromedary camel is an herbivore.
            ...
          </food-habits>
          <reproduction>
            The dromedary camel has a lifespan of about 40-50 years
            ...
          </reproduction>
          <behavior>
            With the exception of rutting males, dromedaries show
            very little aggressive behavior.
            ...
          </behavior>
Now let's assume that the complete document (available in this month's sample code) contains all the information on every member of the camelid family, not just the dromedary shown above. To illustrate how each module extracts a subset of data from this file, we will write a very short script that processes the camelids.xml document and prints to STDOUT the common name, the Latin name (wrapped in parentheses), and the current conservation status of each species. Run against the complete document, each script should therefore produce the following output:
    Bactrian Camel (Camelus bactrianus) endangered
    Dromedary, or Arabian Camel (Camelus dromedarius) no special status
    Llama (Lama glama) no special status
    Guanaco (Lama guanicoe) special concern
    Vicuna (Vicugna vicugna) endangered
Task 2: Create an XML Document
To demonstrate how each module creates XML documents from other data sources, we will write a small script that converts a simple Perl hash into a simple XHTML document. The hash contains URLs pointing to especially cool camelid-related web pages.
The hash:
    my %camelid_links = (
        one => {
            url         => 'http://www.online.discovery.com/news/picture/may99/photo20.html',
            description => 'Bactrian Camel in front of Great ' .
                           'Pyramids in Giza, Egypt.',
        },
        two => {
            url         => 'http://www.fotos-online.de/english/m/09/9532.htm',
            description => 'Dromedary Camel illustrates the ' .
                           'importance of accessorizing.',
        },
        three => {
            url         => 'http://www.eskimo.com/~wallama/funny.htm',
            description => 'Charlie - biography of a narcissistic llama.',
        },
        four => {
            url         => 'http://arrow.colorado.edu/travels/other/turkey.html',
            description => 'A visual metaphor for the perl5-porters ' .
                           'list?',
        },
        five => {
            url         => 'http://www.galaonline.org/pics.htm',
            description => 'Many cool alpacas.',
        },
        six => {
            url         => 'http://www.thpf.de/suedamerikareise/galerie/vicunas.htm',
            description => 'Wild Vicunas in a scenic landscape.',
        },
    );
The document we want to create from the hash begins:
    <?xml version="1.0"?>
A well-indented XML result file is very helpful for human readers, but such careful whitespace handling is not required in our case. What we care about is that the resulting document is well-formed and that it correctly represents the data in the hash.
With the tasks defined, it is time for the code examples.
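Since well-formedness is the real requirement here, it can help to check a script's output mechanically. The following is a minimal sketch, assuming XML::Parser is installed; the helper name `is_well_formed` and the two sample strings are made up for illustration and are not part of the original article.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper (not from the article): returns 1 if $xml parses
# cleanly and 0 if the parser rejects it.
sub is_well_formed {
    my ($xml) = @_;
    my $parser = XML::Parser->new();
    # parse() dies on the first well-formedness error, so trap it with eval.
    return eval { $parser->parse($xml); 1 } ? 1 : 0;
}

# One entry in the shape the Task 2 scripts produce, and a broken variant
# in which the <a> element is never closed.
my $good = '<?xml version="1.0"?><html><body>'
         . '<a href="http://www.galaonline.org/pics.htm">Many cool alpacas.</a>'
         . '</body></html>';
my $bad  = '<html><body><a href="x">unclosed link</body></html>';

print "good: ", (is_well_formed($good) ? "well-formed" : "not well-formed"), "\n";
print "bad:  ", (is_well_formed($bad)  ? "well-formed" : "not well-formed"), "\n";
```

A check like this says nothing about indentation or about whether the data is right; it only confirms that an XML parser will accept the document.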
Perl-Specific XML Interface Examples
XML::Simple
XML::Simple was originally created to simplify reading and writing configuration files in XML format. It puts no additional abstract interface between the XML document and the Perl data structure it is converted into: all elements and attributes can be read directly through nested references.
Read
    use XML::Simple;

    my $file = 'files/camelids.xml';
    my $xs1  = XML::Simple->new();
    my $doc  = $xs1->XMLin($file);

    # The species elements are folded into a hash keyed on their name attribute.
    foreach my $key (keys (%{$doc->{species}})) {
        print $doc->{species}->{$key}->{'common-name'} . ' (' . $key . ') ';
        print $doc->{species}->{$key}->{conservation}->{status} . "\n";
    }
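To see why the nested lookups in the reading script work, it can help to dump the structure that XMLin builds. This is a sketch under stated assumptions: the inline XML is a cut-down stand-in for camelids.xml, with the conservation element's shape inferred from the reading scripts, and it relies on XML::Simple's default behaviour of folding repeated elements on their name attribute.

```perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Cut-down stand-in for camelids.xml: only the parts the script reads.
my $xml = <<'XML';
<camelids>
  <species name="Camelus bactrianus">
    <common-name>Bactrian Camel</common-name>
    <conservation status="endangered"/>
  </species>
  <species name="Lama glama">
    <common-name>Llama</common-name>
    <conservation status="no special status"/>
  </species>
</camelids>
XML

# XMLin accepts a string of XML as well as a file name.
my $doc = XML::Simple->new()->XMLin($xml);

# Dump the nested hash so the folding on "name" is visible.
$Data::Dumper::Sortkeys = 1;
print Dumper($doc);

# Same traversal as the article's script, with sorted keys for stable output.
for my $latin (sort keys %{ $doc->{species} }) {
    my $species = $doc->{species}{$latin};
    print "$species->{'common-name'} ($latin) $species->{conservation}{status}\n";
}
```

One caveat worth knowing: with default options the folding only happens when an element repeats, so a file containing a single species would produce a differently shaped structure unless an option such as ForceArray is used.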
Write
    use XML::Simple;

    # The leading ./ is needed on Perl 5.26 and later, where . is no longer in @INC.
    require "./files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $xsimple = XML::Simple->new();
    print $xsimple->XMLout(\%camelid_links,
                           noattr  => 1,
                           xmldecl => '<?xml version="1.0"?>');
The requirements of the data-to-document task expose a weakness of XML::Simple: it does not let us decide which hash keys should be written as elements and which as attributes. The output of the example above comes close to what we asked for, but it does not match it. For those who prefer to work with XML document content directly as Perl data structures but need finer control over the output, XML::Simple and XML::Writer work well together. The following example shows how to use XML::Writer to meet our output requirements.
    use XML::Writer;

    # The leading ./ is needed on Perl 5.26 and later, where . is no longer in @INC.
    require "./files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $writer = XML::Writer->new();
    $writer->xmlDecl();
    $writer->startTag('html');
    $writer->startTag('body');

    # One <a> element per hash entry: url as the href attribute, description as text.
    foreach my $item ( keys (%camelid_links) ) {
        $writer->startTag('a', 'href' => $camelid_links{$item}->{url});
        $writer->characters($camelid_links{$item}->{description});
        $writer->endTag('a');
    }

    $writer->endTag('body');
    $writer->endTag('html');
    $writer->end();
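To make the XML::Simple limitation concrete, the sketch below writes the same small hash twice, once with default options and once with NoAttr enabled. The root name `link` and the single-entry hash are illustrative choices, not taken from the article; the point is that the choice between attributes and elements applies to all keys at once.

```perl
use strict;
use warnings;
use XML::Simple;

# A single entry from the article's link data.
my %link = (
    url         => 'http://www.galaonline.org/pics.htm',
    description => 'Many cool alpacas.',
);

my $xs = XML::Simple->new();

# Default: every scalar value is written as an attribute of <link>.
my $as_attributes = $xs->XMLout(\%link, RootName => 'link');

# NoAttr => 1: every scalar value is written as a child element instead.
my $as_elements = $xs->XMLout(\%link, RootName => 'link', NoAttr => 1);

print $as_attributes, "\n", $as_elements;
```

Neither call can produce a link whose url is an attribute while its description is element text, which is the shape the task asks for and the reason the article reaches for XML::Writer.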
XML::SimpleObject
XML::SimpleObject provides an object-oriented interface to XML data, using accessors reminiscent of the Document Object Model.
Read
    use XML::Parser;
    use XML::SimpleObject;

    my $file   = 'files/camelids.xml';
    my $parser = XML::Parser->new(ErrorContext => 2, Style => "Tree");
    my $xso    = XML::SimpleObject->new( $parser->parsefile($file) );

    foreach my $species ($xso->child('camelids')->children('species')) {
        print $species->child('common-name')->{VALUE};
        print ' (' . $species->attribute('name') . ') ';
        print $species->child('conservation')->attribute('status');
        print "\n";
    }
Write
XML::SimpleObject does not support creating XML documents from scratch. However, as with the XML::Simple example above, the task is easily completed by combining it with XML::Writer.
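As a rough sketch of what that XML::Writer step could look like, the variant below captures the output in a string and turns on data mode so the result is indented. It assumes an XML::Writer release recent enough to accept a scalar reference for OUTPUT, and it uses only two entries from the article's hash to stay short.

```perl
use strict;
use warnings;
use XML::Writer;

# Two entries from the article's %camelid_links hash, enough for a demonstration.
my %camelid_links = (
    three => { url         => 'http://www.eskimo.com/~wallama/funny.htm',
               description => 'Charlie - biography of a narcissistic llama.' },
    five  => { url         => 'http://www.galaonline.org/pics.htm',
               description => 'Many cool alpacas.' },
);

my $xml = '';
my $writer = XML::Writer->new(
    OUTPUT      => \$xml,   # append the output to $xml instead of printing it
    DATA_MODE   => 1,       # put each element on its own line
    DATA_INDENT => 2,       # indent nested elements by two spaces
);

$writer->xmlDecl();
$writer->startTag('html');
$writer->startTag('body');

# dataElement writes the start tag, the text and the end tag in one call.
for my $item (sort keys %camelid_links) {
    $writer->dataElement('a', $camelid_links{$item}{description},
                         href => $camelid_links{$item}{url});
}

$writer->endTag('body');
$writer->endTag('html');
$writer->end();

print $xml;
```

Sorting the keys is a deliberate choice here: Perl hashes have no defined order, so without it the links may appear in a different order on each run.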
XML::TreeBuilder
The XML::TreeBuilder package consists of two modules: XML::Element, used to create and access the contents of XML element nodes, and XML::TreeBuilder, a factory package that simplifies building a document tree from an existing XML file. For those who already have experience with the HTML::Element and HTML::Tree modules, XML::TreeBuilder is very easy to use, because apart from the XML-specific methods the interface is the same.
Read
    use XML::TreeBuilder;

    my $file = 'files/camelids.xml';
    my $tree = XML::TreeBuilder->new();
    $tree->parse_file($file);

    foreach my $species ($tree->find_by_tag_name('species')) {
        print $species->find_by_tag_name('common-name')->as_text;
        print ' (' . $species->attr_get_i('name') . ') ';
        print $species->find_by_tag_name('conservation')->attr_get_i('status');
        print "\n";
    }
Write
    use XML::Element;

    # The leading ./ is needed on Perl 5.26 and later, where . is no longer in @INC.
    require "./files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $root   = XML::Element->new('html');
    my $body   = XML::Element->new('body');
    my $xml_pi = XML::Element->new('~pi', text => 'xml version="1.0"');
    $root->push_content($body);

    # One <a> element per hash entry, appended to <body>.
    foreach my $item ( keys (%camelid_links) ) {
        my $link = XML::Element->new('a', 'href' => $camelid_links{$item}->{url});
        $link->push_content($camelid_links{$item}->{description});
        $body->push_content($link);
    }

    print $xml_pi->as_XML;
    print $root->as_XML();
XML::Twig
XML::Twig differs from the other Perl-only XML interfaces in that it is a Perlish interface that offers creative features in addition to the standard XML APIs. For more details, see the XML.com article on using XML::Twig.
Read
    use XML::Twig;

    my $file = 'files/camelids.xml';
    my $twig = XML::Twig->new();
    $twig->parsefile($file);
    my $root = $twig->root;

    foreach my $species ($root->children('species')) {
        print $species->first_child_text('common-name');
        print ' (' . $species->att('name') . ') ';
        print $species->first_child('conservation')->att('status');
        print "\n";
    }
Write
    use XML::Twig;

    # The leading ./ is needed on Perl 5.26 and later, where . is no longer in @INC.
    require "./files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $root = XML::Twig::Elt->new('html');
    my $body = XML::Twig::Elt->new('body');
    $body->paste($root);

    # One <a> element per hash entry, pasted as the last child of <body>.
    foreach my $item ( keys (%camelid_links) ) {
        my $link = XML::Twig::Elt->new('a');
        $link->set_att('href', $camelid_links{$item}->{url});
        $link->set_text($camelid_links{$item}->{description});
        $link->paste('last_child', $body);
    }

    print qq|<?xml version="1.0"?>|;
    $root->print;
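The same document can be built a little more compactly, since the XML::Twig::Elt constructor also accepts an attribute hash and text content. The sketch below is one possible variation rather than the article's method; it uses a small made-up `%links` hash keyed by URL and collects the result with `sprint` instead of printing it directly.

```perl
use strict;
use warnings;
use XML::Twig;

# Illustrative data: two of the article's links, keyed by URL.
my %links = (
    'http://www.galaonline.org/pics.htm'       => 'Many cool alpacas.',
    'http://www.eskimo.com/~wallama/funny.htm' => 'Charlie - biography of a narcissistic llama.',
);

my $root = XML::Twig::Elt->new('html');
my $body = XML::Twig::Elt->new('body');
$body->paste(last_child => $root);

# new() takes the tag name, an optional attribute hash and optional content,
# so each link can be created in a single call.
for my $url (sort keys %links) {
    my $link = XML::Twig::Elt->new('a', { href => $url }, $links{$url});
    $link->paste(last_child => $body);
}

# sprint returns the serialized element as a string.
my $xml = $root->sprint;
print qq|<?xml version="1.0"?>\n|, $xml, "\n";
```

Holding the serialized document in `$xml` makes it easy to hand the result to another routine, for example a well-formedness check, before writing it anywhere.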
These examples illustrate the basic usage of these common Perl XML modules. My goal has been to provide enough examples to give you a feel for writing code with each module. Next month we will look at the modules that implement the standard XML APIs, in particular XML::DOM, XML::XPath, and the many SAX and SAX-like modules.
Resources
- Download the sample code
- Complete list of XML modules on CPAN
- Perl-XML mailing list archive
- Using XML::Twig
- Part two: Perl XML Quickstart: The Standard XML Interfaces