<?xml version="1.0"?>
<camelids>
  <species name="Camelus dromedarius">
    <common-name>Dromedary, or Arabian Camel</common-name>
    <physical-characteristics>
      <mass>300 to 690 kg.</mass>
      <appearance>
        The dromedary camel is characterized by a long-curved neck, deep-narrow chest, and a single hump. ...
      </appearance>
    </physical-characteristics>
    <natural-history>
      <food-habits>
        The dromedary camel is an herbivore. ...
      </food-habits>
      <reproduction>
        The dromedary camel has a lifespan of about 40-50 years ...
      </reproduction>
      <behavior>
        With the exception of rutting males, dromedaries show very little aggressive behavior. ...
      </behavior>
Now let's assume that the complete document (available with this month's sample code) contains the same information for every member of the camel family, not just the dromedary shown above. To illustrate how each module extracts a subset of data from this file, we will write a very short script that processes the camelids.xml document and prints to STDOUT the common name, the Latin name (in parentheses), and the current conservation status of each species. Run against the complete document, each script should therefore produce the following output:
Bactrian Camel (Camelus bactrianus) endangered
Dromedary, or Arabian Camel (Camelus dromedarius) no special status
Llama (Lama glama) no special status
Guanaco (Lama guanicoe) special concern
Vicuna (Vicugna vicugna) endangered
Task 2: Create an XML document
To demonstrate how each module creates XML documents from other data sources, we will write a small script that converts a simple Perl hash into a simple XHTML document. The hash contains URLs that point to some particularly cool camelid-related web pages.
The hash:
my %camelid_links = (
    one   => { url         => 'http://www.online.discovery.com/news/picture/may99/photo20.html',
               description => 'Bactrian Camel in front of Great ' .
                              'Pyramids in Giza, Egypt.'},
    two   => { url         => 'http://www.fotos-online.de/english/m/09/9532.htm',
               description => 'Dromedary Camel illustrates the ' .
                              'importance of accessorizing.'},
    three => { url         => 'http://www.eskimo.com/~wallama/funny.htm',
               description => 'Charlie - biography of a narcissistic llama.'},
    four  => { url         => 'http://arrow.colorado.edu/travels/other/turkey.html',
               description => 'A visual metaphor for the perl5-porters ' .
                              'list?'},
    five  => { url         => 'http://www.galaonline.org/pics.htm',
               description => 'Many cool alpacas.'},
    six   => { url         => 'http://www.thpf.de/suedamerikareise/galerie/vicunas.htm',
               description => 'Wild Vicunas in a scenic landscape.'}
);
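The write examples later in this article load the hash with `require "files/camelid_links.pl"` and a call to `get_camelid_data()`, but that file is not reproduced here. The following is a minimal sketch of what such a file might contain, assuming it simply returns the hash as a flat key/value list. The layout is a guess, and only two of the six entries are shown.

```perl
# files/camelid_links.pl (hypothetical contents; the real file is not shown)
use strict;
use warnings;

# Return the link data as a flat key/value list so the caller can
# assign it straight into a hash:
#   my %camelid_links = get_camelid_data();
sub get_camelid_data {
    return (
        three => {
            url         => 'http://www.eskimo.com/~wallama/funny.htm',
            description => 'Charlie - biography of a narcissistic llama.',
        },
        five => {
            url         => 'http://www.galaonline.org/pics.htm',
            description => 'Many cool alpacas.',
        },
    );
}

1;    # a file loaded with require must end with a true value
```

Keeping the data behind a function means every write script can share one copy of the hash instead of repeating it.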
The document we want to create from the hash is:
<?xml version="1.0"?>
Nicely indented XML output (as shown above) is good for readability, but this kind of whitespace handling is not required here. What we care about is that the resulting document is well-formed and that it correctly represents the data in the hash.
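Since well-formedness is the acceptance criterion, it can help to check a script's output mechanically. The sketch below is one possible way to do that with XML::Parser, which dies on the first well-formedness error. The helper name `is_well_formed` and both sample strings are made up for illustration.

```perl
use strict;
use warnings;
use XML::Parser;

# Return 1 if $xml parses as well-formed XML, 0 otherwise.
# XML::Parser dies on the first well-formedness error, so we trap it with eval.
sub is_well_formed {
    my ($xml) = @_;
    return eval { XML::Parser->new->parse($xml); 1 } ? 1 : 0;
}

# Hypothetical sample strings, not taken from the article's files.
my $good = '<html><body><a href="http://example.com/">A link</a></body></html>';
my $bad  = '<html><body><a href="http://example.com/">A link</body></html>';

print "good: ", (is_well_formed($good) ? "well-formed" : "not well-formed"), "\n";
print "bad:  ", (is_well_formed($bad)  ? "well-formed" : "not well-formed"), "\n";
```

This only tests well-formedness. Whether the document correctly reflects the hash still has to be checked by comparing the links.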
With the tasks defined, it is time for the code examples.
Perl-Specific XML Interface Examples
XML::Simple
XML::Simple was originally created to simplify reading and writing configuration files in XML format. It puts no abstract interface between the XML document and the Perl data structure it is converted to: all elements and attributes can be read directly through nested references.
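To make "nested references" concrete, here is a small sketch that parses an inline string rather than a file. The XML is a cut-down, hypothetical stand-in for camelids.xml, and the `ForceArray` and `KeyAttr` options are spelled out so the shape of the result does not depend on the module's defaults.

```perl
use strict;
use warnings;
use XML::Simple;

# A cut-down, hypothetical stand-in for camelids.xml.
my $xml = <<'XML';
<camelids>
  <species name="Lama glama">
    <common-name>Llama</common-name>
    <conservation status="no special status"/>
  </species>
  <species name="Vicugna vicugna">
    <common-name>Vicuna</common-name>
    <conservation status="endangered"/>
  </species>
</camelids>
XML

# Every <species> is folded into a hash keyed by its name attribute.
my $doc = XMLin($xml, ForceArray => ['species'], KeyAttr => { species => 'name' });

# Child elements and attributes are both plain hash keys.
my $vicuna = $doc->{species}{'Vicugna vicugna'};
print $vicuna->{'common-name'}, "\n";           # text of <common-name>
print $vicuna->{conservation}{status}, "\n";    # value of the status attribute
```

Note that the root element itself does not appear in the structure: `$doc` corresponds to the contents of `<camelids>`.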
Read
use XML::Simple;

my $file = 'files/camelids.xml';
my $xs1 = XML::Simple->new();
my $doc = $xs1->XMLin($file);

# Species are keyed by their name attribute (the Latin name).
foreach my $key (keys (%{$doc->{species}})){
    print $doc->{species}->{$key}->{'common-name'} . ' (' . $key . ') ';
    print $doc->{species}->{$key}->{conservation}->{status} . "\n";
}
Write
use XML::Simple;
require "files/camelid_links.pl";

my %camelid_links = get_camelid_data();
my $xsimple = XML::Simple->new();

print $xsimple->XMLout(\%camelid_links,
                       noattr  => 1,
                       xmldecl => '<?xml version="1.0"?>');
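XMLout's choice between attributes and elements applies to the whole document at once, which a small experiment can make visible. The sketch below uses a single hypothetical entry shaped like %camelid_links. The `RootName` value is arbitrary, and `KeyAttr => []` is set so that key folding does not interfere with the comparison.

```perl
use strict;
use warnings;
use XML::Simple;

# One hypothetical entry in the same shape as %camelid_links.
my %links = (
    five => {
        url         => 'http://www.galaonline.org/pics.htm',
        description => 'Many cool alpacas.',
    },
);

# KeyAttr => [] switches off key folding, so only the
# attribute/element choice differs between the two calls.
my %common = (RootName => 'links', KeyAttr => []);

# Default behaviour: scalar hash values become attributes.
my $as_attrs = XMLout(\%links, %common);

# NoAttr => 1: the same values become child elements instead.
my $as_elems = XMLout(\%links, %common, NoAttr => 1);

print "Default:\n$as_attrs\n";
print "NoAttr => 1:\n$as_elems\n";
```

Neither form yields an `a` element with an `href` attribute and the description as text, because that would need `url` treated as an attribute and `description` as content within the same element.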
The requirements of the data-to-document task expose one weakness of XML::Simple: it gives us no way to decide which hash keys are returned as elements and which as attributes. The output of the example above is close to what we want, but not close enough. For cases where you prefer to manipulate XML content directly as Perl data structures but need finer control over the output, XML::Simple and XML::Writer work well together. The following example shows how to use XML::Writer to meet our output requirements.
use XML::Writer;
require "files/camelid_links.pl";

my %camelid_links = get_camelid_data();
my $writer = XML::Writer->new();

$writer->xmlDecl();
$writer->startTag('html');
$writer->startTag('body');

foreach my $item ( keys (%camelid_links) ) {
    $writer->startTag('a', 'href' => $camelid_links{$item}->{url});
    $writer->characters($camelid_links{$item}->{description});
    $writer->endTag('a');
}

$writer->endTag('body');
$writer->endTag('html');
$writer->end();
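A variation on the XML::Writer script can collect the document in a string, indent it, and emit the links in a repeatable order (plain `keys` returns them in no particular order). This is a sketch under stated assumptions: the two entries and their example.com URLs are invented, and the second URL contains an ampersand only to show that the writer escapes it.

```perl
use strict;
use warnings;
use XML::Writer;

# Hypothetical data in the same shape as %camelid_links.
my %links = (
    one => { url => 'http://example.com/a',         description => 'First <link>' },
    two => { url => 'http://example.com/b?x=1&y=2', description => 'Second link'  },
);

my $output = '';
my $writer = XML::Writer->new(
    OUTPUT      => \$output,    # collect the document in a string, not STDOUT
    DATA_MODE   => 1,           # put each element on its own line
    DATA_INDENT => 2,           # indent nested elements by two spaces
);

$writer->xmlDecl();
$writer->startTag('html');
$writer->startTag('body');
foreach my $item (sort keys %links) {    # sorted, so the link order is repeatable
    # dataElement writes the start tag, the escaped text, and the end tag in one call.
    $writer->dataElement('a', $links{$item}{description}, href => $links{$item}{url});
}
$writer->endTag('body');
$writer->endTag('html');
$writer->end();

print $output;
```

Writing to a string also makes it easy to run the result through a parser before printing it.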
XML::SimpleObject
XML::SimpleObject provides an object-oriented interface to XML data, using accessors modeled on those of the Document Object Model.
Read
use XML::Parser;
use XML::SimpleObject;

my $file = 'files/camelids.xml';
my $parser = XML::Parser->new(ErrorContext => 2, Style => "Tree");
my $xso = XML::SimpleObject->new( $parser->parsefile($file) );

foreach my $species ($xso->child('camelids')->children('species')) {
    print $species->child('common-name')->{VALUE};
    print ' (' . $species->attribute('name') . ') ';
    print $species->child('conservation')->attribute('status');
    print "\n";
}
Write
XML::SimpleObject does not support creating XML documents from scratch. However, as with the XML::Simple example above, the task is easily accomplished with XML::Writer.
XML::TreeBuilder
The XML::TreeBuilder package consists of two modules: XML::Element, for creating and accessing the contents of XML element nodes, and XML::TreeBuilder, a factory package that simplifies building a document tree from an existing XML file. For those who already have experience with the HTML::Element and HTML::Tree modules, XML::TreeBuilder is very easy to use, because all of the methods are the same apart from the XML-specific ones.
Read
use XML::TreeBuilder;

my $file = 'files/camelids.xml';
my $tree = XML::TreeBuilder->new();
$tree->parse_file($file);

foreach my $species ($tree->find_by_tag_name('species')){
    print $species->find_by_tag_name('common-name')->as_text;
    print ' (' . $species->attr_get_i('name') . ') ';
    print $species->find_by_tag_name('conservation')->attr_get_i('status');
    print "\n";
}
Write
use XML::Element;
require "files/camelid_links.pl";

my %camelid_links = get_camelid_data();

my $root = XML::Element->new('html');
my $body = XML::Element->new('body');
my $xml_pi = XML::Element->new('~pi', text => 'xml version="1.0"');
$root->push_content($body);

foreach my $item ( keys (%camelid_links) ) {
    my $link = XML::Element->new('a', 'href' => $camelid_links{$item}->{url});
    $link->push_content($camelid_links{$item}->{description});
    $body->push_content($link);
}

print $xml_pi->as_XML;
print $root->as_XML();
XML::Twig
XML::Twig differs from the other Perl-only XML interfaces in that it is a Perlish interface that offers some creative features in addition to the standard XML APIs. For more details, see the XML.com article.
Read
use XML::Twig;

my $file = 'files/camelids.xml';
my $twig = XML::Twig->new();
$twig->parsefile($file);

my $root = $twig->root;

foreach my $species ($root->children('species')){
    print $species->first_child_text('common-name');
    print ' (' . $species->att('name') . ') ';
    print $species->first_child('conservation')->att('status');
    print "\n";
}
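One XML::Twig feature that goes beyond loading a whole tree is its handler mechanism: a callback registered under `twig_handlers` runs each time a matching element has been completely parsed. The sketch below applies it to the read task. The inline XML is a cut-down, hypothetical stand-in for camelids.xml, and collecting lines in `@lines` is just one way to organize the output.

```perl
use strict;
use warnings;
use XML::Twig;

# A cut-down, hypothetical stand-in for camelids.xml.
my $xml = <<'XML';
<camelids>
  <species name="Lama guanicoe">
    <common-name>Guanaco</common-name>
    <conservation status="special concern"/>
  </species>
  <species name="Vicugna vicugna">
    <common-name>Vicuna</common-name>
    <conservation status="endangered"/>
  </species>
</camelids>
XML

my @lines;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <species> element has been parsed.
        species => sub {
            my ($t, $species) = @_;
            push @lines, sprintf('%s (%s) %s',
                $species->first_child_text('common-name'),
                $species->att('name'),
                $species->first_child('conservation')->att('status'));
            $t->purge;    # release the elements parsed so far
        },
    },
);
$twig->parse($xml);

print "$_\n" for @lines;
```

Because each species is released after it is handled, memory use stays roughly flat even for a much larger document than this one.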
Write
use XML::Twig;
require "files/camelid_links.pl";

my %camelid_links = get_camelid_data();

my $root = XML::Twig::Elt->new('html');
my $body = XML::Twig::Elt->new('body');
$body->paste($root);

foreach my $item ( keys (%camelid_links) ) {
    my $link = XML::Twig::Elt->new('a');
    $link->set_att('href', $camelid_links{$item}->{url});
    $link->set_text($camelid_links{$item}->{description});
    $link->paste('last_child', $body);
}

print qq|<?xml version="1.0"?>|;
$root->print;
These examples illustrate the basic usage of these common Perl XML modules. My goal has been to provide enough examples to give you a feel for how to write code with each module. Next month, we will focus on the modules that implement standard XML APIs, in particular XML::DOM, XML::XPath, and the large number of SAX and SAX-like modules.
Resources