Parse XML files with Perl XML::Simple

Source: Internet
Author: User

Two commonly used modules for parsing XML in Perl are XML::DOM and XML::Simple. XML::DOM is heavyweight, and its parse result is a DOM tree, which is inconvenient to work with. For small, uncomplicated XML files, XML::DOM is overkill. That is where the lightweight XML::Simple comes in.

XML::Simple is as simple as its name suggests. Assume that sample.xml contains the following:

<opt>
    <user login="grep" fullname="Gary R Epstein" />
    <user login="stty" fullname="Simon T Tyson">
        <session pid="12345" />
    </user>
    <text>This is a test.</text>
</opt>
Then just write:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Parse the file into a nested Perl data structure (a hash reference).
my $xml = XMLin('sample.xml');
print Dumper($xml);
This parses the XML into a hash reference, which you can then walk with foreach. The dumped result is:

$VAR1 = {
          'text' => 'This is a test.',
          'user' => [
                    {
                      'fullname' => 'Gary R Epstein',
                      'login' => 'grep'
                    },
                    {
                      'session' => {
                                   'pid' => '12345'
                                 },
                      'fullname' => 'Simon T Tyson',
                      'login' => 'stty'
                    }
                  ]
        };
The following rules can be observed:

The tag name of an element is used as the hash key.
The content of a single element becomes the hash value directly; the contents of repeated elements are collected in an array reference, which becomes the hash value.
Attributes and child elements appear inside the element's content as hash key => value pairs.
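
As a minimal sketch of how these rules play out when walking the result with foreach, the following parses the same document from an inline string (XMLin accepts a string of XML as well as a file name). The variable names $doc and @lines are illustrative and not part of the original example.

```perl
use strict;
use warnings;
use XML::Simple;

# Same content as sample.xml, supplied inline so the sketch is self-contained.
my $doc = <<'XML';
<opt>
    <user login="grep" fullname="Gary R Epstein" />
    <user login="stty" fullname="Simon T Tyson">
        <session pid="12345" />
    </user>
    <text>This is a test.</text>
</opt>
XML

my $xml = XMLin($doc);

# <user> occurs twice, so its value is an array reference of hashes.
my @lines;
foreach my $user (@{ $xml->{user} }) {
    my $line = "$user->{login}: $user->{fullname}";
    # Only the second user has a <session> child element.
    $line .= " (pid $user->{session}{pid})" if exists $user->{session};
    push @lines, $line;
}
print "$_\n" for @lines;

# <text> occurs once, so its value is a plain string.
print "text: $xml->{text}\n";
```

Note that the loop only works here because user happens to be repeated; with a single user element the value would be a hash reference rather than an array reference.
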
One problem is that a single element and repeated elements are represented differently, which makes foreach processing more awkward: the code has to distinguish between a plain value and an array reference, as with the values of text and user above. The solution is to add the option ForceArray => 1, which forces even a single element into an array reference.

$xml = XMLin('sample.xml', ForceArray => 1);
print Dumper($xml);
Result (partial):

$VAR1 = {
          'text' => [
                    'This is a test.'
                  ],
          'user' => [
...
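
To illustrate why forcing arrays simplifies the loop, here is a small sketch in which one helper handles a document with a single user and a document with two. It uses the list form ForceArray => ['user'], which XML::Simple also supports and which forces only the named elements into arrays; that form, the helper name logins, and the two tiny documents are assumptions made for this illustration, not part of the original example.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical documents: one with a single <user>, one with two.
my $one = '<opt><user login="grep" /></opt>';
my $two = '<opt><user login="grep" /><user login="stty" /></opt>';

# Return the login of every <user>, however many there are.
sub logins {
    my ($doc) = @_;
    # Force only <user> into an array reference; other elements are unaffected.
    my $xml = XMLin($doc, ForceArray => ['user']);
    return map { $_->{login} } @{ $xml->{user} };
}

my @single = logins($one);
my @double = logins($two);
print "@single\n";
print "@double\n";
```

Without the ForceArray option, the first document would yield a hash reference for user, and the array dereference in the helper would fail.
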
Another problem is that if the elements have an attribute named id, name, or key, they are no longer placed in an array reference but are folded into a hash reference keyed by that attribute's value. For example, take the following XML and compare the result with the one above:

<opt>
    <user id="grep" fullname="Gary R Epstein" />
    <user id="stty" fullname="Simon T Tyson">
        <session pid="12345" />
    </user>
    <text>This is a test.</text>
</opt>
Parsed with ForceArray => 1, this gives:

$VAR1 = {
          'text' => [
                    'This is a test.'
                  ],
          'user' => {
                    'grep' => {
                              'fullname' => 'Gary R Epstein'
                            },
                    'stty' => {
                              'session' => [
                                           {
                                             'pid' => '12345'
                                           }
                                         ],
                              'fullname' => 'Simon T Tyson'
                            }
                  }
        };
The content of user is no longer an array reference but a hash reference, and the id values ('grep' and 'stty') have become its keys.

To disable this feature, specify the option KeyAttr => '' (the XML::Simple documentation describes an empty list, KeyAttr => [], for the same purpose). This option controls which attributes are used as hash keys during parsing. The default is ['name', 'key', 'id'].
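
The effect of KeyAttr can be sketched by parsing the same document twice, once with the default and once with folding disabled. This is only an illustration: the trimmed document and the variable names $folded and $plain are assumptions, and the empty-list form KeyAttr => [] is used in place of the empty string.

```perl
use strict;
use warnings;
use XML::Simple;

# A trimmed version of the document above, supplied inline.
my $doc = <<'XML';
<opt>
    <user id="grep" fullname="Gary R Epstein" />
    <user id="stty" fullname="Simon T Tyson" />
</opt>
XML

# Default KeyAttr: users are folded into a hash keyed by the id value.
my $folded = XMLin($doc, ForceArray => 1);
print ref($folded->{user}), "\n";              # HASH
print $folded->{user}{grep}{fullname}, "\n";   # Gary R Epstein

# Folding disabled: users stay in an array reference and id remains a field.
my $plain = XMLin($doc, ForceArray => 1, KeyAttr => []);
print ref($plain->{user}), "\n";               # ARRAY
print $plain->{user}[0]{id}, "\n";             # grep
```

Folding is convenient for direct lookup by id, while the unfolded form preserves document order, so the choice depends on how the data is consumed.
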

The XML::Simple documentation describes all options in detail and marks KeyAttr and ForceArray as important, which shows how commonly they are needed.
