Performance Comparison of XML data reading methods (II)

Last Update:2017-05-14 Source: Internet

Author: User


In the previous installment we summarized the general ways of reading XML. In day-to-day work, however, we rarely need all the data in an XML source, so this time I tried reading only part of it: for example, selecting items by the first letter of the title and by position.

For the three random-access methods, only the query condition needs to change; the rest of the code stays as it was.

XmlDocument:
var nodeList = doc.DocumentElement.SelectNodes("item[substring(title,1,1)='M'][position() mod 10 = 0]");

XPathNavigator:
var nodeList = nav.Select("/channel/item[substring(title,1,1)='M'][position() mod 10 = 0]");

Xml Linq:
var nodelist = from node in xd.XPathSelectElements("/channel/item[substring(title,1,1)='M'][position() mod 10 = 0]")

With XPath, only a single line of code has to change. XPath is easier to master than SQL: read the W3Schools syntax introduction and MSDN's "LINQ to XML for XPath Users", and you should grasp the essentials within a quarter of an hour.
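To make the semantics of the two predicates concrete, here is a minimal sketch that runs the same XPath expression through LINQ to XML. The sample feed, the class name XPathFilterDemo and its helper methods are assumptions made up for illustration; only the query string comes from the article. The point to notice is that the second predicate counts positions within the items that already passed the first one.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathFilterDemo
{
    // Hypothetical sample feed: titles alternate between "M<i>" and "X<i>".
    public static XDocument BuildSample(int count)
    {
        var channel = new XElement("channel");
        for (int i = 0; i < count; i++)
        {
            string prefix = i % 2 == 0 ? "M" : "X";
            channel.Add(new XElement("item", new XElement("title", prefix + i)));
        }
        return new XDocument(channel);
    }

    // Same query as in the article: titles starting with 'M',
    // then every tenth item of that filtered set.
    public static string[] SelectTitles(XDocument doc)
    {
        const string query =
            "/channel/item[substring(title,1,1)='M'][position() mod 10 = 0]";
        return doc.XPathSelectElements(query)
                  .Select(item => (string)item.Element("title"))
                  .ToArray();
    }

    public static void Main()
    {
        foreach (string title in SelectTitles(BuildSample(60)))
            Console.WriteLine(title);
    }
}
```

With 60 sample items, 30 titles start with M, and the query keeps the 10th, 20th and 30th of those, so the program prints M18, M38 and M58.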

The XmlReader method, however, is not so easy. It too has to read the titles starting with M and take one item out of every ten. After half a day I still had not come up with an elegant implementation, so:

Code

static List<Channel> testXmlReader2()
{
    var lstChannel = new List<Channel>();
    var reader = XmlReader.Create(xmlStream);
    int n = 0;            // number of items seen so far whose title starts with 'M'
    Channel channel = null;
Search:
    while (reader.Read())
    {
        if (reader.Name == "item" && reader.NodeType == XmlNodeType.Element)
        {
            channel = null;
            while (reader.Read())
            {
                if (reader.Name == "item")
                {
                    // End of the current item: add it once, and only if it passed the filter.
                    if (channel != null) lstChannel.Add(channel);
                    break;
                }
                if (reader.NodeType != XmlNodeType.Element) continue;
                // Note: this assumes title is the first child element of item.
                switch (reader.Name)
                {
                    case "title":
                        var title = reader.ReadString();
                        // Skip the rest of this item unless the title starts with 'M' ...
                        if (title.Length == 0 || title[0] != 'M') goto Search;
                        n++;
                        // ... and it is every tenth such item.
                        if (n % 10 != 0) goto Search;
                        channel = new Channel();
                        channel.Title = title;
                        break;
                    case "link":
                        channel.Link = reader.ReadString();
                        break;
                    case "description":
                        channel.Description = reader.ReadString();
                        break;
                    case "content":
                        channel.Content = reader.ReadString();
                        break;
                    case "pubDate":
                        channel.PubDate = reader.ReadString();
                        break;
                    case "author":
                        channel.Author = reader.ReadString();
                        break;
                    case "category":
                        channel.Category = reader.ReadString();
                        break;
                    default:
                        break;
                }
            }
        }
    }
    return lstChannel;
}

As you can see, the code structure has changed significantly. Just to add conditional filtering, we had to introduce the local variable n, adjust where the object is initialized and where its values are set, and were even forced to jump with the goto statement, forgotten for many years (VB fares better here). The business logic has seeped into the implementation details of the code; in Lao Zhao's words, there is a burst of syntactic noise.
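One possible way to reduce this noise, which is not the approach used in the article, is to keep XmlReader for streaming but materialize each item as a small XElement with XNode.ReadFrom and filter it with ordinary code. The sketch below assumes a cut-down, hypothetical Channel class with only Title and Link, and a made-up sample feed; the names StreamingFilterDemo, ReadFiltered and BuildSample are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical, reduced version of the article's Channel class.
public class Channel
{
    public string Title { get; set; }
    public string Link { get; set; }
}

public static class StreamingFilterDemo
{
    public static List<Channel> ReadFiltered(TextReader xml)
    {
        var result = new List<Channel>();
        int matched = 0; // items seen so far whose title starts with 'M'
        using (var reader = XmlReader.Create(xml))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    // ReadFrom consumes the whole <item> element and leaves the
                    // reader on the node that follows it, so no extra Read() here.
                    var item = (XElement)XNode.ReadFrom(reader);
                    string title = (string)item.Element("title") ?? "";
                    if (!title.StartsWith("M", StringComparison.Ordinal)) continue;
                    if (++matched % 10 != 0) continue;
                    result.Add(new Channel
                    {
                        Title = title,
                        Link = (string)item.Element("link")
                    });
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return result;
    }

    // Hypothetical sample feed with titles M1..M<count>.
    public static string BuildSample(int count)
    {
        var sb = new StringBuilder("<channel>");
        for (int i = 1; i <= count; i++)
            sb.Append("<item><title>M" + i + "</title><link>http://example.com/" + i + "</link></item>");
        return sb.Append("</channel>").ToString();
    }

    public static void Main()
    {
        foreach (var c in ReadFiltered(new StringReader(BuildSample(25))))
            Console.WriteLine(c.Title + " " + c.Link);
    }
}
```

Only one item is held as an XElement at a time, so memory stays bounded while the filter reads like normal code. The price is one short-lived XElement tree per item, so this version is likely to be somewhat slower than the hand-written loop.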

XmlTextReaderImpl, the internal class that actually implements XmlTextReader (it cannot be used directly), is a huge class with tens of thousands of lines of code that encapsulates a large number of operations working directly on XML characters. Because it operates so close to the underlying layer, it is hard to find a good way to optimize the calling code at the macro level. If the filtering conditions, that is, the business logic, become even slightly more complex, the code will be unrecognizable, and you can imagine what happens to its maintainability and readability.

Now let's compare the time performance:

Method            Time
XmlDocument       26 ms
XPathNavigator    26 ms
XmlTextReader     20 ms
Xml Linq          28 ms

The figures for the four methods have moved closer together. The time consumed by Document and Navigator has dropped sharply. The Reader mode has not dropped much, because it still has to read the source from beginning to end; the reduced cost of object creation saves about 3 ms. Strangely, the Linq method has not changed, and it now comes last.

You can test different query conditions. You will find that each of the four methods has its own performance floor, which is related to the size of the XML source. For example, the first two methods are bounded by the execution time of XmlDocument.Load; on my local machine, loading the XML takes 23 ms. The Linq method is not completely constant either: when the result set is small, its execution time drops by 1 to 2 ms.
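To see how much of the total is spent on loading rather than on the query itself, the two phases can be timed separately. The following is a rough sketch using Stopwatch; the generated feed and the names LoadVsQueryTiming, BuildXml and Measure are assumptions for illustration, and the numbers it prints will differ from the article's figures depending on machine and data.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Xml;

public static class LoadVsQueryTiming
{
    // Hypothetical sample feed with titles M0..M<count-1>.
    public static string BuildXml(int count)
    {
        var sb = new StringBuilder("<channel>");
        for (int i = 0; i < count; i++)
            sb.Append("<item><title>M" + i + "</title></item>");
        return sb.Append("</channel>").ToString();
    }

    // Times DOM loading and XPath querying separately.
    public static (TimeSpan load, TimeSpan query, int hits) Measure(string xml)
    {
        var sw = Stopwatch.StartNew();
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        TimeSpan load = sw.Elapsed;

        sw.Restart();
        var nodes = doc.DocumentElement.SelectNodes(
            "item[substring(title,1,1)='M'][position() mod 10 = 0]");
        int hits = nodes.Count; // reading Count forces the query to be evaluated
        TimeSpan query = sw.Elapsed;

        return (load, query, hits);
    }

    public static void Main()
    {
        var result = Measure(BuildXml(10000));
        Console.WriteLine("load:  " + result.load.TotalMilliseconds + " ms");
        Console.WriteLine("query: " + result.query.TotalMilliseconds + " ms");
        Console.WriteLine("hits:  " + result.hits);
    }
}
```

A single run includes JIT compilation and other warm-up costs, so for a meaningful comparison it is better to repeat the measurement several times and discard the first result.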

With the Document and Navigator methods, performance degrades noticeably as the data volume grows. The reason is easy to guess: they create many objects that are never used. Look at the memory usage of each mode. When all data is loaded without filtering, the Document mode occupies about [figure missing] MB of memory, while the Navigator mode needs only about [figure missing] MB, which also explains why the Document mode degrades more visibly. The Reader mode, even when loading all the data, takes only about 1 MB of memory excluding the overhead of program startup, which is less than half of what the first two occupy. In terms of memory, the Linq method performs remarkably well, using less than 500 KB more than the Reader method.
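The article does not say how these memory figures were obtained. One rough way to get comparable numbers, shown here only as a sketch, is to compare GC.GetTotalMemory before and after building each object tree. The sample data and the names RetainedMemoryDemo, BuildXml and MeasureRetained are assumptions for illustration, and the results are approximate because the managed heap contains other allocations as well.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class RetainedMemoryDemo
{
    // Hypothetical sample feed with titles M0..M<count-1>.
    public static string BuildXml(int count)
    {
        var sb = new StringBuilder("<channel>");
        for (int i = 0; i < count; i++)
            sb.Append("<item><title>M" + i + "</title></item>");
        return sb.Append("</channel>").ToString();
    }

    // Approximate number of bytes still reachable after build() has run.
    public static long MeasureRetained(Func<object> build)
    {
        long before = GC.GetTotalMemory(true); // true: collect garbage first
        object tree = build();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(tree);                    // keep the tree alive until measured
        return after - before;
    }

    public static void Main()
    {
        string xml = BuildXml(20000);
        long dom = MeasureRetained(() =>
        {
            var d = new XmlDocument();
            d.LoadXml(xml);
            return d;
        });
        long linq = MeasureRetained(() => XDocument.Parse(xml));
        Console.WriteLine("XmlDocument: " + dom / 1024 + " KB");
        Console.WriteLine("XDocument:   " + linq / 1024 + " KB");
    }
}
```

This measures only the managed heap, not the working set of the process, so it should be read as a relative comparison between the two object models rather than as absolute memory usage.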

This leads to a further conclusion: unless there is a special need, use XmlTextReader with caution. It copes poorly with changing requirements and is error-prone. The Linq method is strongly recommended: although its time performance is slightly lower than that of the Navigator method in some cases, its outstanding memory usage makes it the first choice. And I believe LINQ to XML will become even more powerful in the future.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
