iOS, Part 16: XML Parsing of Network Data

Source: Internet
Author: User

Network applications have to parse the data they receive. Recent applications on both Android and iOS mostly use JSON, which is also the recommended format, so XML parsing has been somewhat forgotten.

I have recently been working on a small iOS app that fetches data from the network. Some websites provide their own API platforms, which generally let you choose the data format. However, when I looked at some RSS feeds, the data came back as XML, so XML parsing was unavoidable.

XML parsing itself has been around for a long time; it was used everywhere in the Java Web days. This article gives a general introduction to using it in iOS programming.

XML parsing generally comes in two modes, SAX and DOM, which are event-driven and document-driven respectively. Search online for the details; the two examples below should give you the general idea.

I. XML parsing: SAX parsing with NSXMLParser

Put simply, SAX is an event-driven parsing model. The parser reads the document from the beginning and reports what it encounters: first a start tag, then the tag's value, and finally the end tag once the value has been read.

A simple example:

<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Haha</title>

.......

The first line is a start tag; the xmlns declarations inside it can be viewed as attributes. </title> is an end tag, and the "Haha" between the start and end tags is the tag's value.

So how do we use this in iOS development?

The SDK provides a parser class, NSXMLParser.

- (BOOL)parser:(NSString *)string {
    NSXMLParser *par = [[[NSXMLParser alloc] initWithData:[string dataUsingEncoding:NSUTF8StringEncoding]] autorelease];
    [par setDelegate:self]; // set the delegate whose methods handle the parse events
    return [par parse];     // run the parse; the return value says whether it succeeded
}

#pragma mark xmlparser

// Step 1: parsing is about to start
- (void)parserDidStartDocument:(NSXMLParser *)parser {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
}

// Step 2: a start tag was read
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
}

// Step 3: content between the start and end tags
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    NSLog(@"%@", NSStringFromSelector(_cmd));
}

// Step 4: an end tag was read; the current node is complete
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    NSLog(@"%@", NSStringFromSelector(_cmd));
}

// Step 5: parsing has finished
- (void)parserDidEndDocument:(NSXMLParser *)parser {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
}

// Receives the data of a CDATA block
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
}

1. Initialize the parser and pass in the data you want to parse.

2. Call parse to start parsing; it returns a Boolean indicating whether parsing succeeded.

3. After that, the work is in the delegate methods for steps 1 to 5 implemented above.

The delegate methods simply follow the parser's progress through the document:

Step 1 prepares for parsing; barring surprises, it just runs -->

Step 2 reads a start tag; if the tag carries attributes, you can get them here. After the start tag, the parser moves into the value region -->

Step 3: for a simple node the value may be a plain string, but as the example shows, the value region of a node often contains another node -->

So this step really has two outcomes. If the content is a value, foundCharacters: delivers it, and step 4 runs at the end tag, where you take the value. If the content is a child node, step 2 runs again for that child.

In other words, the parser reads the XML like any other text, except that it recognizes <XXX> as a start tag and treats the closing ">" as the end of that tag.

What follows is the value region. If it is plain text, parser:foundCharacters: is called. If another <XXX> is read, that is another start tag -->

Step 4 runs each time an end tag is read, and step 5 runs when the end of the document is reached.
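The order of the callbacks described above can be observed directly. The following is a minimal sketch, assuming ARC and a hypothetical EventTracer class and TraceEvents function (neither is from the original code); it records each delegate call as a short string so the sequence can be inspected afterwards.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the article): records every NSXMLParser
// callback as a short string, in the order the parser fires them.
@interface EventTracer : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSString *> *events;
@end

@implementation EventTracer

- (instancetype)init {
    if ((self = [super init])) {
        _events = [NSMutableArray array];
    }
    return self;
}

- (void)parserDidStartDocument:(NSXMLParser *)parser {
    [self.events addObject:@"startDocument"];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    [self.events addObject:[@"start:" stringByAppendingString:elementName]];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Skip whitespace-only runs so the trace stays readable.
    NSString *trimmed = [string stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    if (trimmed.length > 0) {
        [self.events addObject:[@"chars:" stringByAppendingString:trimmed]];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    [self.events addObject:[@"end:" stringByAppendingString:elementName]];
}

- (void)parserDidEndDocument:(NSXMLParser *)parser {
    [self.events addObject:@"endDocument"];
}

@end

// Parses the given XML string and returns the recorded event trace.
static NSArray<NSString *> *TraceEvents(NSString *xml) {
    EventTracer *tracer = [[EventTracer alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = tracer; // the parser does not retain its delegate
    [parser parse];
    return tracer.events;
}
```

For an input such as <channel><title>Haha</title></channel>, the element events come out in document order (start:channel, start:title, end:title, end:channel), bracketed by the two document events, with one or more chars: entries between the title tags.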

The last method,

parser:foundCDATA:(NSData *)CDATABlock, handles content in the following format:

<content:encoded><![CDATA[<a href="http://www.douban.com/people/maldini/">weight reduction</a> comment: <a href="http://movie.douban.com/subject/6799191//">Search</a><br/> rating: Recommendation<br/>]]></content:encoded>
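A CDATA block arrives as NSData rather than NSString, so it has to be decoded before use. The sketch below assumes ARC and uses a hypothetical CDATACollector class and CDATAText function for illustration; it decodes each block as UTF-8 (the encoding Apple's documentation gives for this callback) and appends it to a buffer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the article): gathers the text of all
// CDATA blocks in a document into one string.
@interface CDATACollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation CDATACollector

- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    // The markup inside CDATA is not parsed; it is handed over as raw bytes.
    NSString *chunk = [[NSString alloc] initWithData:CDATABlock
                                            encoding:NSUTF8StringEncoding];
    if (chunk) {
        [self.text appendString:chunk]; // append, in case of several blocks
    }
}

@end

// Parses the XML string and returns the concatenated CDATA content.
static NSString *CDATAText(NSString *xml) {
    CDATACollector *collector = [[CDATACollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return [collector.text copy];
}
```

The tags inside the block (such as <a> and <br/>) come back as literal text, which is why feeds use CDATA to carry HTML fragments.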

Now that we have a general understanding of the methods and the process, take a recent example: reading a group of records, similar to an array in JSON.

<channel>
<title>I'm title</title>
<link>http://write.blog.csdn.net/postedit</link>
<description>...</description>
<language>zh-cn</language>
<pubDate>Fri, 03 Aug 2012 06:20:31 GMT</pubDate>
<item>...</item> <item>...</item> <item>...</item> <item>...</item>
<item>...</item> <item>...</item> <item>...</item> <item>...</item>
<item>...</item> <item>...</item> <item>...</item> <item>...</item>
<item>...</item> <item>...</item> <item>...</item> <item>...</item>
<item>...</item> <item>...</item> <item>...</item> <item>...</item>
</channel>

In general, the data we need is just these 20 items. Each item has the same three tags: title, author, and time. That is effectively an array, so how do we parse it?

1. Declare an array container to store the 20 objects. Each item object contains three elements, so a dictionary is a natural structure for one item.
2. Reserve at least two variables as "sentinels": one for the name of the node currently being processed and one for its value.
3. Each time an <item> start tag is read, initialize a new dictionary as described above.
4. Keep the latest value in the foundCharacters: method (there is actually a small flaw here, discussed below).
5. In the end-tag method, store the tag name and value as a pair in the dictionary initialized above.

#pragma mark xmlparser

// Step 1: parsing is about to start
- (void)parserDidStartDocument:(NSXMLParser *)parser {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
    parserObjects = [[NSMutableArray alloc] init];
}

// Step 2: a start tag was read
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict {
    // NSLog(@"%@", NSStringFromSelector(_cmd));
    self.currentText = [[NSMutableString alloc] init]; // fresh buffer for this tag
    [currentText release];
    if ([elementName isEqualToString:@"item"]) {
        // A new item begins: create its dictionary and add it to the result array
        NSMutableDictionary *newNode = [[NSMutableDictionary alloc] initWithCapacity:0];
        twitterDic = newNode;
        [parserObjects addObject:newNode];
        [newNode release];
    } else if (twitterDic) {
        // A tag inside an item: reserve a slot and remember the tag name
        NSMutableString *string = [[NSMutableString alloc] initWithCapacity:0];
        [twitterDic setObject:string forKey:elementName];
        [string release];
        currentElementName = elementName;
    }
}

// Step 3: content between the start and end tags
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    NSLog(@"%@", NSStringFromSelector(_cmd));
    [currentText appendString:string];
}

// Step 4: an end tag was read; the current node is complete
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"item"]) {
        twitterDic = nil;
    } else if ([elementName isEqualToString:currentElementName]) {
        if ([elementName isEqualToString:@"description"] || [elementName isEqualToString:@"content:encoded"]) {
            [twitterDic setObject:cdata forKey:currentElementName];
        } else {
            [twitterDic setObject:currentText forKey:currentElementName];
        }
    }
}

// Step 5: parsing has finished
- (void)parserDidEndDocument:(NSXMLParser *)parser {
}

// Receives the data of a CDATA block
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    cdata = [[NSString alloc] initWithData:CDATABlock encoding:NSUTF8StringEncoding];
}
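For comparison, here is a minimal sketch of the same item-collecting idea written for ARC, so the manual retain/release calls disappear. The ItemCollector class and ParseItems function are hypothetical names chosen for illustration, and the sketch assumes each item contains only flat child tags (no nesting inside title, author, and so on).

```objc
#import <Foundation/Foundation.h>

// Hypothetical ARC rewrite of the idea above: one dictionary per <item>,
// keyed by child tag name. Assumes the children of <item> are flat.
@interface ItemCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSDictionary<NSString *, NSString *> *> *items;
@property (nonatomic, strong) NSMutableDictionary<NSString *, NSString *> *currentItem; // nil outside <item>
@property (nonatomic, strong) NSMutableString *currentText;
@end

@implementation ItemCollector

- (instancetype)init {
    if ((self = [super init])) {
        _items = [NSMutableArray array];
        _currentText = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    if ([elementName isEqualToString:@"item"]) {
        self.currentItem = [NSMutableDictionary dictionary];
    }
    self.currentText = [NSMutableString string]; // fresh buffer per start tag
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May fire several times for one tag, so append instead of assigning.
    if (self.currentItem) {
        [self.currentText appendString:string];
    }
}

- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    NSString *chunk = [[NSString alloc] initWithData:CDATABlock
                                            encoding:NSUTF8StringEncoding];
    if (self.currentItem && chunk) {
        [self.currentText appendString:chunk];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"item"]) {
        if (self.currentItem) {
            [self.items addObject:[self.currentItem copy]];
        }
        self.currentItem = nil;
    } else if (self.currentItem) {
        NSString *value = [self.currentText stringByTrimmingCharactersInSet:
                           [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        self.currentItem[elementName] = value;
    }
    self.currentText = [NSMutableString string];
}

@end

// Parses an RSS-like string and returns one dictionary per <item>.
static NSArray<NSDictionary<NSString *, NSString *> *> *ParseItems(NSString *xml) {
    ItemCollector *collector = [[ItemCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return collector.items;
}
```

Tags outside <item>, such as the channel's own title, are skipped because currentItem is nil while they are being read.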

A few points about the code above:

1. I think there are several memory leaks.

2. Why not take the value directly with currentString = string? This is a problem I found in practice. Note the header comment on the delegate method -(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string:

// This returns the string of the characters encountered thus far. You may not necessarily get the longest character run. The parser reserves the right to hand these to the delegate as potentially many calls in a row to -parser:foundCharacters:

In other words, this method may be called several times to deliver the character data between the start and end of a single tag. The simplest example I ran into:

<copyright>&copy; 2012, douban.com.</copyright>

When this node was parsed, the method was called twice: the first call returned only "&", and the second returned "copy; 2012, douban.com.". So to get the complete value, append each string to a buffer.

3. Parsing speed. We only need the data inside item, so values outside item do not need to be stored; that is why I release the dictionary and test for it. To cut down on string appends in foundCharacters:, you could also add a flag to control it globally. In practice the gain from this optimization is minimal, and the memory still leaks inexplicably.

4. Treat the above as an outline of the idea only. The code was written in a hurry and has memory problems, so do not copy it as is. I will revise it in a few days into a proper RSS parsing template.

II. DOM parsing: the third-party TBXML library

The DOM parsing model is a tree structure: nodes, child nodes, sibling nodes, and so on.

In the end I abandoned this approach. The parser is so simplified and concise that it leaves too few entry points for control. It is like one-click optimization software (one-click cache clearing, configuration tuning, file classification, and so on): there is little left for a person to control. So when I parsed the feed above with it, all I could do was traverse the tree and store what I found. For XML parsing without demanding requirements, though, it is quite simple to use.

//- (void)recurrence:(TBXMLElement *)element {
//    // NSString *eleName = [TBXML elementName:element];
//    // NSString *eleText = [TBXML textForElement:element];
//    // if ([eleName isEqualToString:@"item"]) {
//    //     [self recurrence:element];
//    // }
//    do {
//        NSString *eleName = [TBXML elementName:element];
//        NSString *eleText = [TBXML textForElement:element];
//        // recursively process the subtree
//        if (element->firstChild) {
//            NSLog(@"<%@>:", eleName); // display the name of the element
//            [self recurrence:element->firstChild];
//        } else {
//            NSLog(@"<%@>: %@", eleName, eleText); // display the name and text of the element
//            TBXMLElement *parent = element->parentElement;
//            if ([[TBXML elementName:parent] isEqualToString:@"item"]) {
//                NLRssInfo *info = [[[NLRssInfo alloc] init] autorelease];
//                if ([eleName isEqualToString:@"title"]) {
//                    info.title = eleText;
//                }
//                [dataArr addObject:info];
//            }
//        }
//        // process the sibling tree by iteration
//    } while ((element = element->nextSibling));
//}

This is recursive traversal, a standard tree operation; there is plenty of material about it online.
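To make the tree model concrete without depending on TBXML, the following sketch builds a small node tree with Foundation's NSXMLParser and then walks it recursively. It is not TBXML's API: XMLNode, TreeBuilder, BuildTree, and CollectItemTitles are hypothetical names, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical in-memory node: name, text, children, and a parent link.
@interface XMLNode : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, strong) NSMutableString *text;
@property (nonatomic, strong) NSMutableArray<XMLNode *> *children;
@property (nonatomic, weak) XMLNode *parent; // weak to avoid a retain cycle
@end

@implementation XMLNode
- (instancetype)initWithName:(NSString *)name parent:(XMLNode *)parent {
    if ((self = [super init])) {
        _name = [name copy];
        _text = [NSMutableString string];
        _children = [NSMutableArray array];
        _parent = parent;
    }
    return self;
}
@end

// Turns the parser's events into a tree, which is the DOM idea in miniature.
@interface TreeBuilder : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) XMLNode *root;
@property (nonatomic, strong) XMLNode *current;
@end

@implementation TreeBuilder

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    XMLNode *node = [[XMLNode alloc] initWithName:elementName parent:self.current];
    if (self.current) {
        [self.current.children addObject:node];
    } else {
        self.root = node; // first element becomes the root
    }
    self.current = node;
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.current.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    self.current = self.current.parent; // step back up one level
}

@end

// Parses the XML string and returns the root of the resulting tree.
static XMLNode *BuildTree(NSString *xml) {
    TreeBuilder *builder = [[TreeBuilder alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = builder;
    [parser parse];
    return builder.root;
}

// Depth-first traversal: collects the text of every <title> directly under an <item>.
static void CollectItemTitles(XMLNode *node, NSMutableArray<NSString *> *titles) {
    if ([node.name isEqualToString:@"title"] &&
        [node.parent.name isEqualToString:@"item"]) {
        [titles addObject:[node.text copy]];
    }
    for (XMLNode *child in node.children) {
        CollectItemTitles(child, titles);
    }
}
```

The trade-off is the usual one for DOM: the whole document is held in memory, but once the tree exists you can query it in any order instead of reacting to events as they arrive.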

Another advantage of an open-source library is that the source code is available: just three classes in six files. If you are interested, have a look; most of the code appears to be written in C.