Parsing HTML Data

Last Update: 2014-08-16 Source: Internet

Author: User


This article parses HTML data with the open-source HTMLParser code, which consists of four files: HTMLNode.m, HTMLNode.h, HTMLParser.m, and HTMLParser.h.

The code can be found at this URL: https://github.com/

There are three steps to complete before parsing your data:

1: Add the libxml2 library to your project

2: Add /usr/include/libxml2 to the Header Search Paths

3: Add the open-source code to the project, and import the header file (HTMLParser.h) where you use it

Now we can start parsing the HTML data.

First, we download some HTML data. (This is only an example, so a simple synchronous download is used; in your own application, download asynchronously.)

NSURL *url = [NSURL URLWithString:@"http://vip.astro.sina.com.cn/iframe/astro/view/cancer/day/"];
NSString *htmlStr = [[NSString alloc] initWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
NSError *error = nil;
// Parse the HTML document: create an HTMLParser object from the downloaded string
HTMLParser *parser = [[HTMLParser alloc] initWithString:htmlStr error:&error];
if (error) {
    NSLog(@"%@", error);
    return;
}

This loads the HTML data into an HTMLParser object.
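For the asynchronous download recommended above, the following is a minimal sketch rather than the article's own code. It assumes ARC and NSURLSession (iOS 7 / OS X 10.9 or later); `FetchAndParse` is a hypothetical helper name, and the only HTMLParser call it uses is the `initWithString:error:` initializer already shown.

```objc
#import <Foundation/Foundation.h>
#import "HTMLParser.h"

// Hypothetical helper (not part of HTMLParser): download a page in the
// background, then hand the parsed document, or an error, to the caller.
static void FetchAndParse(NSURL *url,
                          void (^completion)(HTMLParser *parser, NSError *error)) {
    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithURL:url
                                    completionHandler:^(NSData *data,
                                                        NSURLResponse *response,
                                                        NSError *error) {
            if (error != nil || data == nil) {
                completion(nil, error);
                return;
            }
            // Assumption: the page is UTF-8, as in the synchronous example.
            NSString *html = [[NSString alloc] initWithData:data
                                                   encoding:NSUTF8StringEncoding];
            if (html == nil) {
                completion(nil, [NSError errorWithDomain:NSCocoaErrorDomain
                                                    code:NSFileReadInapplicableStringEncodingError
                                                userInfo:nil]);
                return;
            }
            NSError *parseError = nil;
            HTMLParser *parser = [[HTMLParser alloc] initWithString:html
                                                              error:&parseError];
            if (parseError != nil) {
                completion(nil, parseError);
                return;
            }
            completion(parser, nil);
        }];
    [task resume]; // data tasks start suspended
}
```

The completion block runs on a background queue, so any UI update based on the parsed nodes would need to be dispatched to the main queue.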

The content of the HTML data is shown below (the parts that are of no use to us have been removed):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>
<title> Cancer _ Daily Horoscope _ Constellation Channel _ Sina </title>
<link href="http://vip.astro.sina.com.cn/app/astro/css/mindcity_utf8.css" rel="stylesheet" type="text/css"/>
<body>
<div id="West"><div id="Middle2">
<div class="IoT"><div class="Left" id="weiboimg"><img src="http://image2.sina.com.cn/ast/2007index/tmp/star_php/cancer_b.gif" border="0" align="left"/><span> Cancer <em> ./ A- -/ A</em></span></div>
<cite>author Yu-Love Sina exclusive writer</cite></div>
<div class="Clear"></div>
<ul class="Daysnav">
<li class="Buton"><a href='/astro/view/cancer/day/20140816'> Today's horoscope </a></li>
<li class="Butof"><a href='/astro/view/cancer/day/20140817'> Tomorrow's horoscope </a></li>
<li class="Datea"> Valid date: the- ,- -</li>
</ul>
<div class="Tab">class="Clear"></div><div class="Tab"> -%</p></div><div class="Tab"> -%</p></div><div class="Clear"></div><div class="Tab">class="Tab">2</p></div><div class="Clear"></div><div class="Tab">class="Tab">class="Clear"></div></div><div class="Clear"></div>
<div class="lotconts"> Shape on the gorgeous let the inner also add sexy charm, in addition to intelligence together with the multiplier effect, not only beautiful, but also beautiful smart yo.
And today, for women, you will be able to get a good evaluation of your sensibility, such as handicrafts, and the need for skillful and thoughtful interest.
Hobbies have good works can be a premonition of the period. </div></div><!--Horoscope Content End--><div class="Clear"></div></div></div></body>
Now we can use the methods in HTMLParser to analyze the data step by step:
// Get the body part of the HTML
HTMLNode *node = [parser body];
// Starting from node, find the summary text
HTMLNode *sum = [node findChildOfClass:@"lotconts"]; // finds a node whose class attribute is "lotconts"
// Starting from node, find the valid date
HTMLNode *effectDate = [node findChildOfClass:@"Datea"];
// Starting from node, find the constellation name
HTMLNode *name = [node findChildTag:@"span"]; // finds a node whose tag is "span"
// Starting from name, find the constellation's date range
HTMLNode *time = [name findChildTag:@"em"];
// Get the link to the constellation picture
HTMLNode *image = [node findChildTag:@"img"];
NSString *pic = [image getAttributeNamed:@"src"];
NSURL *url1 = [NSURL URLWithString:pic];

To get the content of a node:
[name contents]; // returns a string; here the content is: Cancer
Similarly:
[sum contents]; // here the content is the horoscope text:
// Shape on the gorgeous let the inner also add sexy charm, in addition to intelligence together with the multiplier effect, not only beautiful, but also beautiful smart yo.
// And today, for women, you will be able to get a good evaluation of your sensibility, such as handicrafts, and the need for skillful and thoughtful interest.
// Hobbies have good works can be a premonition of the period.

Parsing HTML data is based on tags and attributes: we use the methods in HTMLParser to find the child nodes we need.
Remember that what we find in the end are child nodes; to get their content, we still have to call the -(NSString *)contents method.
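The methods used so far each return a single node. When several nodes match, a loop over all of them can be sketched as follows. This assumes ARC and that the HTMLNode.h in your copy declares the plural `findChildTags:` method alongside `findChildTag:`. `LinksInList` is a hypothetical helper name, not part of the library.

```objc
#import <Foundation/Foundation.h>
#import "HTMLParser.h"

// Hypothetical helper (not part of HTMLParser): collect the text and href
// of every <a> tag inside the first node whose class is listClass.
static NSArray *LinksInList(HTMLParser *parser, NSString *listClass) {
    NSMutableArray *links = [NSMutableArray array];
    HTMLNode *list = [[parser body] findChildOfClass:listClass];
    // If no node matches, list is nil and the loop body never runs.
    for (HTMLNode *anchor in [list findChildTags:@"a"]) {
        NSString *text = [anchor contents];
        NSString *href = [anchor getAttributeNamed:@"href"];
        if (text != nil && href != nil) {
            [links addObject:@{ @"text": text, @"href": href }];
        }
    }
    return links;
}
```

With the sample page above, calling LinksInList(parser, @"Daysnav") would be expected to pick up the "Today's horoscope" and "Tomorrow's horoscope" links. Their href values are relative, so they can be resolved against the page address with [NSURL URLWithString:href relativeToURL:url].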

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
