Parsing HTML Data

Source: Internet
Author: User

Parse HTML data with the open-source HTMLParser code: HTMLNode.m, HTMLNode.h, HTMLParser.m, HTMLParser.h

It can be found at this URL: https://github.com/

There are three steps to complete before parsing your data:

1: Add the libxml2 library to your project

2: Add /usr/include/libxml2 to the Header Search Paths

3: Add the open-source code to the project and import its header file

Now we can start parsing the HTML data.

First, we download some HTML data. (This is only an example, so a simple synchronous download is used; in your own application, use an asynchronous download.)

NSURL *url = [NSURL URLWithString:@"http://vip.astro.sina.com.cn/iframe/astro/view/cancer/day/"];
NSString *htmlStr = [[NSString alloc] initWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
NSError *error = nil;
// Parse the HTML document by creating an HTMLParser object
HTMLParser *parser = [[HTMLParser alloc] initWithString:htmlStr error:&error];
if (error) {
    NSLog(@"%@", error);
    return;
}

This puts the HTML data into an HTMLParser object.
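For the asynchronous download recommended above, one possible sketch uses NSURLSession. The helper name FetchAndParse and its completion-block signature are made up for this illustration and are not part of the HTMLParser code. The sketch also assumes the page is UTF-8 encoded, as in the synchronous example.

```objc
#import <Foundation/Foundation.h>
#import "HTMLParser.h" // from the open-source code added in step 3

// Hypothetical helper: downloads `url` without blocking the caller, then parses the HTML.
// The completion block runs on NSURLSession's background queue, not the main thread.
static void FetchAndParse(NSURL *url,
                          void (^completion)(HTMLParser *parser, NSError *error)) {
    NSURLSessionDataTask *task = [[NSURLSession sharedSession]
          dataTaskWithURL:url
        completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
            if (data == nil) {
                // Network failure: pass the download error through.
                completion(nil, error);
                return;
            }
            NSString *html = [[NSString alloc] initWithData:data
                                                   encoding:NSUTF8StringEncoding];
            if (html == nil) {
                // The bytes were not valid UTF-8 (assumed encoding).
                NSError *encodingError =
                    [NSError errorWithDomain:NSCocoaErrorDomain
                                        code:NSFileReadInapplicableStringEncodingError
                                    userInfo:nil];
                completion(nil, encodingError);
                return;
            }
            NSError *parseError = nil;
            HTMLParser *parser = [[HTMLParser alloc] initWithString:html
                                                              error:&parseError];
            completion(parseError ? nil : parser, parseError);
        }];
    [task resume]; // data tasks start suspended
}
```

A caller would pass the same NSURL as above and, inside the completion block, hop back to the main queue with dispatch_async(dispatch_get_main_queue(), ...) before touching any UI.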

The content of this HTML data is shown here (parts that are of no use to us have been removed):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title> Cancer _ Daily Horoscope _ Constellation Channel _ Sina </title>
<link href="http://vip.astro.sina.com.cn/app/astro/css/mindcity_utf8.css" rel="stylesheet" type="text/css"/>
ID="West">
<div id="Middle2">
<div class="IoT">
<div class="Left" id="weiboimg">
<img src="http://image2.sina.com.cn/ast/2007index/tmp/star_php/cancer_b.gif" border="0" align="left"/>
<span> Cancer <em> ./ A- -/ A</em></span>
</div>
author Yu-Love Sina exclusive writer</cite>
</div>
<div class="Clear"></div>
<ul class="Daysnav">
<li class="Buton"><a href='/astro/view/cancer/day/20140816'> Today's horoscope </a></li>
<li class="Butof"><a href='/astro/view/cancer/day/20140817'> Tomorrow's horoscope </a></li>
<li class="Datea"> Valid date: the- ,- -</li>
</ul>
<div class="Tab">class="Clear"></div>
<div class="Tab"> -%</p></div>
<div class="Tab"> -%</p></div>
<div class="Clear"></div>
<div class="Tab">class="Tab">2</p></div>
<div class="Clear"></div>
<div class="Tab">class="Tab">class="Clear"></div></div>
<div class="Clear"></div>
<div class="lotconts"> Shape on the gorgeous let the inner also add sexy charm, in addition to intelligence together with the multiplier effect, not only beautiful, but also beautiful smart yo.
And today, for women, you will be able to get a good evaluation of your sensibility, such as handicrafts, and the need for skillful and thoughtful interest.
Hobbies have good works can be a premonition of the period. </div>
</div>
<!--Horoscope Content End-->
<div class="Clear"></div>
</div>
</div>
</body>

Now we can use the methods in HTMLParser to parse the data step by step.

// Get the body part of the HTML
HTMLNode *node = [parser body];
// Starting from node, find the summary text
HTMLNode *sum = [node findChildOfClass:@"lotconts"]; // finds the node whose class attribute is "lotconts"
// Starting from node, find the valid date
HTMLNode *effectDate = [node findChildOfClass:@"Datea"];
// Starting from node, find the constellation name
HTMLNode *name = [node findChildTag:@"span"]; // finds the node whose tag is "span"

// Starting from name, find the constellation's date range
HTMLNode *time = [name findChildTag:@"em"];
// Get the link to the constellation picture
HTMLNode *image = [node findChildTag:@"img"];
NSString *pic = [image getAttributeNamed:@"src"];
NSURL *url1 = [NSURL URLWithString:pic];

Get the content of a node:
[name contents]; // returns a string; here the content is: Cancer
Similarly:
[sum contents]; // the content is: Shape on the gorgeous let the inner also add sexy charm, in addition to intelligence together with the multiplier effect, not only beautiful, but also beautiful smart yo.
And today, for women, you will be able to get a good evaluation of your sensibility, such as handicrafts, and the need for skillful and thoughtful interest.
Hobbies have good works can be a premonition of the period.

Parsing HTML data comes down to this: use the methods in HTMLParser to look up the child nodes we need by tag and by attribute.

Remember that what we find in the end are child nodes; to get their content, we still have to call the -(NSString *)contents method.
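The same tag-and-attribute lookup also works when several nodes match. The sketch below collects the href of each link in the "Daysnav" list from the sample page. It assumes the HTMLNode class in use offers a plural lookup named findChildTags: that returns an NSArray of matching nodes; check HTMLNode.h for the exact name before relying on it. The helper name DayLinks is made up for this illustration.

```objc
#import <Foundation/Foundation.h>
#import "HTMLParser.h"

// Hypothetical helper: returns the href of every <a> inside the "Daysnav" list.
static NSArray *DayLinks(HTMLParser *parser) {
    NSMutableArray *links = [NSMutableArray array];
    HTMLNode *nav = [[parser body] findChildOfClass:@"Daysnav"];
    // Assumption: findChildTags: returns all descendants with the given tag.
    // If nav is nil, messaging nil yields nil and the loop body never runs.
    for (HTMLNode *anchor in [nav findChildTags:@"a"]) {
        NSString *href = [anchor getAttributeNamed:@"href"];
        if (href.length > 0) {
            [links addObject:href];
        }
    }
    return links;
}
```

The links in the sample page are relative paths, so they would need to be resolved against the page URL, for example with [NSURL URLWithString:href relativeToURL:url], before being requested.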
