Parse HTML data in Windows Phone 7

Last Update:2018-12-05 Source: Internet

Author: User

In my previous article, I introduced GB2312 decoding on Windows Phone 7:

http://www.cnblogs.com/qingci/archive/2011/11/25/2263124.html

This article describes how to parse HTML data on Windows Phone 7 to extract the data you need.

First, let me introduce the class library HtmlAgilityPack (it was also used for decoding in the previous article). The library's DLL file is provided along with the demo.

I use Sina news as the example page to parse.

Let's take a look at a Sina news page:

http://news.sina.com.cn/w/sd/2011-11-27/070023531646.shtml

Then let's take a look at its source.

The structure of the news content is found to be:


Most of the elements we need on this page have an ID attribute, which makes them easier to parse.

Now let's start parsing.

First: add a reference to the HtmlAgilityPack.dll file.

Second: use the WebClient or WebRequest class to download the HTML page and read it into a string.

        // Requires: using System; using System.IO; using System.Net;
        //           using System.Threading; using System.Windows;
        // DownloadEventArgs is a custom event-args class from the demo project.
        public delegate void CallbackEvent(object sender, DownloadEventArgs e);
        public event CallbackEvent DownloadCallbackEvent;

        public void HttpWebRequestDownloadGet(string url)
        {
            Thread _thread = new Thread(delegate()
            {
                Uri _uri = new Uri(url, UriKind.RelativeOrAbsolute);
                HttpWebRequest _httpWebRequest = (HttpWebRequest)WebRequest.Create(_uri);
                _httpWebRequest.Method = "GET";

                _httpWebRequest.BeginGetResponse(new AsyncCallback(delegate(IAsyncResult result)
                {
                    HttpWebRequest _httpWebRequestCallback = (HttpWebRequest)result.AsyncState;
                    HttpWebResponse _httpWebResponseCallback = (HttpWebResponse)_httpWebRequestCallback.EndGetResponse(result);
                    Stream _streamCallback = _httpWebResponseCallback.GetResponseStream();

                    // Decode the response as GB2312 (the encoding class from the previous article)
                    StreamReader _streamReader = new StreamReader(_streamCallback, new HtmlAgilityPack.Gb2312Encoding());
                    string _stringCallback = _streamReader.ReadToEnd();

                    // Raise the event on the UI thread so subscribers can update controls
                    Deployment.Current.Dispatcher.BeginInvoke(new Action(() =>
                    {
                        if (DownloadCallbackEvent != null)
                        {
                            DownloadEventArgs _downloadEventArgs = new DownloadEventArgs();
                            // Note: the stream has already been read to the end at this point
                            _downloadEventArgs._DownloadStream = _streamCallback;
                            _downloadEventArgs._DownloadString = _stringCallback;
                            DownloadCallbackEvent(this, _downloadEventArgs);
                        }
                    }));
                }), _httpWebRequest);
            });
            _thread.Start();
        }
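
The download method above passes its results through a DownloadEventArgs type that this article does not show (it ships with the demo project). As a rough guide only, here is a minimal sketch of what such a class might look like. The two member names are taken from the code above; everything else, including the use of auto-properties, is an assumption and may differ from the demo.

```csharp
using System;
using System.IO;

// Hypothetical reconstruction of the demo's event-args class.
// Only the member names _DownloadStream and _DownloadString come from the article.
public class DownloadEventArgs : EventArgs
{
    // The raw response stream (already consumed once the string has been read).
    public Stream _DownloadStream { get; set; }

    // The decoded HTML text of the page.
    public string _DownloadString { get; set; }
}
```

A subscriber would then read e._DownloadString in its DownloadCallbackEvent handler.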

My version is a bit complicated. In short, all it does is download the HTML data.

Here is a simpler download method:

          WebClient webClient = new WebClient();

          webClient.Encoding = new HtmlAgilityPack.Gb2312Encoding(); // Set the encoding used to decode the response

          // Attach the handler before starting the download
          webClient.DownloadStringCompleted += new DownloadStringCompletedEventHandler(webClient_DownloadStringCompleted);

          webClient.DownloadStringAsync(new Uri("http://news.sina.com.cn/s/2011-11-25/120923524756.shtml", UriKind.RelativeOrAbsolute));

Now process the downloaded HTML in the callback function (e.Result when using WebClient, or e._DownloadString when using the event shown earlier):
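
The DownloadStringCompleted handler registered above is not shown in the article. The sketch below is one way it might be written; the class name SimpleDownloader and the OnHtml / OnError callbacks are hypothetical names introduced only for illustration. The point it shows is that e.Result should be read only after checking e.Cancelled and e.Error, because accessing Result on a failed request throws. The GB2312 encoding line is left out here so that the sketch depends only on System.Net.

```csharp
using System;
using System.Net;

// Hypothetical wrapper, not part of the original demo.
public class SimpleDownloader
{
    public Action<string> OnHtml;       // called with the downloaded HTML
    public Action<Exception> OnError;   // called when the request fails

    public void Start(string url)
    {
        WebClient client = new WebClient();
        // Subscribe first, then start the asynchronous download.
        client.DownloadStringCompleted += Completed;
        client.DownloadStringAsync(new Uri(url, UriKind.Absolute));
    }

    private void Completed(object sender, DownloadStringCompletedEventArgs e)
    {
        if (e.Cancelled) return;
        if (e.Error != null)
        {
            if (OnError != null) OnError(e.Error);
            return;
        }
        // Safe to read Result only when there was no error or cancellation.
        if (OnHtml != null) OnHtml(e.Result);
    }
}
```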

            string _result = e._DownloadString;

            HtmlDocument _doc = new HtmlDocument(); // Instantiate an HtmlAgilityPack.HtmlDocument object
            _doc.LoadHtml(_result); // Load the HTML

            HtmlNode _htmlNode01 = _doc.GetElementbyId("artibodyTitle"); // Element holding the news title
            string _title = _htmlNode01.InnerText;

            HtmlNode _htmlNode02 = _doc.GetElementbyId("artibody"); // Element holding the news content
            string _content = _htmlNode02.InnerText;
            // Keep only the text before ".blkComment", if that marker is present
            int _divIndex = _content.IndexOf(".blkComment");
            if (_divIndex >= 0)
            {
                _content = _content.Substring(0, _divIndex);
            }

            #region Sina tags
            HtmlNode _htmlNode03 = _doc.GetElementbyId("art_source");
            string _www = _htmlNode03.FirstChild.InnerText;
            string _wwwInt = _htmlNode03.FirstChild.Attributes[0].Value;
            #endregion

            #region Release time
            HtmlNode _htmlNode04 = _doc.GetElementbyId("pub_date");
            string _pub_date = _htmlNode04.InnerText;
            #endregion

            #region Source site information
            HtmlNode _htmlNode05 = _doc.GetElementbyId("media_name");
            string _media_name = _htmlNode05.FirstChild.InnerText;
            string _media_source = _htmlNode05.FirstChild.Attributes[0].Value; // Assumes the link URL is the first attribute
            #endregion

            Media_nameHyperlinkButton.Content = _pub_date + " " + _media_name;
            Media_nameHyperlinkButton.NavigateUri = new Uri(_media_source, UriKind.RelativeOrAbsolute);
            TitleTextBlock.Text = _title;
            ContentTextBlock.Text = _content;
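
The parsing code above assumes every element exists and that the link URL is the first attribute of the first child. If the page layout changes, GetElementbyId returns null and the code throws. The following is a minimal sketch of two defensive helpers; NodeHelpers, TextById and FirstLinkHref are hypothetical names, and the sketch assumes the HtmlAgilityPack build you reference exposes Descendants, GetAttributeValue and HtmlEntity.DeEntitize, as the desktop builds do.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helpers, not part of the original demo.
public static class NodeHelpers
{
    // Returns the trimmed, entity-decoded text of the element with the given id,
    // or an empty string when the element is missing.
    public static string TextById(HtmlDocument doc, string id)
    {
        HtmlNode node = doc.GetElementbyId(id);
        if (node == null) return string.Empty;
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    // Returns the href of the first <a> inside the element with the given id,
    // looked up by attribute name instead of by position.
    public static string FirstLinkHref(HtmlDocument doc, string id)
    {
        HtmlNode node = doc.GetElementbyId(id);
        if (node == null) return string.Empty;
        HtmlNode link = node.Descendants("a").FirstOrDefault();
        return link == null ? string.Empty : link.GetAttributeValue("href", string.Empty);
    }
}
```

With these, a call such as NodeHelpers.TextById(_doc, "pub_date") returns an empty string instead of throwing when the element is absent.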

The result looks like this:

Most tags on web pages have no ID attribute, but fortunately HtmlAgilityPack supports XPath.

In that case, you find the matching nodes with an XPath expression.

XPath tutorial: http://www.w3school.com.cn/xpath/index.asp
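
To make the XPath approach concrete, here is a small sketch that selects links inside a div by class name. The class name "headline" and the names XPathExamples, LinkTextsByXPath and LinkTextsByLinq are made up for illustration and are not taken from the Sina page. It assumes your HtmlAgilityPack build exposes SelectNodes; some phone builds may not, so a LINQ version using Descendants is shown as an alternative.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical examples, not part of the original demo.
public static class XPathExamples
{
    // XPath version: text of every <a> directly inside <div class="headline">.
    public static List<string> LinkTextsByXPath(HtmlDocument doc)
    {
        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='headline']/a");
        if (nodes == null) return result; // SelectNodes returns null when nothing matches
        foreach (HtmlNode a in nodes)
        {
            result.Add(a.InnerText.Trim());
        }
        return result;
    }

    // LINQ version of the same query, for builds without XPath support.
    public static List<string> LinkTextsByLinq(HtmlDocument doc)
    {
        return doc.DocumentNode.Descendants("div")
            .Where(d => d.GetAttributeValue("class", string.Empty) == "headline")
            .SelectMany(d => d.Elements("a"))
            .Select(a => a.InnerText.Trim())
            .ToList();
    }
}
```

Both methods should return the same list for a given document; the XPath form is shorter, while the LINQ form avoids the dependency on XPath support.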

Case download:

http://115.com/file/dn87dl2d#
MyFramework_Test.zip

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
