HtmlAgilityPack: Summary of an Enterprise Data Import (HTML Parsing)

Source: Internet
Author: User

Some time ago, the company was copying data from the weighbridge computer room at the aluminum plant and entering it into its own system, and wanted the weighbridge data imported automatically. I went to take a look and found that the export file format is HTML. After some searching, I found HtmlAgilityPack, which turned out to be a great tool for the job.

The code is messy and the approach is not very polished, but the end result works well. 2015-01-25

The file to be imported keeps its data in the cells of an HTML table; the original post showed one sample row at this point.

I won't repeat the steps for downloading and referencing the DLL; a quick Baidu search covers them.

First, declare an HtmlAgilityPack.HtmlDocument object and load the HTML data file to be imported into it. (Many web-scraping programs instead download the HTML into memory and then parse it with HtmlAgilityPack.) Then select the nodes for the tag you want. In my case that is <font></font>, so I call SelectNodes("//font[@*]"), which matches every font element that has at least one attribute. You can write your own XPath, or several of them, as needed. All the matching font elements are read into an HtmlNodeCollection, and from there it is up to you to pull out the data you need.

    using System.Text;
    using HtmlAgilityPack;

    // HtmlAgilityPack loads the HTML into an HtmlDocument
    HtmlAgilityPack.HtmlDocument hd = new HtmlAgilityPack.HtmlDocument();

    hd.Load(filePath, Encoding.UTF8);

    HtmlNode rootNode = hd.DocumentNode;

    // Get the matching nodes based on XPath (SelectNodes returns null if nothing matches)
    HtmlNodeCollection categoryNodeList = rootNode.SelectNodes("//font[@*]");
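The parenthetical in the first step mentions parsing HTML that is already in memory rather than in a file. A minimal sketch of that variant follows, assuming the HtmlAgilityPack package is referenced. The sample markup, the class name InMemoryExample, and the method SelectFontNodes are hypothetical and only stand in for the real weighbridge export, which is not reproduced here.

```csharp
using System;
using HtmlAgilityPack;

public static class InMemoryExample
{
    // Hypothetical sample markup, not the actual export file.
    private const string SampleHtml =
        "<table><tr><td><font size=\"2\">Number of cars</font></td>" +
        "<td><font size=\"2\">12</font></td></tr></table>";

    // Parses an HTML string and returns the font elements that carry
    // at least one attribute, using the same XPath as the step above.
    public static HtmlNodeCollection SelectFontNodes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse a string instead of loading a file

        // SelectNodes returns null when nothing matches.
        return doc.DocumentNode.SelectNodes("//font[@*]");
    }

    public static void Main()
    {
        HtmlNodeCollection nodes = SelectFontNodes(SampleHtml);
        if (nodes == null)
        {
            Console.WriteLine("No matching <font> nodes found.");
            return;
        }

        // Print the trimmed text of each matched node.
        foreach (HtmlNode node in nodes)
        {
            Console.WriteLine(node.InnerText.Trim());
        }
    }
}
```

Checking for null before iterating matters here, because an XPath that matches nothing would otherwise cause a NullReferenceException in the loop.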

Second, a brief look at how to traverse the node collection to get the data you need.

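A foreach loop works best for the traversal. Here the value sits in the node right after the one containing the label, so the code finds the label and reads the next node.

    // Get the total number of cars imported
    // (countTemp is an int declared elsewhere in the original program)
    foreach (HtmlNode item in categoryNodeList)
    {
        if (item.InnerText.Contains("Number of cars"))
        {
            // The count is the text of the node that follows the label node
            countTemp = Int32.Parse(categoryNodeList[categoryNodeList.IndexOf(item) + 1].InnerText.Trim());
            break;
        }
    }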

Summary

To get this working I tried several other libraries, such as SgmlReaderDll and Winista.HtmlParser, but none of them matched HtmlAgilityPack in efficiency, applicability, or features. I also tried regular expressions, but the HTML was too complex and I could not write a pattern for it; I took the lazy route, so please go easy on me. The libraries I downloaded are attached: http://pan.baidu.com/s/1bno4SUF
