Lately I often need to download things, and each download sends me through several layers of redirect pages, every one of them stuffed with ads. That got annoying, so I wrote a small one-click tool that goes straight to the final link. It is written in C#: it fetches the content of a web page, uses HtmlAgilityPack to read the href of an a tag, and repeats this page by page until it reaches the final address.
Below is a short introduction to using HtmlAgilityPack. I pieced this approach together from many articles on the web, because after a lot of searching I could not find any documentation for HtmlAgilityPack ...
First, add the using directive: using HtmlAgilityPack;
Code snippet:
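string url = "http://www.baidu.com/";
HtmlWeb htmlWeb = new HtmlWeb();
// Download the page and parse it into a document.
HtmlDocument htmlDoc = htmlWeb.Load(url);
// Select every matching input node with an XPath expression.
HtmlNodeCollection hnc = htmlDoc.DocumentNode.SelectNodes("//div[@class='s_btn_wr']//input[@class='s_btn']");
// Read the value attribute of the last matching node.
string text = hnc[hnc.Count - 1].Attributes["value"].Value;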
Points:
HtmlWeb.Load: takes a URL and loads the content of that page into the htmlDoc object.
HtmlNodeCollection: a collection of nodes. Its Count property gives the number of nodes, so hnc[hnc.Count - 1] is the last node.
htmlDoc.DocumentNode.SelectNodes: selects all matching nodes. The argument is an XPath expression; you can look up the syntax on w3school. The expression here selects the input nodes whose class equals s_btn inside a div whose class equals s_btn_wr. Note that SelectNodes returns null, not an empty collection, when nothing matches.
hnc[hnc.Count - 1].Attributes["value"].Value: gets the value of the last node's value attribute.
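The points above can be tried without touching the network. The following is only a minimal sketch: the class name LastButtonValue, the method GetLastValue, and the inline HTML are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced. It parses a string with HtmlDocument.LoadHtml instead of HtmlWeb.Load, and it checks for the null that SelectNodes returns when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

public static class LastButtonValue
{
    // Hypothetical helper: returns the "value" attribute of the last node
    // matched by xpath, or null when nothing matches.
    public static string GetLastValue(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse a string instead of downloading a page

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null || nodes.Count == 0)
        {
            return null; // SelectNodes returns null when there is no match
        }

        // GetAttributeValue falls back to the given default
        // if the attribute is missing.
        return nodes[nodes.Count - 1].GetAttributeValue("value", string.Empty);
    }

    public static void Main()
    {
        // Made-up markup that mirrors the class names used in the snippet.
        string html =
            "<div class='s_btn_wr'>" +
            "<input class='s_btn' value='first'>" +
            "<input class='s_btn' value='second'>" +
            "</div>";
        string xpath = "//div[@class='s_btn_wr']//input[@class='s_btn']";

        Console.WriteLine(GetLastValue(html, xpath));
    }
}
```

Checking for null before indexing avoids a NullReferenceException on pages where the expected node is absent, which is common when a site changes its layout.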
Once you have a basic grasp of the above, you should be able to build most of what you need.
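As a rough illustration of the layer-by-layer jump described at the start, here is one way the loop might look. This is a sketch under assumptions, not the original tool: LinkChain, Follow, the fetchHtml hook, the example.com pages, and the XPath expressions are all hypothetical. The fetchHtml parameter stands in for the download step so the example runs offline; against real pages you could load each address with HtmlWeb.Load instead.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkChain
{
    // Hypothetical helper: for each XPath in order, fetch the current page,
    // take the href of the first matching <a>, and move on to that address.
    public static string Follow(
        string startUrl,
        IEnumerable<string> xpaths,
        Func<string, string> fetchHtml)
    {
        var current = new Uri(startUrl);
        foreach (string xpath in xpaths)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(fetchHtml(current.AbsoluteUri));

            // SelectSingleNode returns null when nothing matches.
            HtmlNode link = doc.DocumentNode.SelectSingleNode(xpath);
            if (link == null)
            {
                throw new InvalidOperationException(
                    "No node matched " + xpath + " on " + current);
            }

            string href = link.GetAttributeValue("href", string.Empty);
            if (href.Length == 0)
            {
                throw new InvalidOperationException(
                    "Matched node has no href on " + current);
            }

            // Resolve relative hrefs against the page they were found on.
            current = new Uri(current, href);
        }
        return current.AbsoluteUri;
    }

    public static void Main()
    {
        // Fake pages stand in for real downloads.
        var pages = new Dictionary<string, string>
        {
            ["http://example.com/a"] = "<a class='next' href='/b'>go</a>",
            ["http://example.com/b"] = "<a class='dl' href='files/tool.zip'>download</a>",
        };

        string finalUrl = Follow(
            "http://example.com/a",
            new[] { "//a[@class='next']", "//a[@class='dl']" },
            url => pages[url]);

        Console.WriteLine(finalUrl);
    }
}
```

Passing one XPath per layer keeps the number of jumps bounded, and resolving each href with the Uri constructor handles pages that use relative links.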
If you repost this article, please credit the source. Original link: http://fengyu.name/?cat=coding&id=294
[C#] Fetching web content and using the HTML parser HtmlAgilityPack