An Introduction to AngleSharp, an HTML Parsing Tool
AngleSharp is a .NET (C#) library, distributed as a DLL, developed specifically for parsing HTML and XHTML source code.
Project address: https://github.com/FlorianRappl/AngleSharp
This post mainly introduces some commonly used AngleSharp methods, using the http://www.cnblogs.com site as the example. Other similar components are:
Domestic (China): Jumony
GitHub address: https://github.com/Ivony/Jumony
Author's blog address: http://www.cnblogs.com/Ivony/
Abroad: Html Agility Pack
Project Address: http://htmlagilitypack.codeplex.com/
You can look up the differences among the three, and how their performance compares, on your own. Next, we turn to the main subject, AngleSharp.
Add AngleSharp to the project by running the following command with the NuGet tool:
Install-Package AngleSharp
Then add the using directive in your project:
using AngleSharp;
First, we get the HTML source code of the cnblogs home page:
using System.IO;
using System.Net;
using System.Text;

public static string GetHtml()
{
    HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://www.cnblogs.com");
    // The using statements dispose the response, stream, and reader when done.
    using (HttpWebResponse response = (HttpWebResponse)myReq.GetResponse())
    // Get the stream associated with the response.
    using (Stream receiveStream = response.GetResponseStream())
    // Pipe the stream to a higher-level stream reader with the required encoding.
    using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
    {
        return readStream.ReadToEnd();
    }
}
Next, get the titles of all blog posts currently on the cnblogs home page:
// Requires: using System; using System.Linq; using AngleSharp;
private static void Main(string[] args)
{
    // Find all article titles.
    string cnblogsHtml = GetHtml();

    // Load the HTML.
    var document = DocumentBuilder.Html(cnblogsHtml);

    // You must use == here, not Equals (likely because ClassName can be
    // null, and calling Equals on null would throw).
    var titleItemList = document.All.Where(m => m.ClassName == "titlelnk");

    int iIndex = 1;
    foreach (var element in titleItemList)
    {
        Console.WriteLine(iIndex + ":" + element.InnerHtml);
        iIndex++;
    }
}
The code above outputs:
1:jndi Learning Summary (iii)--TOMCAT using Druid configuration Jndi data source
2: How our front end communicates with the designer
3:mvc5+ef6 Getting Started complete tutorial Six
4: A tentative discussion on the difference between throttle and debounce auxiliary functions in common JavaScript libraries
5: Lonely walk through the young
6: Last week's Hot Review (11.10-11.16)
7:android Animation-Tween (Tween) animation
8: Python implementation of naive Bayesian algorithm
9:mvc three cascading methods
10:c# label (barcode) printing and Design (i.)
11:opencascade make Primitives-box
12: Two-level index for HBASE implementation based on SOLR
13: (16) Discussion on the problems caused by offset compensation in Webgis
14:javascript Games-Life Games
15:android Animation-Frame animation
16:c# Socket Learning Note one
17:lua table Sorting
18:zookeeper Series First article: Zookeeper Quick Start
19: "Plugin development"--9 editor code block shading-highlighting!
20: University of Washington Computer Vision Course Note (i)
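The same titles can also be picked out with a CSS selector instead of filtering document.All. The following is only a minimal sketch, assuming a more recent AngleSharp release (0.10 or later), where HtmlParser.ParseDocument takes the place of the DocumentBuilder.Html call used above. The inline HTML string and the class name TitleSelectorSketch are made up for illustration and stand in for the real cnblogs page.

```csharp
using System;
using AngleSharp.Html.Parser;

class TitleSelectorSketch
{
    static void Main()
    {
        // Hypothetical markup standing in for the cnblogs home page.
        string html = "<div><a class=\"titlelnk\" href=\"#\">First post</a>"
                    + "<a class=\"other\" href=\"#\">Not a title</a>"
                    + "<a class=\"titlelnk\" href=\"#\">Second post</a></div>";

        // Assumption: AngleSharp 0.10+ parser API.
        var parser = new HtmlParser();
        var document = parser.ParseDocument(html);

        // "a.titlelnk" matches <a> elements whose class list contains "titlelnk".
        int index = 1;
        foreach (var element in document.QuerySelectorAll("a.titlelnk"))
        {
            Console.WriteLine(index + ":" + element.TextContent);
            index++;
        }
    }
}
```

With the sample markup above, this prints "1:First post" and "2:Second post". A selector also sidesteps the null check on ClassName, since elements without a class attribute simply do not match.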
The official project provides detailed documentation and examples, which are worth reading. The biggest advantage of this library is that it supports JavaScript, LINQ syntax, ID and class selectors, dynamically adding nodes, and XPath syntax. It is a tool genuinely built for .NET development.
AngleSharp documentation: https://github.com/FlorianRappl/AngleSharp/wiki/Documentation
AngleSharp examples (demo): https://github.com/FlorianRappl/AngleSharp/wiki/Examples
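Two of the features mentioned above, LINQ queries over selected elements and dynamically adding nodes, might look roughly like this. This is a sketch rather than the official example: it assumes AngleSharp 0.10 or later, and the markup, the "posts" ID, and the class name LinqAndNodesSketch are invented for illustration.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser;

class LinqAndNodesSketch
{
    static void Main()
    {
        // Hypothetical markup used only for illustration.
        string html = "<ul id=\"posts\"><li class=\"post\">Alpha</li>"
                    + "<li class=\"post\">Beta</li></ul>";
        var document = new HtmlParser().ParseDocument(html);

        // LINQ over the elements matched by a class selector.
        var upper = document.QuerySelectorAll("li.post")
                            .Select(li => li.TextContent.ToUpperInvariant())
                            .ToList();
        Console.WriteLine(string.Join(",", upper)); // ALPHA,BETA

        // ID selector, then dynamically add a new node to the list.
        var list = document.QuerySelector("#posts");
        var item = document.CreateElement("li");
        item.ClassName = "post";
        item.TextContent = "Gamma";
        list.AppendChild(item);

        Console.WriteLine(list.ChildElementCount); // 3
        Console.WriteLine(list.OuterHtml);         // serialized markup, including the new <li>
    }
}
```

The query results behave like any other enumerable sequence, so the usual LINQ operators (Where, Select, ToList) apply directly.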