Introduction to AngleSharp, a DLL component used to parse xHTML source code
AngleSharp is a .NET (C#) library designed specifically for parsing (X)HTML source code.
Project address: https://github.com/FlorianRappl/AngleSharp
This article introduces some commonly used AngleSharp methods, using the CnBlogs homepage as the working example. Other similar components include:
From China: Jumony
GitHub address: https://github.com/Ivony/Jumony
International: Html Agility Pack
Address: http://htmlagilitypack.codeplex.com/
You can search online for comparisons of the features and performance of the three libraries. The rest of this article focuses on what AngleSharp does.
Add AngleSharp to the project by running the following command in the NuGet Package Manager Console:
Install-Package AngleSharp
Then add the directive using AngleSharp; at the top of the source file.
First, obtain the HTML source code of the CnBlogs homepage.
using System.IO;
using System.Net;
using System.Text;

public static string GetHtml()
{
    HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://www.bkjia.com");
    HttpWebResponse response = (HttpWebResponse)myReq.GetResponse();
    // Get the stream associated with the response.
    Stream receiveStream = response.GetResponseStream();
    // Pipe the stream to a higher-level stream reader with the required encoding.
    StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);
    return readStream.ReadToEnd();
}
Next, get the titles of all current blog posts on the homepage:
using System;
using System.Linq;
using AngleSharp;

private static void Main(string[] args)
{
    // Find the titles of all articles.
    string cnblogsHtml = GetHtml();
    // Load the HTML.
    var document = DocumentBuilder.Html(cnblogsHtml);
    // Compare with == here, not Equals: ClassName may be null for elements
    // without a class attribute, and calling Equals on null throws.
    var titleItemList = document.All.Where(m => m.ClassName == "titlelnk");
    int iIndex = 1;
    foreach (var element in titleItemList)
    {
        Console.WriteLine(iIndex + ":" + element.InnerHtml);
        iIndex++;
    }
}
Output of the code above:
1: JNDI learning Summary (iii) -- use Druid in Tomcat to configure the JNDI data source
2: How do we communicate with the designer at the front end
3: MVC5 + EF6 full tutorial 6
4: differences between throttle and debounce auxiliary functions in common Javascript class libraries
5: Walking Alone
6: last week's hot spot Review (11.10-11.16)
7: Android animation-Tween) animation
8: python Implementation of Naive Bayes algorithm
9: MVC three-layer Cascade Method
10: C # label (Bar Code) printing and design (a)
11: OpenCASCADE Make Primitives-Box
12: implementation of hbase secondary index based on solr
13: (16) problems caused by Offset compensation in WebGIS
14: javascript games-life game
15: Android animation-Frame Animation
16: C # Socket Study Notes
17: lua table sorting
18: ZooKeeper Series 1: ZooKeeper Quick Start
19: [plug-in Development] -- 9 editor code block coloring-highlighted!
20: University of Washington Computer Vision Course Notes (1)
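The same title lookup can also be written with a CSS class selector instead of filtering document.All by hand. The sketch below is only an illustration under some assumptions: it targets AngleSharp 0.10 or later, where, as far as I know, the DocumentBuilder class is no longer available and HtmlParser.ParseDocument is used instead. The TitleExtractor class, the GetTitles method, and the sample markup are made up for this example, and the HTML is supplied inline so that nothing is downloaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using AngleSharp.Html.Parser; // assumption: AngleSharp 0.10 or later

public static class TitleExtractor
{
    // Hypothetical helper: returns the trimmed text of every element
    // whose class list contains "titlelnk".
    public static List<string> GetTitles(string html)
    {
        var parser = new HtmlParser();
        var document = parser.ParseDocument(html);

        // ".titlelnk" is a CSS class selector; the result can be used with LINQ.
        return document.QuerySelectorAll(".titlelnk")
                       .Select(e => e.TextContent.Trim())
                       .ToList();
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        const string sample =
            "<div>" +
            "<a class=\"titlelnk\" href=\"#\">First post</a>" +
            "<a class=\"other\" href=\"#\">Not a title</a>" +
            "<a class=\"titlelnk\" href=\"#\">Second post</a>" +
            "</div>";

        var titles = GetTitles(sample);
        for (int i = 0; i < titles.Count; i++)
        {
            Console.WriteLine((i + 1) + ": " + titles[i]);
        }
    }
}
```

Note one small difference from the LINQ version: the class selector also matches elements that carry additional classes (for example class="titlelnk hot"), whereas comparing ClassName with == only matches when the attribute value is exactly "titlelnk".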
The official site provides detailed documentation and examples that are worth a look. The biggest advantages of this library are its support for JavaScript, LINQ syntax, ID and class selectors, and adding nodes dynamically. It is a powerful tool for .NET development.
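To make the ID selector, class selector, and dynamic node features a little more concrete, here is a minimal sketch. It again assumes AngleSharp 0.10 or later, and the SelectorDemo class, the AddPost method, and the sample markup are hypothetical names invented for this illustration rather than anything from the AngleSharp documentation.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser; // assumption: AngleSharp 0.10 or later

public static class SelectorDemo
{
    // Made-up markup used only for this illustration.
    private const string Sample =
        "<html><body><ul id=\"posts\">" +
        "<li class=\"post\">Alpha</li>" +
        "<li class=\"post\">Beta</li>" +
        "</ul></body></html>";

    // Hypothetical helper: appends a new <li class="post"> to the list
    // and returns the text of every post item afterwards.
    public static string[] AddPost(string text)
    {
        var document = new HtmlParser().ParseDocument(Sample);

        // ID selector: find the <ul id="posts"> element.
        var list = document.QuerySelector("#posts");

        // Dynamic node: create a new element and attach it to the tree.
        var item = document.CreateElement("li");
        item.ClassName = "post";
        item.TextContent = text;
        list.AppendChild(item);

        // Class selector combined with LINQ.
        return document.QuerySelectorAll(".post")
                       .Select(li => li.TextContent)
                       .ToArray();
    }

    public static void Main()
    {
        // Prints "Alpha, Beta, Gamma".
        Console.WriteLine(string.Join(", ", AddPost("Gamma")));
    }
}
```

The new node shows up in later queries because QuerySelectorAll runs against the live document tree, not against the original HTML string.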
AngleSharp documentation: https://github.com/FlorianRappl/AngleSharp/wiki/Documentation
AngleSharp example (Demo): https://github.com/FlorianRappl/AngleSharp/wiki/Examples