Acquiring data from websites
In this example, the HTML source code is parsed with HtmlAgilityPack to obtain the required data.
using HtmlAgilityPack;
1. Obtain the webpage source code through the WebRequest, WebResponse, and StreamReader classes in C#.
WebRequest request = WebRequest.Create(url);
using (WebResponse response = request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream(), encoding))
    result = reader.ReadToEnd();
2. Load the HTML source into the HtmlDocument class of HtmlAgilityPack and obtain the root HtmlNode.
HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(htmlSource);
HtmlNode rootNode = document.DocumentNode;
return rootNode;
3. Use the SelectSingleNode method of HtmlNode to obtain the content you need. Note that the path in the following code is an XPath expression that follows the HTML tag hierarchy, for example:
path = "//div[@class='article_title']/h1/span/a"; // article title path
This corresponds to:
<div class='article_title'>
  <h1>
    <span>
      <a> get the content here
      </a>
    </span>
  </h1>
</div>
The source code is as follows:
HtmlNode temp = srcNode.SelectSingleNode(path);
if (temp == null)
    return null;
return temp.InnerText;
Return value: get the content here
temp.InnerHtml returns the markup inside the selected node, while temp.OuterHtml returns the element together with its own tags, for example: <a> get the content here </a>
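The select-then-null-check pattern above can be tried without installing anything. The sketch below is a stand-in, not HtmlAgilityPack itself: it uses XmlDocument from System.Xml, whose SelectSingleNode and InnerText behave similarly for this case, and it only works because the sample markup is well-formed XML. The class name XPathSketch and the method GetText are hypothetical names chosen for illustration.

```csharp
using System;
using System.Xml;

public static class XPathSketch
{
    // Returns the trimmed text of the first node matching the XPath
    // expression, or null when nothing matches.
    // Assumption: the input is well-formed XML. Real-world HTML usually
    // is not, which is the reason to use HtmlAgilityPack in practice.
    public static string GetText(string xml, string path)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        XmlNode node = document.SelectSingleNode(path);
        if (node == null)
            return null;

        return node.InnerText.Trim();
    }
}
```

Calling XPathSketch.GetText with the div snippet from step 3 and the article title path returns the link text, and a path that matches nothing returns null instead of throwing.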
With the steps above you can retrieve the content you need from a site. The reference article, including source code, is at http://blog.csdn.net/gdjlc/article/details/11620915
How to obtain data from other webpages
Use the WebRequest method to obtain website data:
private string GetStringByUrl(string strUrl)
{
    WebRequest wrt = WebRequest.Create(strUrl);
    // Dispose the response, stream, and reader once the page has been read.
    using (WebResponse wrse = wrt.GetResponse())
    using (Stream strM = wrse.GetResponseStream())
    using (StreamReader sr = new StreamReader(strM, Encoding.GetEncoding("gb2312")))
    {
        return sr.ReadToEnd();
    }
}
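The encoding passed to StreamReader decides how the response bytes become text, so it has to match the page. The sketch below shows this without a network call by reading from a MemoryStream in place of the response stream. StreamDecodeSketch and ReadAll are hypothetical names, and UTF-8 is used here only so that the example runs everywhere; on .NET Core and later, Encoding.GetEncoding("gb2312") needs the code pages encoding provider to be registered first.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamDecodeSketch
{
    // Reads the whole stream as text using the given encoding.
    // The stream stands in for WebResponse.GetResponseStream().
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Decoding the same bytes with the wrong encoding does not fail; it silently produces garbled text, which is the usual symptom when the encoding argument does not match the page.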
Then write a method to process the data to obtain the value you want.
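As one possible shape for such a processing method, the sketch below pulls the text of every li element out of a page string with a regular expression. It is an illustration under assumptions, not the original author's method: PageTextSketch and ExtractListItems are hypothetical names, and the pattern assumes list items that are not nested.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PageTextSketch
{
    // Returns the trimmed inner text of each <li>...</li> in the input.
    // The \b after "li" keeps the pattern from matching <link ...> tags.
    public static List<string> ExtractListItems(string html)
    {
        var items = new List<string>();
        var pattern = new Regex(
            @"<li\b[^>]*>(.*?)<\s*/\s*li\s*>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        foreach (Match match in pattern.Matches(html))
        {
            items.Add(match.Groups[1].Value.Trim());
        }
        return items;
    }
}
```

A regular expression is adequate for a small, predictable fragment like this; for arbitrary pages an HTML parser such as HtmlAgilityPack is the more robust choice.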
How can I obtain data from other websites?
Here is a piece of code for reference.
/// <summary>
/// Get the lottery number
/// </summary>
/// <returns></returns>
public string GetNum(string Issue)
{
    string number = "";
    string url = "Enter the address you want to capture";
    string rl = null;
    try
    {
        System.Net.WebRequest webRequest = System.Net.WebRequest.Create(url);
        System.Net.WebResponse response = webRequest.GetResponse();
        Stream resStream = response.GetResponseStream();
        StreamReader sr = new StreamReader(resStream, Encoding.GetEncoding("GB2312"));
        StringBuilder sb = new StringBuilder();
        // Read the page line by line into the buffer.
        while ((rl = sr.ReadLine()) != null)
        {
            sb.Append(rl);
        }
        sr.Close();
        resStream.Close();
        string str = sb.ToString();
        // Strip <link> tags first so they are not mistaken for <li> tags below.
        Regex rgLink = new Regex(@"<\s*link[^>]*([^<]|<(?!link))*/>", RegexOptions.IgnoreCase);
        MatchCollection mcLink = rgLink.Matches(str);
        foreach (Match matchLink in mcLink)
        {
            str = str.Replace(matchLink.Value, "");
        }
        Regex rgNum = new Regex(@"<\s*li[^>]*>([^<]|<(?!/li))*<\s*/li\s*>", R...
        // ... (the rest of the code is truncated in the source)