In web page automation, the most common task is to repeatedly GET or POST data to a URL, read the response, and decide the next step by analyzing it. Web page automation covers a lot of work: crawling the data we want from a website, batch registration, simulating manual page operations, batch submission. The ticket-grabbing tools that sprang up one after another last year for the 12306 online ticket booking system are the best illustration of this kind of application. Fortunately, .NET and C# provide a powerful class library that makes this job easy, and the AliasNet library I am publishing here makes it easier still. How easy? One line of code is enough.
In short: AliasNet provides a simple, convenient, encapsulated, and unified GET/POST interface.
1. Basic use of AliasNet
string response = new GetAndPostHelper().GetString("http://www.baidu.com");
That's it. The HTML content of the Baidu homepage is now stored in response.
2. Advanced use of AliasNet
GetAndPostHelper helper = new GetAndPostHelper(); // Initialize an HTTP GetAndPostHelper
GetAndPostHelper httpsHelper = new GetAndPostHelper(true); // Or initialize an HTTPS GetAndPostHelper
helper.SetCookies(cookies); // Set the cookies for this connection (cookies is a CookieContainer)
helper.SetCredential(username, password); // Set the authentication information for this connection (useful on intranet sites)
helper.SetProxy(proxyIp, proxyPort); // Set the proxy for this connection
helper.GetString(url); // GET data, returns string
helper.GetBytes(url); // GET data, returns byte[]
helper.PostString(url, parameters); // POST data, returns string
helper.PostBytes(url, parameters); // POST data, returns byte[]
helper.DownloadFile(url, fileName); // Save the URL content as a local file, useful for downloading images and web pages
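The article does not show what GetAndPostHelper does internally. As a rough illustration only, here is a minimal sketch of how cookies, credentials, and a proxy can be attached to a connection with the standard HttpClient classes. HttpSketch, CreateHandler, and PostFormAsync are hypothetical names invented for this sketch, not part of AliasNet.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical sketch; not AliasNet's actual implementation.
public static class HttpSketch
{
    // Builds a handler carrying cookies, credentials, and an optional proxy.
    public static HttpClientHandler CreateHandler(
        CookieContainer cookies, ICredentials credentials, string proxyHost, int proxyPort)
    {
        var handler = new HttpClientHandler
        {
            CookieContainer = cookies ?? new CookieContainer(),
            UseCookies = true,
            Credentials = credentials
        };
        if (!string.IsNullOrEmpty(proxyHost))
        {
            handler.Proxy = new WebProxy(proxyHost, proxyPort);
            handler.UseProxy = true;
        }
        return handler;
    }

    // Sends the key-value pairs as a form-encoded POST body and returns the response text.
    public static async Task<string> PostFormAsync(
        HttpClient client, string url, Dictionary<string, string> form)
    {
        using (var content = new FormUrlEncodedContent(form))
        using (HttpResponseMessage response = await client.PostAsync(url, content))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A client built with new HttpClient(HttpSketch.CreateHandler(...)) then reuses the same cookies and proxy for every request, which is roughly the convenience the helper methods above offer in one object.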
With GET, the parameters are sent to the server in the URL as a query string. With POST, however, we need to pass the parameters explicitly as an instance of the Parameters class, which is defined as follows:
/// <summary>
/// Parameters class, inherited from Dictionary<string, string>.
/// Stores GET/POST parameters as key-value pairs.
/// </summary>
public class Parameters : Dictionary<string, string>
{
    /// <summary>
    /// Overrides ToString to join the entries in query-string form (key=value&key=value).
    /// </summary>
    /// <returns>The query string</returns>
    public override string ToString()
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> pair in this)
        {
            sb.Append(string.Format("{0}={1}&", pair.Key, pair.Value));
        }
        // Remove the trailing '&'
        if (sb.Length > 0)
            sb.Remove(sb.Length - 1, 1);
        return sb.ToString();
    }
}
As we can see, Parameters is a string-to-string dictionary. Put the data to be posted into it and that is all.
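Note that the ToString shown above joins keys and values as they are, so a value containing '&', '=', or a space would corrupt the query string. The following is a minimal sketch of a percent-encoding variant using the standard Uri.EscapeDataString. QueryStringSketch and ToQueryString are hypothetical names for illustration and are not part of AliasNet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not part of AliasNet.
public static class QueryStringSketch
{
    // Joins key-value pairs as key=value&key=value, percent-encoding both sides.
    public static string ToQueryString(IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return string.Join("&", pairs.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value ?? "")));
    }
}
```

For example, the pairs ("q", "a b&c") and ("page", "1") produce q=a%20b%26c&page=1, so the server sees a single q parameter instead of a broken one.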
3. How to deal with GET/POST response data
In general, we GET/POST data for one of two purposes. The first is to download a file (use helper.DownloadFile). The second is to obtain the returned result and extract data from it (use helper.GetString or helper.PostString). This article focuses on the latter.
Based on what the server returns, responses fall into the following three types:
(1) HTML
For processing HTML strings, the first choice is of course HtmlAgilityPack.dll. You can use HtmlAgilityPack to load an HTML string, use XPath to find the nodes you want, and finally use HtmlNode's InnerText and InnerHtml properties and its GetAttributeValue method to get the data conveniently.
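The load, query, and read steps described above can be sketched as follows. This is an illustration assuming the HtmlAgilityPack NuGet package is referenced. HtmlSketch, CellTexts, and FirstLinkHref are hypothetical names, and the sample markup is made up.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical sketch of typical HtmlAgilityPack usage.
public static class HtmlSketch
{
    // Returns the trimmed text of every <td> inside the element with the given id.
    public static List<string> CellTexts(string html, string containerId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string>();
        HtmlNode container = doc.GetElementbyId(containerId);
        if (container == null) return result;
        // ".//td" is relative to the container; SelectNodes returns null when nothing matches.
        HtmlNodeCollection cells = container.SelectNodes(".//td");
        if (cells == null) return result;
        foreach (HtmlNode cell in cells)
            result.Add(cell.InnerText.Trim());
        return result;
    }

    // Returns the href of the first link in the document, or null if there is none.
    public static string FirstLinkHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//a[@href]");
        return link == null ? null : link.GetAttributeValue("href", "");
    }
}
```

The null checks matter in practice: a page layout change makes GetElementbyId or SelectNodes return null rather than an empty result.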
(2) JSON
For JSON data my favorite tool is Newtonsoft.Json: deserialize the JSON into an instance of an entity class, then work with that object. AliasNet provides the JsonClassBase base class for handling JSON data. Typically you analyze the structure of the JSON data, write a corresponding .NET entity class that inherits from JsonClassBase, and then call JsonClassBase.FromJson<T>(string json) to convert the JSON data into an instance of that class.
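The source of JsonClassBase is not shown in this article. As an illustration of the pattern only, here is a minimal sketch of a base class with a generic FromJson method built on Newtonsoft.Json's JsonConvert. JsonEntityBase and ProxyInfo are hypothetical stand-ins, and the JSON field names are invented for the example.

```csharp
using Newtonsoft.Json;

// Hypothetical stand-in for a JSON base class; not AliasNet's JsonClassBase.
public abstract class JsonEntityBase
{
    // Deserializes a JSON string into an instance of the given entity class.
    public static T FromJson<T>(string json) where T : JsonEntityBase
    {
        return JsonConvert.DeserializeObject<T>(json);
    }

    // Serializes this object back to a JSON string.
    public string ToJson()
    {
        return JsonConvert.SerializeObject(this);
    }
}

// Example entity class mirroring an assumed JSON structure.
public class ProxyInfo : JsonEntityBase
{
    [JsonProperty("ip")]
    public string IP { get; set; }

    [JsonProperty("port")]
    public int Port { get; set; }

    [JsonProperty("region")]
    public string Region { get; set; }
}
```

With these definitions, JsonEntityBase.FromJson<ProxyInfo>(json) turns a response such as {"ip":"1.2.3.4","port":8080,"region":"Beijing"} into a typed object whose properties can be read directly.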
(3) Other formats, including XML
4. Example: using AliasNet to get the latest proxies from Worry-Free Proxy
Worry-Free Proxy is my favorite proxy site. Every few hours it publishes the latest and most usable proxy IP addresses and ports, so we can use AliasNet to capture the proxy information automatically.
The example below fetches anonymous HTTP proxies from http://www.51proxied.com/http_anonymous.html, the anonymous-proxy page of Worry-Free Proxy.
/// <summary>
/// Get the latest proxies from the Worry-Free Proxy website
/// </summary>
/// <param name="proxyUrl">URL of the proxy page</param>
/// <param name="regionFilter">Only return proxies from the specified region; "" or null means no filtering</param>
/// <returns>List of proxies</returns>
public static List<Proxy> _51Proxied_HttpProxy(string proxyUrl, string regionFilter)
{
    string resultString = new GetAndPostHelper().GetString(proxyUrl);
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(resultString);
    HtmlNode divNode = doc.GetElementbyId("tb");
    HtmlNodeCollection tdNodesCollection = divNode.SelectNodes("//div/table//td");
    // Copy the cells into a list so they can be accessed by index
    List<HtmlNode> tdNodes = new List<HtmlNode>();
    foreach (HtmlNode node in tdNodesCollection)
        tdNodes.Add(node);
    List<Proxy> proxies = new List<Proxy>();
    // Each table row has 4 cells: index 1 is the IP, 2 is the port, 3 is the region
    for (int i = 0; i < tdNodes.Count / 4; i++)
    {
        string region = tdNodes[i * 4 + 3].InnerText.Trim();
        if (string.IsNullOrEmpty(regionFilter) || region == regionFilter)
        {
            Proxy p = new Proxy();
            p.IP = tdNodes[i * 4 + 1].InnerText.Trim();
            p.Port = int.Parse(tdNodes[i * 4 + 2].InnerText.Trim());
            p.Region = region;
            proxies.Add(p);
        }
    }
    return proxies;
}
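The index arithmetic in the method above (row i occupies cells i*4 to i*4+3) can be isolated from the HTML handling. Below is a minimal sketch that works on plain strings, assuming four cells per row in the order index, IP, port, region. ProxyRow and RowGrouping are hypothetical names. Unlike the method above, it uses int.TryParse so that a non-numeric port cell, such as a header row, is skipped instead of throwing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the column layout is an assumption taken from the example above.
public class ProxyRow
{
    public string IP;
    public int Port;
    public string Region;
}

public static class RowGrouping
{
    // Groups a flat list of cell texts into rows of 4 and applies the optional region filter.
    public static List<ProxyRow> FromCells(IList<string> cells, string regionFilter)
    {
        const int columns = 4;
        var rows = new List<ProxyRow>();
        // Stop before any incomplete trailing row.
        for (int i = 0; i + columns <= cells.Count; i += columns)
        {
            string region = cells[i + 3].Trim();
            if (!string.IsNullOrEmpty(regionFilter) && region != regionFilter)
                continue;
            int port;
            if (!int.TryParse(cells[i + 2].Trim(), out port))
                continue; // not a data row
            rows.Add(new ProxyRow { IP = cells[i + 1].Trim(), Port = port, Region = region });
        }
        return rows;
    }
}
```

Keeping the grouping separate from the page download also makes it easy to check against a hand-written list of cells without touching the network.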
This method has also been packaged into ProxyFinder and included in AliasNet. On the one hand it serves as an example of using AliasNet; on the other it can be considered an extra benefit. After all, I use a lot of proxies myself.
Summary:
AliasNet is now hosted on GitHub and fully open source. If you are interested, visit https://github.com/superliujian/AliasNet or contact me at liu_jian_china@qq.com
In addition, AliasDB, a simple, adaptive, lightweight database access library, will also be open-sourced and released soon. With AliasDB you supply only a connection string to access MySQL, SQL Server, Oracle, SQLite, and any database that supports ODBC/OLE DB (such as Access). When the structure of your database changes, for example when fields are added or removed, access keeps working, and you do not need to use an ORM, write SQL statements, or write database entity classes yourself. This is useful in small and medium-sized systems.