What is HttpHelper?
HttpHelper is a utility class that wraps access to resources on the network. It is named HttpHelper because it works over the HTTP protocol.
Background: why HttpHelper exists
WebClient makes it easy to fetch resources from the network, for example:
WebClient client = new WebClient(); string html = client.DownloadString("https://www.baidu.com/");
This fetches the source of the Baidu home page. However, WebClient is so heavily encapsulated that it is sometimes not flexible enough. When finer control over the lower-level details is needed, it is time to build our own tool for fetching network resources.
HttpHelper: a first version
Now let's build our own download tool. At first it looks like this:
using System.IO;
using System.Net;
using System.Text;

public class HttpHelp
{
    public static string DownloadString(string url)
    {
        string source = string.Empty;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream stream = response.GetResponseStream())
            {
                // Read the whole response body as UTF-8 text
                using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                {
                    source = reader.ReadToEnd();
                }
            }
        }
        return source;
    }
}
Programs always run into exceptions of one kind or another, so the next step is to add a try/catch statement:
public class HttpHelp
{
    public static string DownloadString(string url)
    {
        string source = string.Empty;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (Stream stream = response.GetResponseStream())
                {
                    using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                    {
                        source = reader.ReadToEnd();
                    }
                }
            }
        }
        catch
        {
            Console.WriteLine("error, the requested URL is {0}", url);
        }
        return source;
    }
}
Requesting resources is I/O-bound and can be especially slow, so the next step is to make the call asynchronous:
// requires: using System.Threading.Tasks;
public static async Task<string> DownloadString(string url)
{
    // Run the blocking download on a thread-pool thread
    return await Task.Run(() =>
    {
        string source = string.Empty;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (Stream stream = response.GetResponseStream())
                {
                    using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                    {
                        source = reader.ReadToEnd();
                    }
                }
            }
        }
        catch
        {
            Console.WriteLine("error, the requested URL is {0}", url);
        }
        return source;
    });
}
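Wrapping the synchronous call in Task.Run still keeps a thread-pool thread blocked while it waits on the network. A minimal sketch of a non-blocking alternative follows, assuming .NET Framework 4.5 or later, where WebRequest.GetResponseAsync and StreamReader.ReadToEndAsync are available. The names HttpHelpAsync, ReadAllTextAsync, and DownloadStringAsync are hypothetical and not part of the original helper.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;

// Hypothetical variant of the helper; not the original author's class.
public static class HttpHelpAsync
{
    // Reads an entire stream as UTF-8 text without blocking the calling thread.
    public static async Task<string> ReadAllTextAsync(Stream stream)
    {
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            return await reader.ReadToEndAsync();
        }
    }

    // Downloads the body of the given URL; returns an empty string on network errors.
    public static async Task<string> DownloadStringAsync(string url)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (WebResponse response = await request.GetResponseAsync())
            using (Stream stream = response.GetResponseStream())
            {
                return await ReadAllTextAsync(stream);
            }
        }
        catch (WebException)
        {
            // Only network-level failures are handled here (an assumption of this sketch).
            Console.WriteLine("error, the requested URL is {0}", url);
            return string.Empty;
        }
    }
}
```

A caller would write string html = await HttpHelpAsync.DownloadStringAsync(url); from another async method.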
HttpHelper: refinements
To make the server believe the request comes from a browser, set the User-Agent:
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
Some resources require authorization, so the request has to pose as a logged-in user. HTTP is stateless, and the information that identifies the user is kept in a cookie, so the request must carry that cookie:
request.Headers.Add("Cookie", "put the cookie here; copy it from the browser");
To refine it further, set a timeout:
request.Timeout = 5000; // Timeout is in milliseconds, so this is 5 seconds
Some websites serve gzip-compressed resources to save bandwidth. To take advantage of this, add request.Headers.Add("Accept-Encoding", "gzip, deflate") to the request, and decompress the response stream according to the encoding the server reports. List only encodings the code can actually decode; advertising br as well would require Brotli handling. HttpHelper now looks like this:
// requires: using System.IO.Compression;
public static string DownloadString(string url)
{
    string source = string.Empty;
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
        request.Headers.Add("Cookie", "here is the cookie");
        request.Headers.Add("Accept-Encoding", "gzip, deflate");
        request.KeepAlive = true; // enable persistent connections
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream dataStream = response.GetResponseStream())
            {
                string encoding = (response.ContentEncoding ?? string.Empty).ToLower();
                if (encoding.Contains("gzip")) // decompress gzip
                {
                    using (GZipStream stream = new GZipStream(dataStream, CompressionMode.Decompress))
                    {
                        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                        {
                            source = reader.ReadToEnd();
                        }
                    }
                }
                else if (encoding.Contains("deflate")) // decompress deflate
                {
                    using (DeflateStream stream = new DeflateStream(dataStream, CompressionMode.Decompress))
                    {
                        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                        {
                            source = reader.ReadToEnd();
                        }
                    }
                }
                else // uncompressed body
                {
                    using (StreamReader reader = new StreamReader(dataStream, Encoding.UTF8))
                    {
                        source = reader.ReadToEnd();
                    }
                }
            }
        }
        request.Abort();
    }
    catch
    {
        Console.WriteLine("error, the requested URL is {0}", url);
    }
    return source;
}
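The manual gzip and deflate branches can also be delegated to the framework. The sketch below uses the HttpWebRequest.AutomaticDecompression property, which asks the request to decode gzip and deflate bodies transparently; as far as I know it also sends the matching Accept-Encoding header, so that header should not need to be added by hand in this variant. The class name HttpHelpAuto and the method CreateRequest are hypothetical, and the User-Agent and cookie settings are left out for brevity.

```csharp
using System.IO;
using System.Net;
using System.Text;

// Hypothetical variant of the helper that relies on built-in decompression.
public static class HttpHelpAuto
{
    // Builds a request whose gzip/deflate response bodies are decoded by the framework.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Downloads the body as UTF-8 text; the stream is already decompressed when read.
    public static string DownloadString(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The trade-off is less control: the manual version can inspect the Content-Encoding value itself, while this one leaves the decision to the framework.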
Requests that come too frequently will be rejected by the server with status code 429. In that case a proxy is needed: our request is sent to the proxy server, the proxy server forwards it to the target server, and the response is returned to us through the proxy. As long as the proxy is switched regularly, the server is far less likely to reject the program for requesting too often.
WebProxy proxy = new WebProxy("address", 8080); // proxy address, followed by the port number
request.Proxy = proxy; // set the proxy on the HttpWebRequest
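One simple way to keep switching proxies is to cycle through a list in round-robin order. The following is a minimal sketch of that idea, not the approach from the linked post; the class ProxyRotator is hypothetical, and the proxy hosts in the usage note are placeholders rather than real servers.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper that hands out proxies in round-robin order.
// Not thread-safe: guard Next() with a lock if several threads share one instance.
public class ProxyRotator
{
    private readonly List<WebProxy> proxies = new List<WebProxy>();
    private int next;

    public ProxyRotator(IEnumerable<string> hosts, int port)
    {
        foreach (string host in hosts)
        {
            proxies.Add(new WebProxy(host, port));
        }
        if (proxies.Count == 0)
        {
            throw new ArgumentException("at least one proxy host is required", "hosts");
        }
    }

    // Returns the proxies in order, wrapping around to the first after the last.
    public WebProxy Next()
    {
        WebProxy proxy = proxies[next];
        next = (next + 1) % proxies.Count;
        return proxy;
    }
}
```

With a rotator created once, for example new ProxyRotator(new[] { "10.0.0.1", "10.0.0.2" }, 8080), each new request would set request.Proxy = rotator.Next(); before calling GetResponse.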
As for how to obtain proxies, see the following blog post:
Crawling the information of millions of users: the iterative path of HttpHelper