SuperWebClient - a curl-based .NET HTTP/HTTPS simulation component (2)


Today we cover a few simple topics in using the SuperWebClient component:

1: UserAgent
2: Cookies
3: POST login


1: UserAgent
This is the client's identification string, used to tell a web service what kind of client is accessing it. Below we use the packet-capture tool Fiddler to see what this information looks like for a few typical browsers. Fiddler is an HTTP/HTTPS protocol analysis tool written in .NET; it was originally open source, but after being acquired it is no longer open source.
Fiddler can be downloaded from http://www.telerik.com/fiddler; run it and then proceed as follows.

First, open Internet Explorer and go to the cnblogs site. You will see Fiddler capture a lot of packets; select the request for the home page to view the detailed HTTP request information.

From it we can read IE's User-Agent:
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)
(The User-Agent will differ slightly depending on the IE version and the software installed on the machine.)
A Chromium-based browser such as Google Chrome looks similar to the following:
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.48 Safari/537.36
Many other browsers (Cheetah, Firefox, Maxthon and so on) add their own extra information to the User-Agent, and that extra information is sometimes what gets used.
In general a web server does not pay much attention to this header, but some do check it and return different output for different clients, and client-side JS scripts often use it for compatibility checks and adjustments.
Crawlers such as Baidu's and Google's also have their own distinctive User-Agent strings. SuperWebClient simulates an IE browser by default, but you can set it to simulate anything you like, as in the SuperWebClient example code below.
Note the proxy settings in the code: proxies are really a later topic, but they are shown here in advance so that Fiddler can capture SuperWebClient's request packets. The request is simulated as Internet Explorer by default; below I change the User-Agent to "I-love-you".

// First, build the input object (hi) that describes the simulated request
HttpInput hi = new HttpInput();
// Initialize it: enable HTTP 1.1, then set the connection and transfer timeouts (in seconds)
HttpManager.Instance.InitWebClient(hi, true, 60, 60);

// Proxy settings - without these, Fiddler cannot capture the request
hi.EnableProxy = true;
hi.ProxyIP = "127.0.0.1";
hi.ProxyPort = 8888;

// Replace the default (IE) User-Agent
hi.UserAgent = "I-love-you";
The next step is to set the URL to access, along with any other settings you need such as UserAgent, Cookies, Proxy and so on for the various features.
In general we only set the URL; the rest are set when needed.

// Set the URL to request
hi.URL = "http://www.cnblogs.com/";
// Every input object has a corresponding output object. Note that this call blocks
// until the result comes back, so you can run it on a separate thread or in a
// thread pool when collecting data (see the sketch after this block).
HttpOutput ho = HttpManager.Instance.ProcessRequest(hi);
if (ho.IsOK)
{
    // If ho.IsOK is true the page was fetched successfully, otherwise the request failed.
    // On failure you can handle it yourself, e.g. retry the request or write a log.
    richTextBox1.Text = ho.TxtData;
}
else
{
    richTextBox1.Text = "Page access error";
}
// Finally, dispose of the input object
hi.Dispose();
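Since ProcessRequest blocks until the response comes back, a common pattern is to run it off the UI thread, as the comments above suggest. Here is a minimal sketch of that idea; the HttpInput/HttpManager calls and property names are taken from the snippets in this article (not verified against the library), while Task.Run and Control.Invoke are standard .NET.

// Requires: using System; using System.Threading.Tasks;
Task.Run(() =>
{
    HttpInput hi = new HttpInput();
    HttpManager.Instance.InitWebClient(hi, true, 60, 60);
    hi.URL = "http://www.cnblogs.com/";

    HttpOutput ho = HttpManager.Instance.ProcessRequest(hi);   // blocking call, now off the UI thread
    string result = ho.IsOK ? ho.TxtData : "Page access error";
    hi.Dispose();

    // Marshal back to the UI thread before touching the RichTextBox
    richTextBox1.Invoke((Action)(() => richTextBox1.Text = result));
});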

You can impersonate whatever client you need, for example the distinctive User-Agent of a specific app such as Momo.
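For instance, to pretend to be Google's crawler you only change the User-Agent value. This is just the earlier snippet again with Googlebot's published User-Agent string; the property names are once more taken from this article's own code rather than from the library's documentation.

HttpInput hi = new HttpInput();
HttpManager.Instance.InitWebClient(hi, true, 60, 60);
// Pretend to be Google's crawler
hi.UserAgent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
hi.URL = "http://www.cnblogs.com/";
HttpOutput ho = HttpManager.Instance.ProcessRequest(hi);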
2: Cookies
SuperWebClient handles cookies sent by the web server by default, without any effort on your part. So how do you read the cookie information the server sent back?

HttpOutput ho = HttpManager.Instance.ProcessRequest(hi);
if (ho.IsOK)
{
    // The cookies the server sent for this request
    richTextBox2.Text = ho.Cookies;
    // If ho.IsOK is true the page was fetched successfully, otherwise the request failed.
    // On failure you can handle it yourself, e.g. retry the request or write a log.
    richTextBox1.Text = ho.TxtData;
}

They are in the Cookies property of the output object ho.

For example, visit the C # ant nest http://www.csharpworker.com/

The cookies received are
Htq9_2132_pc_size_c=0;htq9_2132_onlineusernum=2;htq9_2132_lastact=1488261049%09forum.php%09;htq9_2132_sid= MROJS6;HTQ9_2132_LASTVISIT=1488257449;HTQ9_2132_SALTKEY=P7EGLLZG;
The web server may send different cookies on different requests. The HttpOutput object only contains the cookies the server sent for that particular request; it does not accumulate them. SuperWebClient does accumulate all cookies received from the server internally and sends them back as-is when submitting to the server, but there is currently no way to read that accumulated set. You can collect the cookie information from each request's HttpOutput yourself. In general I do not care about this, unless you need to save the cookies for some other purpose.
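If you do need a complete picture on your side, here is a minimal sketch of merging the Cookies string of every HttpOutput into your own dictionary. It reuses the property names from the snippets above and parses only name=value pairs, ignoring attributes such as path or expiry, so treat it as an illustration rather than the library's own mechanism.

// Requires: using System; using System.Collections.Generic;
Dictionary<string, string> cookieJar = new Dictionary<string, string>();

void CollectCookies(HttpOutput ho)
{
    if (!ho.IsOK || string.IsNullOrEmpty(ho.Cookies))
        return;

    // ho.Cookies is a "name=value;name=value;" style string (see the example above)
    foreach (string pair in ho.Cookies.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
    {
        int eq = pair.IndexOf('=');
        if (eq > 0)
            cookieJar[pair.Substring(0, eq).Trim()] = pair.Substring(eq + 1).Trim();
    }
}

Call CollectCookies(ho) after each ProcessRequest and cookieJar will hold the latest value of every cookie seen so far.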


3: POST login
In general, most web service operation requests are sent via POST; even AJAX rarely uses GET, because the query string has a limited length and a large number of parameters have to be sent via POST. Next we will see how SuperWebClient can log in to a website with a POST request. Before simulating it, we need to analyze the site's request protocol, and then use SuperWebClient to reproduce it.
We will walk through the cnblogs login process. First, use Fiddler to capture the cnblogs login protocol. If you cannot capture it, see the two most common Fiddler problems below. Because the cnblogs login uses HTTPS, you need to turn on the corresponding HTTPS decoding option in Fiddler:
Click Tools -> Telerik Fiddler Options in the Fiddler menu.

If you can capture the traffic but find that it cannot be decoded, it is a certificate problem; refer to http://blog.csdn.net/chjqxxxx/article/details/53666175, rebuild the certificate, and the problem will be resolved.

Here is the captured cnblogs login protocol:

POST https://passport.cnblogs.com/user/signin HTTP/1.1
X-Requested-With: XMLHttpRequest
Accept-Language: zh-CN
Referer: https://passport.cnblogs.com/user/signin?ReturnUrl=http%3a%2f%2fwww.cnblogs.com%2f
Accept: application/json, text/javascript, */*; q=0.01
VerificationToken: _q9b84azuznutiffmygxm8poneanxanp7gw7sqep3b3lqk-opvj5wpgean6sm6ln1qfi6y8-36gayjmdyybuf_cy6ie1:fmsuvfj_iigbnmv7n3tydhuuc8orliknw7escsgqwv2bx7lm64bnomv0x3s-t2-rldcbpfuynk3_in_fmntfaphe3cw1
Content-Type: application/json; charset=utf-8
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)
Host: passport.cnblogs.com
Content-Length: 386
Connection: keep-alive
Cache-Control: no-cache
Cookie: Hm_lvt_cc17b07fc9529e3d80b4482c9ce09086=1482645471; _ga=GA1.2.806287383.1482645352; __gads=ID=CDE93307386572B7:T=1488262791:S=ALNI_MBB1OIDAPHUYT-REWWV3UYFQG8-AA; AspxAutoDetectCookieSupport=1; ServerID=227b087667da6f8e99a1165002db93f6|1488285061|1488285061

{"input1":"a9q+6m4ewc0hlhafyl8gyvd5zcfp+tumv+f2p+af0fdzrty9nqpdkjcdctpqly37ei/vzplc9z+sakxawpwa0ydum3usn+b7w+m6t7p8qrs+nuhu1g6wj34iyuaruq5ge3+r30pky65q9dd0f5ruosziziygcvm2/wrqfqsdyiq=","input2":"fgoieueg5hp6tt3c/q9z0dulyc9qeq0gijqytgnhtl/scriromr2g8gtn+aurdfzh/ddzmoyrbfej97n454gg8ppd5cre3qanbtpebcjrik3chv/mupe+lifveu6mwogpqfbrm6gqcszbhavfrphbkyrqkeawrz6itsjkcbkpqc=","remember":false}

The account and password are encrypted before being sent. To understand the encryption you would need to carefully analyze the cnblogs JS code; here we only want to show how to use SuperWebClient for a POST form login, so the analysis of the account and password encryption is skipped. If I find spare time later, I will add an article analyzing cnblogs' account and password encryption, mainly about debugging and analyzing the relevant JS code.

Then we use SuperWebClient to replay the cnblogs login. It is very simple:

HttpInput hi = new HttpInput();
HttpManager.Instance.InitWebClient(hi, true, 60, 60);
// Set the cnblogs login URL
hi.URL = "https://passport.cnblogs.com/user/signin";
// Forge the Referer so the server sees where the request supposedly came from
hi.Refer = "https://passport.cnblogs.com/user/signin?ReturnUrl=http%3A%2F%2Fwww.cnblogs.com%2F";
// POST data, built according to the captured protocol
hi.PostData = "input1=a9q+6m4ewc0hlhafyl8gyvd5zcfp+tumv+f2p+af0fdzrty9nqpdkjcdctpqly37ei/vzp"
    + "lc9z+sakxawpwa0ydum3usn+b7w+m6t7p8qrs+nuhu1g6wj34iyuaruq5ge3+r30pky65q9dd0f5ruosziziygcvm2/wrqfqsdyiq="
    + "&input2=fgoieueg5hp6tt3c/q9z0dulyc9qeq0gijqytgnhtl/scriromr2g8gtn+aurdfzh/ddzmoyrbfej97n454gg8ppd5cre3qa"
    + "nbtpebcjrik3chv/mupe+lifveu6mwogpqfbrm6gqcszbhavfrphbkyrqkeawrz6itsjkcbkpqc="
    + "&remember=false";
HttpOutput ho = HttpManager.Instance.ProcessRequest(hi);
if (ho.IsOK)
{
    richTextBox1.Text = ho.TxtData;
}
else
{
    richTextBox1.Text = "Page access error";
}
// Finally, dispose of the input object
hi.Dispose();

hi.PostData sets the POST form data. Its format is key1=value1&key2=value2... for as many pairs as needed.
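If you prefer to assemble that string from named fields instead of concatenating it by hand, here is a small helper sketch. BuildPostData is my own hypothetical helper, and the Uri.EscapeDataString encoding is an assumption on my part; the cnblogs values above are already encrypted, so check how your target site expects its values encoded before reusing this.

// Requires: using System; using System.Collections.Generic; using System.Linq;
static string BuildPostData(Dictionary<string, string> fields)
{
    // URL-encode each key and value, then join the pairs with '&'
    return string.Join("&",
        fields.Select(kv => Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value)));
}

// Usage (the variables holding the encrypted values are hypothetical):
// hi.PostData = BuildPostData(new Dictionary<string, string>
// {
//     { "input1", encryptedUser },
//     { "input2", encryptedPassword },
//     { "remember", "false" }
// });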

That wraps up today's topics. With what we covered today you can already handle most simulated requests; most of SuperWebClient's other features revolve around the GET and POST usage shown here.

