From: http://www.360doc.com/content/10/1008/10/3722251_59268989.shtml
The core of automatic web page login and POST submission is analyzing the page's source code (HTML). In C#, several components can be used to fetch and work with web pages. The ones generally used are WebBrowser, WebClient, and HttpWebRequest.
These three methods are used as follows:
1. WebBrowser is a "mini" browser. Its advantage is that you do not have to deal with cookies or the page's built-in JavaScript when posting.
WebBrowser is a component introduced with VS2005 (in fact, it wraps the IE interface). The POST is generally implemented by analyzing the HtmlDocument in the WebBrowser's DocumentCompleted event. The code is as follows:
HtmlElement clickBtn = null;
if (e.Url.ToString().ToLower().IndexOf("http://sandou.cnblogs.com/") >= 0) // login page
{
    HtmlDocument doc = webBrowser1.Document;
    for (int i = 0; i < doc.All.Count; i++)
    {
        if (doc.All[i].TagName.ToUpper().Equals("INPUT"))
        {
            switch (doc.All[i].Name)
            {
                case "userctl":
                    doc.All[i].SetAttribute("value", "user01"); // fill in the user name
                    break;
                case "passct1":
                    doc.All[i].SetAttribute("value", "mypass"); // fill in the password
                    break;
                case "B1":
                    clickBtn = doc.All[i]; // submit button
                    break;
            }
        }
    }
    if (clickBtn != null)
    {
        clickBtn.InvokeMember("click"); // click the button
    }
}
2. WebClient wraps several HTTP classes and is simple to use. Compared with WebBrowser, its advantage is that you can configure the proxy yourself. Its disadvantage is cookie control.
WebClient runs in the background and supports asynchronous operations. This makes it easy to start several tasks concurrently, wait for the results to come back, and process them one by one. The code for asynchronous multi-task invocation is as follows:
private void StartLoop(int proxyNum)
{
    WebClient[] wcArray = new WebClient[proxyNum]; // initialization
    for (int idArray = 0; idArray < proxyNum; idArray++)
    {
        wcArray[idArray] = new WebClient();
        wcArray[idArray].OpenReadCompleted += new OpenReadCompletedEventHandler(Pic_OpenReadCompleted2);
        wcArray[idArray].UploadDataCompleted += new UploadDataCompletedEventHandler(Pic_UploadDataCompleted2);
        try
        {
            // proxy and port are defined elsewhere in the class
            wcArray[idArray].Proxy = new WebProxy(proxy[1], port);
            wcArray[idArray].OpenReadAsync(new Uri("http://sandou.cnblogs.com/")); // open the web page
            proxy = null;
        }
        catch
        {
            // ignore a client that fails to start
        }
    }
}

private void Pic_OpenReadCompleted2(object sender, OpenReadCompletedEventArgs e)
{
    if (e.Error == null)
    {
        WebClient wc = (WebClient)sender;
        string textData = new StreamReader(e.Result, Encoding.Default).ReadToEnd(); // obtain the returned information
        // ...
        string cookie = wc.ResponseHeaders["Set-Cookie"];
        wc.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
        wc.Headers.Add("Accept-Language", "zh-cn");
        wc.Headers.Add("Cookie", cookie);
        string postData = "";
        byte[] byteArray = Encoding.UTF8.GetBytes(postData); // convert to a byte array
        wc.UploadDataAsync(new Uri("http://sandou.cnblogs.com/"), "POST", byteArray);
    }
}

private void Pic_UploadDataCompleted2(object sender, UploadDataCompletedEventArgs e)
{
    if (e.Error == null)
    {
        string returnMessage = Encoding.Default.GetString(e.Result);
    }
}
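One common way to ease the cookie-control drawback of WebClient is to subclass it and attach a shared CookieContainer to every request it creates. The following is only a minimal sketch of that idea. The class name CookieAwareWebClient and its Cookies property are hypothetical and do not come from the original code.

```csharp
using System;
using System.Net;

// Hypothetical helper (not part of the original article):
// a WebClient that reuses one CookieContainer for all of its requests.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer cookies = new CookieContainer();

    // Cookies received from the server are stored here and sent back automatically.
    public CookieContainer Cookies
    {
        get { return cookies; }
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.CookieContainer = cookies; // only HTTP(S) requests carry cookies
        }
        return request;
    }
}
```

With a client like this, the manual copy of the "Set-Cookie" response header into the "Cookie" request header should no longer be needed, because the container handles it for each request made through the same instance.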
3. HttpWebRequest is lower-level and can implement many more functions. Cookie handling is also simple:
private bool PostWebRequest()
{
    CookieContainer cc = new CookieContainer();
    string postData = "user=" + strUser + "&pass=" + strPsd;
    byte[] byteArray = Encoding.UTF8.GetBytes(postData); // convert to bytes
    HttpWebRequest webRequest2 = (HttpWebRequest)WebRequest.Create(new Uri("http://sandou.cnblogs.com/"));
    webRequest2.CookieContainer = cc;
    webRequest2.Method = "POST";
    webRequest2.ContentType = "application/x-www-form-urlencoded";
    webRequest2.ContentLength = byteArray.Length;
    Stream newStream = webRequest2.GetRequestStream();
    // Send the data.
    newStream.Write(byteArray, 0, byteArray.Length); // write the parameters
    newStream.Close();
    HttpWebResponse response2 = (HttpWebResponse)webRequest2.GetResponse();
    StreamReader sr2 = new StreamReader(response2.GetResponseStream(), Encoding.Default);
    string text2 = sr2.ReadToEnd();
    sr2.Close();
    // The original listing returns nothing; treating HTTP 200 as success is an assumption.
    return response2.StatusCode == HttpStatusCode.OK;
}
An HttpWebRequest implementation. This was copied from the internet! I used the relevant code to log in to www.asp.net and post successfully, but I no longer know where I put that code.
Using HttpWebRequest to log in to a website automatically and fetch its content (for websites without verification codes)
You can use Visual Sniffer (search for it on Baidu) to capture the submitted data:
1. Open the page whose form you want to submit from outside the site, for example the CSDN login page http://www.csdn.net/member/UserLogin.aspx
2. Fill in the required information, such as the user name and password.
3. Open Visual Sniffer and click "Start interception".
4. Submit the form on the page you opened.
5. After the submission succeeds, click "Stop interception" in Visual Sniffer.
6. Click the plus sign in the left column of Visual Sniffer. The intercepted content shown on the right looks like this:
POST http://www.csdn.net/member/UserLogin.aspx HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, */*
Referer: http://www.csdn.net/member/UserLogin.aspx
Accept-Language: zh-cn
UA-CPU: x86
Pragma: no-cache
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; InfoPath.1)
Host: www.csdn.net
Proxy-Connection: Keep-Alive
Cookie: aspsessionidaaaatbqc=fmeggckdbkhammcgkpfdmbfg; ASP.NET_SessionId=lusprmnom05lr445tmteaf55; userid=699879
The above is the intercepted content. The parameter section of the submitted data (strArgs in the program) looks like this:
__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=ddwtmtcwmzgxnjq2mjs7bdxdu%%2btu1q2wmrzoajti9l73w1zbleyly%3d&csdnuserlogin%3atb_username=testusername&csdnuserlogin%3atb_password=testpassword&csdnuserlogin%3atb_expwd=9232
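A parameter string like the one above is a list of name=value pairs joined with "&", with reserved characters such as ":" percent-encoded (as "%3a"). As a rough sketch, it could be assembled in code as shown here. The FormEncoder class and its Build method are hypothetical names introduced only for illustration, and the field names are taken from the captured data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of the original article).
public static class FormEncoder
{
    // Joins fields as name=value&name=value, percent-encoding names and values.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (sb.Length > 0)
            {
                sb.Append('&');
            }
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value ?? ""));
        }
        return sb.ToString();
    }
}

public static class FormEncoderDemo
{
    public static string BuildLoginArgs()
    {
        // Field names copied from the intercepted request above; values are the sample ones.
        return FormEncoder.Build(new[]
        {
            new KeyValuePair<string, string>("csdnuserlogin:tb_username", "testusername"),
            new KeyValuePair<string, string>("csdnuserlogin:tb_password", "testpassword"),
        });
    }
}
```

Note that Uri.EscapeDataString writes hex digits in upper case ("%3A") and encodes a space as "%20" rather than "+". Servers normally accept both forms, but that is worth checking against the captured request if a login is rejected.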
protected static string cookieHeader;

private void Page_Load(object sender, System.EventArgs e)
{
    string strReContent = string.Empty;
    // Log in
    strReContent = PostLogin(
        "http://www.mystand.com.cn/login/submit.jsp", // page the data is submitted to
        "userid=hgj0000&password=06045369",           // submitted parameters
        "http://www.mystand.com.cn/");                // referer address
    // Note the parameters passed during an ASP.NET login
    // strReContent = PostLogin("http://www.mystand.com.cn/login.aspx", "__VIEWSTATE=bytes%2bpjs%2bozs%2boz4%bytes%2bkbnpsjd7op%2fl26iw%3d%3d&txtusername=hxf&txtpassword=hxf0000&btnenter=%E7%99%BB%E5%BD%95", "http://www.mystand.com.cn/login.aspx");
    // Obtain the page
    strReContent = GetPage(
        "http://www.mystand.com.cn/company/getdata.jsp?code=", // page to fetch
        "http://www.mystand.com.cn/");                         // referer address
    // strReContent = GetPage("http://www.mystand.com.cn/Modules/index.aspx", "http://www.mystand.com.cn/login.aspx");
    // Process the obtained content: strReContent
}

/// <summary>
/// Function description: simulates the login page, submits the login data, and records the cookie from the header.
/// </summary>
/// <param name="strURL">Address of the page the login data is submitted to</param>
/// <param name="strArgs">User login data</param>
/// <param name="strReferer">Referer address</param>
/// <returns>The page content (returning it is optional)</returns>
public static string PostLogin(string strURL, string strArgs, string strReferer)
{
    string strResult = "";
    HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(strURL);
    myHttpWebRequest.AllowAutoRedirect = true;
    myHttpWebRequest.KeepAlive = true;
    myHttpWebRequest.Accept = "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/x-shockwave-flash, */*";
    // ... (the rest of PostLogin is missing from the source)
}

/// <summary>
/// Function description: after PostLogin has logged in successfully and recorded the cookie from the headers, fetches the content of other pages on the same website.
/// </summary>
/// <param name="strURL">Address of the page on the website to fetch</param>
/// <param name="strReferer">Referer address</param>
/// <returns>The page content</returns>
public static string GetPage(string strURL, string strReferer)
{
    string strResult = "";
    HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(strURL);
    myHttpWebRequest.ContentType = "text/html";
    myHttpWebRequest.Method = "GET";
    myHttpWebRequest.Referer = strReferer;
    myHttpWebRequest.Headers.Add("Cookie:" + cookieHeader); // send the cookie recorded at login
    HttpWebResponse response = null;
    System.IO.StreamReader sr = null;
    response = (HttpWebResponse)myHttpWebRequest.GetResponse();
    sr = new System.IO.StreamReader(response.GetResponseStream(), Encoding.GetEncoding("gb2312")); // or UTF-8
    strResult = sr.ReadToEnd();
    sr.Close();
    response.Close();
    return strResult;
}
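The body of PostLogin is cut off in the listing, so its actual implementation is unknown. Based only on its doc comment (submit the login data, record the cookie) and on how GetPage uses cookieHeader, one plausible shape is sketched below. This is an assumption, not the original code. The LoginSketch class name, the use of a CookieContainer, and the UTF-8 encoding are all choices made for this illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical reconstruction (the original PostLogin body is not available).
public static class LoginSketch
{
    // Plays the role of the cookieHeader field that GetPage reads.
    public static string cookieHeader;

    public static string PostLogin(string strURL, string strArgs, string strReferer)
    {
        byte[] body = Encoding.UTF8.GetBytes(strArgs); // assumption: UTF-8 form data

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(strURL);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.Referer = strReferer;
        request.AllowAutoRedirect = true;
        request.KeepAlive = true;
        request.CookieContainer = new CookieContainer(); // collects cookies, including those set during redirects
        request.ContentLength = body.Length;

        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            // Store the cookies in "name=value; name=value" form for later requests.
            cookieHeader = request.CookieContainer.GetCookieHeader(new Uri(strURL));
            return reader.ReadToEnd();
        }
    }
}
```

GetCookieHeader is used here instead of reading the raw "Set-Cookie" response header because its output is already in the form a "Cookie" request header expects. If the site serves gb2312 pages, as GetPage assumes, the reader's encoding would need to match.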
Technical application: automatic web page login (submitting POST content) is used for many purposes, such as identity verification, program upgrades, and online voting. The methods in this article are implemented in C#.
Unsolved problem: the biggest problem at present is that verification codes cannot be bypassed. I have discussed recognition algorithms with colleagues, and recognition is basically difficult. There are many examples on the internet for recognizing verification codes. They can work on images with simple noise, but they are useless on complicated ones! So far, none of my attempts has passed. If yours has, please post the code so we can study it.