WebBrowser control (personal notes; I am also a newbie)
I know two ways to simulate a web browser:
1: Fetch the page through HttpRequest GET/POST requests
2: Use the C# webBrowser control to simulate clicks on the page
Use the second way when the page applies encryption algorithms or random parameters; the first way is recommended for simple web pages!
WebBrowser usage
In the Load event:
private void Form1_Load(object sender, EventArgs e)
{
    webBrowser1.Navigate("page URL");
}
Once webBrowser1 has started navigating, its DocumentCompleted event fires when the page has finished loading.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    webBrowser1.Document.GetElementById("tag Id").InnerText = "text to fill in"; // fill in the login name automatically
    HtmlDocument doca = this.webBrowser1.Document; // the document currently displayed in webBrowser1
    for (int i = 0; i < doca.All.Count; i++) // loop over every element of the document
    {
        if (doca.All[i].TagName == "BUTTON") // only look at BUTTON elements
        {
            HtmlElement myelement = doca.All[i]; // keep a reference to this element
            if (myelement.OuterText == "login" && myelement.Id == "J_SubmitStatic") // text is "login" and the Id matches
            {
                myelement.InvokeMember("click"); // simulate a click on this element
                break;
            }
        }
    }
}
The DocumentCompleted handler above walks through the elements of the page, finds the BUTTON whose Id is "J_SubmitStatic" and whose text is "login", and clicks it.
The page then behaves as if a user had clicked the button and jumps to the next page.
Many people read the page source through webBrowser1.Document at this point, but find that it is always the source of the page before login.
I was stuck here for a long time. The source after the jump has to be read in a separate event: the Navigated event of webBrowser1 fires when the control has navigated to a new document, so read the source there to get the page after the redirect.
private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    string yuanma = webBrowser1.DocumentText; // save the page source to a string (Document itself is an HtmlDocument, not a string)
}
For many people this already covers the basic requirements. After all, most page jumps are just a change of URL.
The problem is that on many well-known sites the links are generated by JavaScript, so you cannot find them by walking through the tags in the source and clicking them on the page.
This is where we need to understand the role of cookies in the browser. Anyone who has built a website knows that after login, access to the other pages is controlled through a cookie. Logging in is like being issued an ID card: it is this proof of identity that lets us browse freely.
Knowing this, does the webBrowser control keep our cookies?
Let's test it:
int count = 0;
private void Form1_Load(object sender, EventArgs e)
{
    if (count == 0)
    {
        webBrowser1.Navigate("Logon page");
    }
    else if (count == 1)
    {
        webBrowser1.Navigate("Any hyperlink that can be viewed after Logon");
        count = 2;
    }
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (webBrowser1.ReadyState == WebBrowserReadyState.Complete) // check whether loading is complete
    {
        webBrowser1.Document.GetElementById("tag Id").InnerText = "text to fill in"; // fill in the login name automatically
        HtmlDocument doca = this.webBrowser1.Document; // the document currently displayed in webBrowser1
        for (int i = 0; i < doca.All.Count; i++) // loop over every element of the document
        {
            if (doca.All[i].TagName == "BUTTON") // only look at BUTTON elements
            {
                HtmlElement myelement = doca.All[i]; // keep a reference to this element
                if (myelement.OuterText == "login" && myelement.Id == "J_SubmitStatic") // text is "login" and the Id matches
                {
                    myelement.InvokeMember("click"); // simulate a click on this element
                    break;
                }
            }
        }
    }
}
private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    string yuanma = webBrowser1.DocumentText; // save the page source to a string
    if (yuanma.Contains("a unique field displayed on the login success page")) // a string that only appears on the login success page confirms the login worked
    {
        count = 1; // after a successful login, set the global variable to 1
        // Use a timer to fire after a delay. If you do not know how to use a timer, search Baidu!
        System.Timers.Timer timer = new System.Timers.Timer(3000); // the interval is in milliseconds: 3000 ms = 3 seconds
        timer.SynchronizingObject = this; // raise Elapsed on the UI thread, because Form1_Load touches webBrowser1
        timer.Elapsed += new System.Timers.ElapsedEventHandler(Form1_Load); // run the Load handler again
        timer.Start();
    }
}
private void timer_Elapsed(object sender, System.Timers.ElapsedEventArgs e)
{
    Application.Restart();
}
Because the global variable count has changed, when the Load handler runs again:
if (count == 0)
{
    webBrowser1.Navigate("Logon page");
}
else if (count == 1)
{
    webBrowser1.Navigate("Any hyperlink that can be viewed after Logon");
    count = 2;
}
The webBrowser control now loads the URL passed to webBrowser1.Navigate("Any hyperlink that can be viewed after Logon"). If the cookie is present, the page opens directly, because we have permission.
The test shows that the webBrowser control does remember cookies, so we can fetch the source of any page we like. (People often ask what the source is for: in my case, I need to store and search web page information in real time.)
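To see which cookies the control is holding, one option is to read webBrowser1.Document.Cookie, which returns the cookie string of the current document in the usual "name=value; name=value" form. The helper below is only a sketch of how that string could be split up; CookieText and Parse are made-up names, not part of the WebBrowser API, and cookies flagged HttpOnly are not exposed to the document, so they will not show up here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of WinForms): splits a document cookie
// string such as "sid=abc123; theme=dark" into name/value pairs.
public static class CookieText
{
    public static Dictionary<string, string> Parse(string cookieString)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        if (string.IsNullOrEmpty(cookieString))
        {
            return result; // no cookies at all
        }

        foreach (string part in cookieString.Split(';'))
        {
            string pair = part.Trim();
            if (pair.Length == 0)
            {
                continue; // skip empty segments such as a trailing ";"
            }

            // Only the first '=' separates name and value; the value may contain '='.
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? "" : pair.Substring(eq + 1);
            result[name] = value; // a later duplicate overwrites an earlier one
        }

        return result;
    }
}
```

Inside a DocumentCompleted handler this could be called as CookieText.Parse(webBrowser1.Document.Cookie) and the result checked for the cookie name the site sets after login, whatever that name is for the site in question.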
Many people would think the job is basically done now: the source can be fetched, and all that is left is to add a check in the Navigated event so that only the wanted source is kept. I thought so too, but there are always exceptions. Some well-known sites refresh their pages in place and redirect without changing the URL. I am not clear on the details, but fetching the source suddenly stopped working: stepping through with breakpoints showed that the source I got was incomplete and did not contain the information I needed.
Navigated fires when the control navigates to a new document, so my guess is that it ran too early, before the page had finished loading, and that this is why the source was incomplete. Nothing I changed there gave the result I wanted. Then it occurred to me that the source we get in the DocumentCompleted event is always that of the URL passed to webBrowser1.Navigate(""). So, after the first page had loaded, I triggered the Load handler again to pass a different URL to webBrowser1.Navigate(""), and checked whether the source of that URL could be obtained in DocumentCompleted.
It works once the count handling, the timer trigger, and the checks are adjusted accordingly. The basic requirements have been met!
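The count variable carries three implicit meanings (0: not logged in, 1: just logged in, 2: protected page requested). As a sketch only, the same flow can be written with named steps, which makes the Load handler easier to read. Step, Flow and NextUrl are names invented for this illustration; they do not appear in my project.

```csharp
using System;

// Hypothetical names for the three values the count variable takes.
public enum Step
{
    NeedLogin = 0, // nothing done yet: load the login page
    LoggedIn = 1,  // login confirmed: load the protected page next
    Fetching = 2   // protected page already requested: nothing more to load
}

public static class Flow
{
    // Returns the URL the Load handler should navigate to for the current
    // step, or null when there is nothing to load. Advances the step the
    // same way the original code sets count = 2.
    public static string NextUrl(ref Step step, string loginUrl, string protectedUrl)
    {
        switch (step)
        {
            case Step.NeedLogin:
                return loginUrl;
            case Step.LoggedIn:
                step = Step.Fetching;
                return protectedUrl;
            default:
                return null;
        }
    }
}
```

In Form1_Load this would replace the if/else on count: call Flow.NextUrl with a Step field and navigate only when the result is not null.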
// Requires: using System.IO; using System.Text;
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
    {
        Encoding encoding = Encoding.GetEncoding(webBrowser1.Document.Encoding);
        StreamReader stream = new StreamReader(webBrowser1.DocumentStream, encoding);
        string ss = stream.ReadToEnd(); // full source of the loaded page
        if (ss.Contains("J_Submit")) // this is the login page
        {
            webBrowser1.ScriptErrorsSuppressed = true;
            webBrowser1.Document.GetElementById("TPL_username_1").InnerText = "Account";
            webBrowser1.Document.GetElementById("TPL_password_1").Focus();
            webBrowser1.Document.GetElementById("TPL_password_1").InnerText = "password";
            HtmlDocument doca = this.webBrowser1.Document; // the document currently displayed in webBrowser1
            for (int i = 0; i < doca.All.Count; i++) // loop over every element of the document
            {
                if (doca.All[i].TagName == "BUTTON") // only look at BUTTON elements
                {
                    HtmlElement myelement = doca.All[i]; // keep a reference to this element
                    if (myelement.OuterText == "login" && myelement.Id == "J_SubmitStatic") // text is "login" and the Id matches
                    {
                        myelement.InvokeMember("click"); // simulate a click on this element
                        break;
                    }
                }
            }
        }
        else if (ss.Contains("order information")) // this is the page with the data we want
        {
            // The page has too much content: keep only the part between the two marker strings
            ss = ss.Split(new string[] { "include field" }, StringSplitOptions.None)[1].Split(new string[] { "include field" }, StringSplitOptions.None)[0];
            ss = ss.Replace("\t", "\n"); // strings are immutable, so the result must be assigned back
            StreamWriter sw = new StreamWriter("a.txt", true, System.Text.Encoding.Unicode); // append to a.txt
            sw.WriteLine(DateTime.Now + ":");
            sw.WriteLine(ss);
            sw.WriteLine("------------------------------------------------------------------------------------------------------------------------");
            sw.Close();
        }
    }
}
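One weak spot in the code above: the chained Split(...)[1] throws an IndexOutOfRangeException if the first marker is missing from the page. A small helper based on IndexOf avoids that. This is only a sketch; TextSlice and Between are names I made up for it, and the markers in the usage line are placeholders.

```csharp
using System;

// Hypothetical helper: returns the text between two markers, or null when
// either marker is missing, instead of throwing like Split(...)[1] does.
public static class TextSlice
{
    public static string Between(string text, string startMarker, string endMarker)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(startMarker) || string.IsNullOrEmpty(endMarker))
        {
            return null;
        }

        int start = text.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null; // start marker not on this page
        }
        start += startMarker.Length; // skip past the marker itself

        // Look for the end marker only after the start marker.
        int end = text.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0)
        {
            return null; // end marker not on this page
        }

        return text.Substring(start, end - start);
    }
}
```

With it, the extraction line becomes something like ss = TextSlice.Between(ss, "start marker", "end marker"); followed by a null check before writing to the file.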