The trouble with WebBrowser and multithreading

Source: Internet
Author: User

When writing data-collection (scraping) software, some websites are very troublesome to handle by parsing the HTML text directly.

In WinForms programming there is a better way: analyze an HtmlDocument. However, an HtmlDocument cannot be created directly. It has to be produced by a WebBrowser control: once the control has navigated to a page, the document is available as wb.Document (of type HtmlDocument), and you can then analyze its elements and tags.

In fact, if only a single page is being collected, all of this can be done on the main form.

But suppose you need to collect a set of list pages, N pages in total. Looping over them on the main form makes the form stop responding while WebBrowser works, so the application appears to hang. At this point, multithreading is the natural thing to reach for.

C# threads, as we should all know, run in one of two apartment modes: STA and MTA. The WebBrowser control has an awkward restriction: it can only be used on a thread that is in STA mode.

For example, the worker thread is started like this:

Thread thread = new Thread(new ParameterizedThreadStart(BeginCatch));
thread.SetApartmentState(ApartmentState.STA);
thread.Start(url);
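
As a quick check of the STA requirement, the following minimal console sketch starts a thread in STA mode and prints the apartment state it actually runs in. It is not from the original article: the class name ApartmentDemo is made up for illustration, and the sketch assumes Windows, where apartment states are supported.

```csharp
using System;
using System.Threading;

// Hypothetical demo class, not part of the original article.
class ApartmentDemo
{
    static void Main()
    {
        Thread worker = new Thread(() =>
        {
            // Report the apartment state of the thread this lambda runs on.
            // On Windows this is expected to be STA.
            Console.WriteLine(Thread.CurrentThread.GetApartmentState());
        });

        // The apartment state has to be chosen before the thread starts.
        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        worker.Join();
    }
}
```

Note that SetApartmentState has to be called before Start; it cannot be changed on a thread that is already running.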

 

 

private void BeginCatch(object obj)
{
    string url = obj.ToString();
    WebBrowser wb = new WebBrowser();
    wb.ScriptErrorsSuppressed = true;
    wb.Navigate(url);
    wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
}

To analyze the HtmlDocument produced by WebBrowser, you must do the work in the DocumentCompleted event handler, because only at that point has WebBrowser finished loading the page.
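
The article names a wb_DocumentCompleted handler but does not show its body. As a rough sketch of what such a handler could look like, the example below lists every link in the loaded page. The class name LinkDumper and the choice of collecting "a" elements are assumptions made for illustration only.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical container class; only the handler name comes from the article.
static class LinkDumper
{
    public static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        WebBrowser wb = (WebBrowser)sender;

        // Document is only usable once a page has been loaded.
        HtmlDocument doc = wb.Document;
        if (doc == null)
        {
            return;
        }

        // Walk all anchor elements and print their text and target.
        foreach (HtmlElement link in doc.GetElementsByTagName("a"))
        {
            Console.WriteLine(link.InnerText + " -> " + link.GetAttribute("href"));
        }
    }
}
```

Keep in mind that DocumentCompleted can fire more than once for pages that contain frames, so a real handler may need to filter out the extra calls.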

 

However, this is a trap!

When WebBrowser is used this way on a separate STA thread, DocumentCompleted never fires. BeginCatch returns as soon as Navigate has been called, and the thread, which has no message loop, ends without waiting for the event.

That is, the subsequent analysis never runs!

So what should we do?

Some may think of using the wb.Document.Write(string) method, as follows:

private void BeginCatch(object obj)
{
    string url = obj.ToString();
    WebBrowser wb = new WebBrowser();
    wb.ScriptErrorsSuppressed = true;
    string htmlcode = GetHtmlSource(url);
    wb.Document.Write(htmlcode);
    // Perform the analysis operation
}

// Obtain the web page source code with WebClient
private string GetHtmlSource(string Url)
{
    string text1 = "";
    try
    {
        System.Net.WebClient wc = new System.Net.WebClient();
        text1 = wc.DownloadString(Url);
    }
    catch (Exception exception1)
    {
        // Download errors are ignored; an empty string is returned.
    }
    return text1;
}
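
A different and commonly used workaround, which is not taken from the original text, is to give the worker thread its own message loop so that DocumentCompleted has a chance to fire. The sketch below assumes a WinForms project. The class name StaBrowserWorker and the method name Start are hypothetical, and printing the page title merely stands in for the real analysis.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, not part of the original article.
static class StaBrowserWorker
{
    public static void Start(string url)
    {
        Thread thread = new Thread(() =>
        {
            using (WebBrowser wb = new WebBrowser())
            {
                wb.ScriptErrorsSuppressed = true;
                wb.DocumentCompleted += (sender, e) =>
                {
                    // Placeholder for the real analysis of wb.Document.
                    Console.WriteLine(wb.DocumentTitle);

                    // Stop the message loop on the first completion,
                    // which lets Application.Run() below return.
                    Application.ExitThread();
                };
                wb.Navigate(url);

                // Pump window messages on this thread until ExitThread is called.
                Application.Run();
            }
        });

        // WebBrowser requires an STA thread; set this before starting.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
    }
}
```

Calling StaBrowserWorker.Start(url) once per page keeps each WebBrowser on its own STA thread, at the cost of one thread and one message loop per page.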
