C#: Referencing SHDocVw to Simulate Web Page Operations
A recent project required me to work with web page crawling.
At first I dealt with fairly simple pages: after capturing and analyzing the traffic with Fiddler, I could simulate the HTTP requests and perform the operations I needed.
Later I ran into more complex pages. Implementing a simulated login for them is difficult, and the website uses encryption, so the packets captured by Fiddler could not be analyzed. I therefore switched to the WebBrowser control: the user logs in manually, and the program then completes the subsequent operations automatically.
3. SHDocVw.InternetExplorer
Later I hit some problems that the WebBrowser control could not solve (after clicking a button there was no correct response; I don't know whether the cause was an iframe or a cross-origin JS issue), so I searched online and found SHDocVw.InternetExplorer.
A search turns up plenty of articles about it, covering basic operations such as getting an IE instance, opening a specified URL, obtaining an element on the page, clicking it, and executing JS.
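The basic operations listed above can be sketched roughly as follows. This is only a minimal sketch, assuming a Windows machine with Internet Explorer installed and a project that references the COM libraries "Microsoft Internet Controls" (SHDocVw) and "Microsoft HTML Object Library" (mshtml). The URL, the element id btnQuery, and the script text are placeholders made up for illustration, and the wait loop here is deliberately crude.

```csharp
using System;
using System.Threading;
using SHDocVw;   // COM reference: Microsoft Internet Controls
using mshtml;    // COM reference: Microsoft HTML Object Library

class BasicOperations
{
    [STAThread]
    static void Main()
    {
        // Get IE: start a new Internet Explorer window and show it.
        InternetExplorer ie = new InternetExplorer();
        ie.Visible = true;

        // Open a specified URL (placeholder address).
        object missing = Type.Missing;
        ie.Navigate("http://example.com/", ref missing, ref missing, ref missing, ref missing);

        // Crude wait: poll until IE reports it is idle and the document is complete.
        while (ie.Busy || ie.ReadyState != tagREADYSTATE.READYSTATE_COMPLETE)
        {
            Thread.Sleep(200);
        }

        // Obtain a control by id and click it ("btnQuery" is a hypothetical id).
        IHTMLDocument3 doc3 = (IHTMLDocument3)ie.Document;
        IHTMLElement button = doc3.getElementById("btnQuery");
        if (button != null)
        {
            button.click();
        }

        // Execute JS in the page's window (placeholder script).
        IHTMLDocument2 doc2 = (IHTMLDocument2)ie.Document;
        doc2.parentWindow.execScript("document.title = 'touched by automation';", "javascript");
    }
}
```

Whether execScript is accepted may depend on the IE version and the document mode of the page, so treat that call as something to verify in your own environment.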
However, the most important thing for automation is determining whether the current page has finished loading, and this is rarely discussed online. These are the methods I collected:
The first method checks ReadyState: when ReadyState == tagREADYSTATE.READYSTATE_COMPLETE, the page is considered loaded.
In practice, however, ReadyState does not change in some cases (for example, when a form on some pages is submitted for a query) and stays at complete the whole time, so this check is unreliable.
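If you still want to poll ReadyState, it may help to bound the wait so the program never hangs on a state that does not behave as expected. The helper below is a hypothetical sketch (PageWait and WaitUntil are names invented here, not part of SHDocVw): it polls an arbitrary condition until it becomes true or a timeout expires. It does not fix the problem described above, where the state already reads complete before the new content arrives.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PageWait
{
    // Polls 'condition' every 'pollMs' milliseconds.
    // Returns true as soon as the condition holds, or false once 'timeoutMs' has elapsed.
    public static bool WaitUntil(Func<bool> condition, int timeoutMs, int pollMs)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (true)
        {
            if (condition())
            {
                return true;
            }
            if (clock.ElapsedMilliseconds >= timeoutMs)
            {
                return false;
            }
            Thread.Sleep(pollMs);
        }
    }
}

// Possible use with an SHDocVw.InternetExplorer instance named ie (assumed):
//   bool loaded = PageWait.WaitUntil(
//       () => !ie.Busy && ie.ReadyState == tagREADYSTATE.READYSTATE_COMPLETE,
//       30000, 200);
```

Keeping the condition as a delegate means the same helper could also be tried with other checks, such as one based on StatusText.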
The second method checks whether StatusText contains "completed". If it does, the page has finished loading; if it does not, the page is still loading.
The third method relies on the DocumentComplete event, which is triggered when the page has loaded, so a semaphore can be set in the DocumentComplete handler.
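A minimal sketch of the DocumentComplete approach might look like the following. It assumes a console program on Windows with a COM reference to "Microsoft Internet Controls"; the URL and the 30-second timeout are placeholders chosen for illustration, and an AutoResetEvent stands in for the "semaphore" mentioned above.

```csharp
using System;
using System.Threading;
using SHDocVw;   // COM reference: Microsoft Internet Controls

class DocumentCompleteWait
{
    // Signalled by the DocumentComplete handler, awaited by the main flow.
    static readonly AutoResetEvent pageLoaded = new AutoResetEvent(false);

    static void Main()
    {
        InternetExplorer ie = new InternetExplorer();
        ie.Visible = true;

        // Subscribe before navigating so the event cannot be missed.
        ie.DocumentComplete += OnDocumentComplete;

        object missing = Type.Missing;
        ie.Navigate("http://example.com/", ref missing, ref missing, ref missing, ref missing);

        // Block until the handler signals, or give up after the timeout.
        if (pageLoaded.WaitOne(TimeSpan.FromSeconds(30)))
        {
            Console.WriteLine("Page loaded: " + ie.LocationURL);
        }
        else
        {
            Console.WriteLine("Timed out waiting for DocumentComplete.");
        }
    }

    // Note: on pages with frames this event can fire once per frame,
    // so the first signal may not mean the whole page is ready.
    static void OnDocumentComplete(object pDisp, ref object URL)
    {
        pageLoaded.Set();
    }
}
```

Waiting with a timeout rather than indefinitely keeps the program from hanging if the event never arrives.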