A common method for capturing data from a webpage
First, you need to understand how the target site's pages work. You can use HttpWatch or httplook to inspect the data sent and received over HTTP. Both tools are easy to use, so I will not introduce them here. The main things to look at are the request headers and the POST content. These generally include cookies, the Referer page, and other variables that may be hard to interpret, as well as the ordinary interaction parameters, such as those carried in the query string of a GET request or the body of a POST request.
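Once a request has been captured, its headers and POST content have to be turned into name/value pairs before they can be replayed from code. The following is a minimal sketch of that step, not part of the original post: the class name CapturedRequest and its two methods are hypothetical helpers, and it assumes the header block was copied as plain "Name: value" lines and the body is ordinary URL-encoded form data.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper (not from the original post): converts text copied
// from an HTTP monitoring tool into NameValueCollection objects.
static class CapturedRequest
{
    // Parses a raw header block, one "Name: value" pair per line.
    public static NameValueCollection ParseHeaders(string rawHeaders)
    {
        var result = new NameValueCollection();
        var lines = rawHeaders.Split(new[] { "\r\n", "\n" },
                                     StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;            // no header name on this line
            string name = line.Substring(0, colon);
            if (name.Contains(" ")) continue;    // request line, not a header
            result.Add(name.Trim(), line.Substring(colon + 1).Trim());
        }
        return result;
    }

    // Parses a query string or a URL-encoded POST body such as "a=1&b=x+y".
    public static NameValueCollection ParseForm(string body)
    {
        var result = new NameValueCollection();
        var pairs = body.TrimStart('?').Split(new[] { '&' },
                                              StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? "" : pair.Substring(eq + 1);
            result.Add(Decode(name), Decode(value));
        }
        return result;
    }

    // In form encoding '+' stands for a space; convert it before percent-decoding.
    private static string Decode(string s)
    {
        return Uri.UnescapeDataString(s.Replace('+', ' '));
    }
}
```

For example, CapturedRequest.ParseForm("loginfield=username&username=jinjazz") yields a collection whose "username" entry is "jinjazz", which can then be passed to an upload call.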
Both httplook and HttpWatch are widely available for download. HttpWatch is recommended here because it can be embedded directly into IE, which I personally find more convenient. Both tools can be downloaded from the resources I uploaded to CSDN. The address is
Http://download.csdn.net/user/jinjazz
Here is a piece of C# code that captures data. As an example, it logs on to a website and retrieves the HTML returned after a successful logon, for later data analysis.
    private void login()
    {
        System.Net.WebClient wb = new System.Net.WebClient();

        // Request headers copied from the monitored session
        System.Collections.Specialized.NameValueCollection header = new System.Collections.Specialized.NameValueCollection();
        header.Add("Cookie", "czJ_cookietime=2592000; czJ_onlineusernum=1651; czJ_sid=w4bGJd");
        header.Add("Referer", @"http://hovertree.net/bbs/login.php");
        wb.Headers.Add(header);

        // Form fields sent in the POST body
        System.Collections.Specialized.NameValueCollection data = new System.Collections.Specialized.NameValueCollection();
        data.Add("formhash", "ebd2faac");
        data.Add("referer", "http://hovertree.net/bbs/search.php");
        data.Add("loginfield", "username");
        data.Add("username", "jinjazz");
        data.Add("password", "999");
        data.Add("questionid", "0");
        data.Add("answer", "");
        data.Add("cookietime", "2592000");
        data.Add("loginmode", "");
        data.Add("styleid", "");
        data.Add("loginsubmit", "Submit");

        // Post the form and print the returned HTML
        byte[] b = wb.UploadValues("http://hovertree.net/bbs/login.php", "POST", data);
        string strData = System.Text.Encoding.Default.GetString(b);
        Console.WriteLine(strData);
    }
Apart from the three URLs, every value in the code above comes from a captured session: the parameters and values in header and data were obtained by monitoring the traffic with HttpWatch.
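For the data-analysis step, the returned HTML usually has to be searched for specific values. The sketch below is one simple way to do that, offered as an illustration rather than the author's method: the class HtmlFields and its method HiddenValue are hypothetical, and it assumes the value of interest (for instance a field such as formhash) appears in a plain input tag with quoted attributes. A regular expression is only adequate for simple, predictable markup.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): reads the value
// attribute of a named <input> element from an HTML string.
static class HtmlFields
{
    // Returns the decoded value of <input name="NAME" value="...">,
    // or null when no such tag or attribute is found.
    public static string HiddenValue(string html, string name)
    {
        // Step 1: locate the whole <input> tag that carries the requested name.
        Match tag = Regex.Match(
            html,
            "<input\\b[^>]*\\bname\\s*=\\s*[\"']" + Regex.Escape(name) + "[\"'][^>]*>",
            RegexOptions.IgnoreCase);
        if (!tag.Success) return null;

        // Step 2: read the value attribute inside that tag, in any attribute order.
        Match value = Regex.Match(
            tag.Value,
            "\\bvalue\\s*=\\s*[\"']([^\"']*)[\"']",
            RegexOptions.IgnoreCase);
        return value.Success ? WebUtility.HtmlDecode(value.Groups[1].Value) : null;
    }
}
```

Called as HtmlFields.HiddenValue(strData, "formhash"), it returns the field's value if the page contains such an input and null otherwise.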