Solution for crawling educational administration system data with curl
Source: Internet
Author: User
Hello, I have come to consult you again.
The code is as follows:
<?php
header("Content-Type: text/html; charset=utf-8");
require_once 'search.php';
// Step 1: submit the login data, generate a cookie, and save the cookie in the temporary directory
$cookiejar = realpath('cookie.txt');
$id = $_GET['id'];
$password = $_GET['password'];
$year = $_GET['Year'];
$term = $_GET['term'];
$ch = curl_init();
$login_url = "http://211.67.32.51/default3.aspx";
$curlPost = "__VIEWSTATE=signature%2BO2w8bzxmPjs%2BPjs7Pjs%signature%2BOz4%2BOzs%2BOz4%signature%2FCbCuTw%3D&tbYHM=k061138526&tbPSW=100311&ddlSF=students&imgDL.x=40&imgDL.y=7";
$curlPost = iconv("UTF-8", "GBK", $curlPost);
curl_setopt($ch, CURLOPT_URL, $login_url);
curl_setopt($ch, CURLOPT_PROXY, 'jackdowosn.gnway.net:81');
// When enabled, the response headers are included in the returned data
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/4");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://211.67.32.51/');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// Sets the file in which cookie information is stored after the connection ends
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiejar);
$data = curl_exec($ch);
// $data = mb_convert_encoding($data, "UTF-8", "GBK");
// echo ' ' . $data . ' ';
// Step 2: request the score page on the same handle
$curlPost = "xh=k061110826";
$curlPost = iconv("UTF-8", "GBK", $curlPost);
curl_setopt($ch, CURLOPT_URL, "http://211.67.32.51/xscj.aspx?xh=k061138526");
curl_setopt($ch, CURLOPT_PROXY, 'jackdowosn.gnway.net:81');
// When enabled, the response headers are included in the returned data
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/4");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://211.67.32.51/');
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// Sets the file from which the previously saved cookie information is read
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiejar);
$data = curl_exec($ch);
$data = mb_convert_encoding($data, "UTF-8", "GBK");
preg_match_all('/\ /i', $data, $matches);
// The s modifier cannot be added to the pattern above
// file_put_contents("d://value.txt", $matches[1][0]);
// echo var_dump($matches[1][0])."
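Two details in the listing above may be worth checking. This is a hedged sketch, not a confirmed diagnosis. First, realpath() returns false when cookie.txt does not exist yet, so CURLOPT_COOKIEJAR may not receive a usable path on the first run. Second, the second request sets CURLOPT_POST to 0 and then sets CURLOPT_POSTFIELDS, which according to the libcurl documentation implies a POST again. The helper names cookie_jar_path and score_page_options are hypothetical and do not appear in the original code.

```php
<?php
// Hypothetical helper (not in the original post): build the cookie file path
// without realpath(), which returns false when the file does not exist yet.
function cookie_jar_path(string $dir, string $name = 'cookie.txt'): string
{
    return rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
}

// Hypothetical helper: cURL options for the second request as a plain GET.
// CURLOPT_POSTFIELDS is left out on purpose, since setting it implies a POST.
function score_page_options(string $url, string $cookieFile, string $referer): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_HTTPGET        => true,        // force the method back to GET
        CURLOPT_COOKIEFILE     => $cookieFile, // send the cookies saved at login
        CURLOPT_COOKIEJAR      => $cookieFile, // keep saving any new cookies
        CURLOPT_REFERER        => $referer,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => false,
    ];
}

$cookiejar = cookie_jar_path(__DIR__);
// With the handle from the listing above, the second request could be set up as:
// curl_setopt_array($ch, score_page_options(
//     'http://211.67.32.51/xscj.aspx?xh=k061138526', $cookiejar, 'http://211.67.32.51/'));
```

Returning the options as an array keeps the GET configuration in one place, which makes it easy to see that no POST body is attached to the score page request.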
My program runs into a problem when execution reaches search3, whereas my other programs return data normally. I asked a senior student, and his answer was: "I don't know. As far as I remember, we ran into this problem with the Zhengfang system before. The parameters may be passed incorrectly, the encoding may be wrong, or the Referer may not be set." Please help me find where the problem is. If you are interested, you are welcome to debug it for me; the proxy servers really are available. Below are several POST parameters and the header information.
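The first two guesses quoted above (wrong parameters, wrong encoding) can be tested with a small sketch like the one below. It assumes the login page is an ASP.NET form that contains a hidden __VIEWSTATE input, so a hard-coded value may be stale; that assumption is not confirmed by the post. The helper names extract_viewstate and build_post_body are hypothetical. The field names tbYHM, tbPSW, ddlSF, imgDL.x and imgDL.y are taken from the code in the question, and the sample HTML is made up for illustration.

```php
<?php
// Hypothetical helper: read the current __VIEWSTATE hidden field from page HTML.
function extract_viewstate(string $html): ?string
{
    $pattern = '/<input[^>]*name="__VIEWSTATE"[^>]*value="([^"]*)"/i';
    return preg_match($pattern, $html, $m) ? html_entity_decode($m[1]) : null;
}

// Hypothetical helper: convert each UTF-8 value to the target charset first,
// then URL-encode, so '+', '/' and '=' inside __VIEWSTATE are escaped exactly once.
function build_post_body(array $fields, string $charset = 'GBK'): string
{
    $converted = [];
    foreach ($fields as $name => $value) {
        $bytes = iconv('UTF-8', $charset, (string) $value);
        if ($bytes === false) {
            throw new RuntimeException("Cannot convert field $name to $charset");
        }
        $converted[$name] = $bytes;
    }
    return http_build_query($converted, '', '&');
}

// Usage sketch: $loginHtml would normally come from a GET of the login page.
// The HTML below is made-up sample data, not the real page.
$loginHtml = '<input type="hidden" name="__VIEWSTATE" value="dDw+example/value=" />';
$body = build_post_body([
    '__VIEWSTATE' => extract_viewstate($loginHtml),
    'tbYHM'       => 'k061138526',
    'tbPSW'       => '100311',
    'ddlSF'       => 'students',
    'imgDL.x'     => 40,
    'imgDL.y'     => 7,
]);
// $body can then be passed to CURLOPT_POSTFIELDS for the login request.
```

For the third guess, calling curl_getinfo($ch, CURLINFO_HTTP_CODE) and curl_getinfo($ch, CURLINFO_EFFECTIVE_URL) after each curl_exec() should show whether the server answered with an error or redirected back to the login page.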