Solution for crawling educational administration system data with cURL

Source: Internet
Author: User
Crawling educational administration system data with cURL
Hello everyone, I have another question for you. Here is my situation.
The code looks like this:


<?php
header("Content-Type: text/html; charset=utf-8");
require_once 'search.php';
// Step 1: submit the login data, generate a cookie, and save it in a temporary directory
$cookiejar = realpath('cookie.txt');
$id = $_GET['id'];
$password = $_GET['password'];
$year = $_GET['year'];
$term = $_GET['term'];
$ch = curl_init();
$login_url = "http://211.67.32.51/default3.aspx";
$curlPost = "__viewstate=ddw5nti3mzm0ntq7ddw7bdxppde%2bo2k8nt47pjtsphq8o2w8atw4pjtppdexpjs%2bo2w8ddxwpdtwpgw8b25jbgljazs%2bo2w8d2luzg93lmnsb3nlkclcozs%2bpj47oz47ddxwpgw8vmlzawjszts%2bo2w8bzxmpjs%2bpjs7pjs%2bpjt0pha8bdxwaxnpymxloz47bdxvpgy%2boz4%2bozs%2boz4%2bo2w8aw1nrew7aw1nvem7aw1nuu1noz4%2biyfpvg3fujyu8xx773lo%2fcbcutw%3d&tbyhm=k061141026&tbpsw=100311&ddlsf=Students&imgDL.x=40&imgDL.y=7";
$curlPost = iconv("UTF-8", "GBK", $curlPost);
curl_setopt($ch, CURLOPT_URL, $login_url);
curl_setopt($ch, CURLOPT_PROXY, 'jackdowosn.gnway.net:81');
// When enabled, the response headers are included in the output stream
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/4");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://211.67.32.51/');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// File in which the cookies are saved when the connection ends
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiejar);
$data = curl_exec($ch);
$data = mb_convert_encoding($data, "UTF-8", "GBK");
echo '<xmp>' . $data . '</xmp>';
$curlPost = "xh=k061141026";
$curlPost = iconv("UTF-8", "GBK", $curlPost);
curl_setopt($ch, CURLOPT_URL, "http://211.67.32.51/xscj.aspx?xh=K061141026");
curl_setopt($ch, CURLOPT_PROXY, 'jackdowosn.gnway.net:81');
// When enabled, the response headers are included in the output stream
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/4");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://211.67.32.51/');
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// File from which the saved cookies are read
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiejar);
$data = curl_exec($ch);
$data = mb_convert_encoding($data, "UTF-8", "GBK");
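// Extract the hidden __VIEWSTATE value (the pattern assumes the standard ASP.NET hidden-input markup)
preg_match_all('/<input type="hidden" name="__VIEWSTATE" value="(.*?)" \/>/i', $data, $matches);
// Adding the s modifier to the pattern above does not work
file_put_contents("D://value.txt", $matches[1][0]);
var_dump($matches[1][0]);
echo "\n\n\n\n\n";
echo $matches[1][0];
echo '<xmp>' . $data . '</xmp>';
echo Search3($id, $year, $term, $ch, $matches[1][0]);
?>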
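
The first step depends on a cookie file, and realpath() returns false when that file does not exist yet, so no usable path reaches CURLOPT_COOKIEJAR. A minimal sketch of one way to avoid this, assuming the jar may live in the system temporary directory; the helper name cookieJarPath and the file name curl_cookie_demo.txt are made up for illustration and are not part of the original code:

```php
<?php
// Hypothetical helper: returns an absolute path to a cookie jar file that exists.
function cookieJarPath(string $fileName): string
{
    $dir = realpath(sys_get_temp_dir());
    if ($dir === false) {
        throw new RuntimeException('Temporary directory is not available');
    }
    $path = $dir . DIRECTORY_SEPARATOR . $fileName;
    if (!file_exists($path)) {
        touch($path); // create an empty jar so the path is valid from the start
    }
    return $path;
}

// realpath() on a file that does not exist yields false, not a path.
$missing   = realpath('no-such-cookie-file.txt');
$cookiejar = cookieJarPath('curl_cookie_demo.txt');

var_dump($missing);
var_dump(is_file($cookiejar));
```

The returned path could then be passed to both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE.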


<?php
function Search3($id, $year, $term, $ch, $value) {
    // $cookiejar from the calling script is not visible inside the function, so resolve it again
    $cookiejar = realpath('cookie.txt');
    $data = file_get_contents("D://value.txt");
    curl_setopt($ch, CURLOPT_PROXY, 'jackdowosn.gnway.net:81');
    $curlPost = "xh=k061141026&__viewstate=$value&button2=according to the term of the academic year&ddlkclx=compulsory&xn=2012-2013&xq=1";
    $curlPost = iconv("UTF-8", "GBK", $curlPost);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/4");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_URL, "http://211.67.32.51/xscj.aspx");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_REFERER, "http://211.67.32.51/xscj.aspx?xh=K061141026");
    curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiejar); // send the saved cookie back
    $data = curl_exec($ch);
    curl_close($ch);
    $data = mb_convert_encoding($data, "UTF-8", "GBK");
    /*preg_match_all('/\\s*\ (.*?) \<\/td\>\s*\ (. *?) \<\/td\>/is', $data, $matches);
    foreach ($matches[1] as $key => $val)
        $nav = $nav . "\n" . $val . "---" . $matches[2][$key];*/
    return $data;
}


The program above works until it reaches Search3: every other request returns data normally, but that one does not. I asked a senior student, and his answer was: "I'm not sure. As I recall from when we worked on this problem with the Zhengfang system, the parameters may be wrong, the encoding may be wrong, or the Referer header may not have been set." Could someone help me find where the problem is? If you are interested, you can debug it yourself; the proxy server is real and available. Here are a few of the POST parameters and the header information.
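
Two of the suspected causes, wrong parameters and wrong encoding, can be narrowed down without touching the network. A Base64 view state usually contains +, / and =, and a + that is interpolated into the POST string unescaped is decoded as a space on the server side. A minimal sketch, assuming the page uses the standard ASP.NET hidden-input markup with name before value; extractViewState, buildPostBody and the sample HTML are hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: pull the hidden __VIEWSTATE value out of an ASP.NET page.
// Assumes markup like <input type="hidden" name="__VIEWSTATE" value="...">.
function extractViewState(string $html): ?string
{
    if (preg_match('/name="__VIEWSTATE"[^>]*\bvalue="([^"]*)"/i', $html, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: build an application/x-www-form-urlencoded body.
// Each value is converted to the target charset first and percent-encoded
// afterwards, so '+', '/' and '=' in the view state survive the POST.
function buildPostBody(array $fields, string $charset = 'GBK'): string
{
    $pairs = [];
    foreach ($fields as $name => $value) {
        $converted = iconv('UTF-8', $charset, (string) $value);
        if ($converted === false) {
            throw new RuntimeException("Cannot convert field $name to $charset");
        }
        $pairs[] = urlencode($name) . '=' . urlencode($converted);
    }
    return implode('&', $pairs);
}

// Made-up sample page, used only to exercise the helpers.
$sampleHtml = '<input type="hidden" name="__VIEWSTATE" value="dDw+O2k8/A==" />';
$viewState  = extractViewState($sampleHtml);
$body       = buildPostBody(['xh' => 'K061141026', '__VIEWSTATE' => $viewState]);

echo $body, "\n"; // xh=K061141026&__VIEWSTATE=dDw%2BO2k8%2FA%3D%3D
```

In the original code this would amount to passing the result of buildPostBody() to CURLOPT_POSTFIELDS instead of the hand-built string; whether that resolves the Search3 failure is untested.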