Using PHP to simulate login and crawl timetables and empty classrooms, taking the Zhengfang educational system as an example

Source: Internet
Author: User
College students are presumably very familiar with the two apps Course Lattice and Super Curriculum: enter your student number and educational system password, and the app imports your timetable so you can check it on your phone anytime, anywhere.

In fact, with a little knowledge of PHP, we can build a similar web application ourselves.

1, Getting around the verification code

This is actually a small bug in Zhengfang. When we open the login page, the browser requests a CAPTCHA image from the server, and the server generates that image. If we never request the image, the Zhengfang backend does not generate a corresponding verification code. That gives us an opening: we can leave the verification code blank and still log in without trouble. To check this yourself, block access to the verification code address on your computer and then try logging in. Of course, this only works for Zhengfang.

2, Simulating login with PHP's cURL

For an explanation of cURL, here is an article from Script House: http://www.jb51.net/article/51299.htm

Next comes the relevant code. I suspect many people are like me: they prefer to look at examples and walk away from lengthy explanations. It is not a good habit, though. Enough talk!

    // Simulated login
    function curl_request($url, $post = '', $cookie = '', $returnCookie = 0) {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)');
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
        // Fill in the login URL of your educational system here
        curl_setopt($curl, CURLOPT_REFERER, "This must be replaced by the educational system login URL");
        if ($post) {
            curl_setopt($curl, CURLOPT_POST, 1);
            curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post));
        }
        if ($cookie) {
            curl_setopt($curl, CURLOPT_COOKIE, $cookie);
        }
        // When $returnCookie is set, the response headers are included in the output
        curl_setopt($curl, CURLOPT_HEADER, $returnCookie);
        curl_setopt($curl, CURLOPT_TIMEOUT, 20);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        $data = curl_exec($curl);
        if (curl_errno($curl)) {
            return curl_error($curl);
        }
        curl_close($curl);
        if ($returnCookie) {
            // Split the response headers from the body
            list($header, $body) = explode("\r\n\r\n", $data, 2);
            // Header names are case-insensitive, hence the "i" flag
            preg_match_all("/Set\-Cookie:([^;]*);/i", $header, $matches);
            $info['cookie'] = trim($matches[1][0]);
            $info['content'] = $body;
            return $info;
        } else {
            return $data;
        }
    }
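To see what the cookie handling at the end of that function is doing, here is a small offline sketch. The helper name splitResponse and the sample response are made up for illustration and are not part of the original code. It splits a raw response (headers plus body, as cURL returns it when CURLOPT_HEADER is on) and collects the name=value part of every Set-Cookie header.

```php
<?php
// Hypothetical helper (not from the original article): split a raw HTTP
// response into headers and body and collect all Set-Cookie pairs.
function splitResponse(string $raw): array
{
    // The headers end at the first blank line.
    $parts  = explode("\r\n\r\n", $raw, 2);
    $header = $parts[0];
    $body   = $parts[1] ?? '';

    // "i": header names are case-insensitive; "m": ^ matches at each line start.
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $header, $matches);

    return [
        'cookie'  => implode('; ', $matches[1]),
        'content' => $body,
    ];
}

// Made-up sample response, for illustration only.
$raw = "HTTP/1.1 200 OK\r\n"
     . "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
     . "Content-Type: text/html\r\n"
     . "\r\n"
     . "<html>ok</html>";

$info = splitResponse($raw);
echo $info['cookie'], "\n";   // ASP.NET_SessionId=abc123
```

The returned cookie string is in the form that CURLOPT_COOKIE accepts, so it can be passed straight back on the next request.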

3, Hidden fields on the educational system login page

Here is an example: the login page carries a hidden __VIEWSTATE field.

These values also have to be submitted when logging in. Here are the functions that fetch them, using the blogger's own school as the example ... the "Royal University of Farming". They mainly rely on regular expressions.


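    // Hidden field on the login page
    function getView() {
        $url = 'http://jw.hzau.edu.cn/default2.aspx';
        $result = curl_request($url);
        // Assumed pattern: captures the value of the hidden __VIEWSTATE input.
        // Adjust it to the markup your login page actually uses.
        $pattern = '/<input type="hidden" name="__VIEWSTATE" value="(.*?)" \/>/is';
        preg_match_all($pattern, $result, $matches);
        $res[0] = $matches[1][0];
        return $res[0];
    }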

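    // Returns the hidden value of the classroom query page
    private function getViewJs($cookie, $xh) {
        $url = "http://jw.hzau.edu.cn/xxjsjy.aspx?xh={$xh}";
        $result = curl_request($url, '', $cookie);
        // Assumed pattern: captures the value of the hidden __VIEWSTATE input.
        // Adjust it to the markup the query page actually uses.
        $pattern = '/<input type="hidden" name="__VIEWSTATE" value="(.*?)" \/>/is';
        preg_match_all($pattern, $result, $matches);
        $res[0] = $matches[1][0];
        return $res[0];
    }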
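If your school's page carries more hidden fields than __VIEWSTATE, it may be easier to collect all of them in one pass. The sketch below is only an illustration: hiddenFields is a made-up helper name, the sample HTML is invented, and it assumes double-quoted attributes.

```php
<?php
// Hypothetical helper (not from the original article): return every hidden
// <input> on a page as a name => value array.
function hiddenFields(string $html): array
{
    $fields = [];
    // Find each <input ...> tag first, then read its attributes separately
    // so that the attribute order does not matter.
    preg_match_all('/<input\b[^>]*>/i', $html, $tags);
    foreach ($tags[0] as $tag) {
        if (!preg_match('/\stype\s*=\s*"hidden"/i', $tag)) {
            continue; // not a hidden field
        }
        if (preg_match('/\sname\s*=\s*"([^"]*)"/i', $tag, $name)) {
            preg_match('/\svalue\s*=\s*"([^"]*)"/i', $tag, $value);
            $fields[$name[1]] = html_entity_decode($value[1] ?? '');
        }
    }
    return $fields;
}

// Invented sample markup, for illustration only.
$html = '<form><input type="hidden" name="__VIEWSTATE" value="dDwtMTg3" />'
      . '<input type="text" name="txtUserName" /></form>';

$fields = hiddenFields($html);
echo $fields['__VIEWSTATE'], "\n";   // dDwtMTg3
```

The resulting array can be merged into the $post array before the login request, so nothing the form expects is left out.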

4, Obtaining the cookie

    function login($xh, $pwd) {
        $url = 'http://jw.hzau.edu.cn/default2.aspx';
        $post['__VIEWSTATE'] = $this->getView();
        $post['txtusername'] = $xh; // student number
        $post['TextBox2'] = $pwd; // password
        $post['txtsecretcode'] = ''; // verification code, left empty
        $post['lblanguage'] = '';
        $post['Hidpdrs'] = '';
        $post['hidsc'] = '';
        // The two values below are assumed to be the Chinese text the form expects
        $post['RadioButtonList1'] = iconv('utf-8', 'gb2312', '学生'); // "student"
        $post['Button1'] = iconv('utf-8', 'gb2312', '登录'); // "login"
        $result = curl_request($url, $post, '', 1);
        return $result['cookie'];
    }
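The iconv calls are there because the script is written in UTF-8 while the form values are sent as GB2312. As a rough sketch of that step on its own, the helper below converts a whole array of form values at once. The name buildGbBody, the sample student number and the Chinese value 学生 ("student") are assumptions for illustration.

```php
<?php
// Hypothetical helper (not from the original article): convert every value
// of a UTF-8 form array to GB2312 and URL-encode the result.
function buildGbBody(array $fields): string
{
    $converted = [];
    foreach ($fields as $name => $value) {
        $gb = iconv('UTF-8', 'GB2312', (string) $value);
        if ($gb === false) {
            // iconv() returns false when a character has no GB2312 equivalent.
            throw new RuntimeException("Cannot convert field {$name} to GB2312");
        }
        $converted[$name] = $gb;
    }
    // http_build_query() percent-encodes the raw GB2312 bytes.
    return http_build_query($converted);
}

$body = buildGbBody([
    'txtusername'      => '2012000000', // made-up student number
    'RadioButtonList1' => '学生',        // "student"
]);
echo $body, "\n";
```

Plain ASCII values pass through unchanged, so only the Chinese button and role labels are affected by the conversion.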

5, Querying the timetable. The formatting is a bit messy, so please bear with it. I turn the timetable into a two-dimensional associative array.

    // Returns the timetable as an array
    private function classResult($xh, $pwd) {
        date_default_timezone_set("PRC"); // time zone setting
        $cookie = $this->login($xh, $pwd);
        $view = $this->getViewJs($cookie, $xh); // verify that the password is correct
        // If the password is correct
        if (!empty($view)) {
            $url = "http://jw.hzau.edu.cn/xskbcx.aspx?xh={$xh}";
            $result = curl_request($url, '', $cookie); // use the saved cookie
            // Assumed patterns: the opening <table ...> and <td ...> parts are
            // reconstructions, adjust them to the real page markup
            preg_match_all('/<table id="Table1"[\w\W]*?>([\w\W]*?)<\/table>/', $result, $out);
            $table = $out[0][0]; // the whole timetable
            preg_match_all('/<td [\w\W]*?>([\w\W]*?)<\/td>/', $table, $out);
            $td = $out[1];
            $length = count($td);
            // Keep only the cells that contain a course
            for ($i = 0; $i < $length; $i++) {
                $td[$i] = str_replace("<br>", "", $td[$i]);
                $reg = "/{(.*)}/";
                if (!preg_match_all($reg, $td[$i], $matches)) {
                    unset($td[$i]);
                }
            }
            $td = array_values($td); // re-index the list of courses
            $tdLength = count($td);
            for ($i = 0; $i < $tdLength; $i++) {
                $td[$i] = iconv('GB2312', 'UTF-8', $td[$i]);
            }
            // Convert the timetable into array form
            $convertToTable = function ($table) {
                // Weekday names as assumed to appear in the cells (Sunday to Saturday)
                $week = array("sun" => "周日", "mon" => "周一", "tues" => "周二", "wed" => "周三", "thur" => "周四", "fri" => "周五", "sat" => "周六");
                $order = array('1,2', '3,4', '5,6', '7,8', '9,10');
                // One empty slot per weekday and class period
                $list = array();
                foreach ($week as $day => $weekDay) {
                    $list[$day] = array_fill_keys($order, '');
                }
                foreach ($table as $class) {
                    $weekArrayDay = null;
                    $weekArrayOrder = null;
                    foreach ($week as $day => $weekDay) {
                        // strpos() returns 0 for a match at the start, so compare with false
                        if (strpos($class, $weekDay) !== false) {
                            $weekArrayDay = $day; // first-dimension key in the list array
                            foreach ($order as $orderClass) {
                                if (strpos($class, $orderClass) !== false) {
                                    $weekArrayOrder = $orderClass; // class periods of the course
                                    break;
                                }
                            }
                            break;
                        }
                    }
                    if ($weekArrayDay !== null && $weekArrayOrder !== null) {
                        $list[$weekArrayDay][$weekArrayOrder] = $class;
                    }
                }
                return $list;
            };
            // Call the function
            return $convertToTable($td);
        } else {
            return 0;
        }
    }

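The lookup that places a course in the right weekday and period relies on strpos, which has a well-known trap: it returns 0 when the match is at the very start of the string, and 0 is falsy. The sketch below isolates that lookup. The helper name locateCell and the sample cell text are made up, and it assumes the cells contain the Chinese weekday names 周日 to 周六 (Sunday to Saturday).

```php
<?php
// Hypothetical helper (not from the original article): work out which
// weekday and which class periods a timetable cell refers to.
function locateCell(string $cell): ?array
{
    $week = [
        'sun' => '周日', 'mon' => '周一', 'tues' => '周二', 'wed' => '周三',
        'thur' => '周四', 'fri' => '周五', 'sat' => '周六',
    ];
    $order = ['1,2', '3,4', '5,6', '7,8', '9,10'];

    $day = null;
    foreach ($week as $key => $label) {
        // A match at position 0 is still a match, so compare with false.
        if (strpos($cell, $label) !== false) {
            $day = $key;
            break;
        }
    }

    $slot = null;
    foreach ($order as $periods) {
        if (strpos($cell, $periods) !== false) {
            $slot = $periods;
            break;
        }
    }

    // null means the cell could not be placed in the timetable grid.
    return ($day !== null && $slot !== null) ? [$day, $slot] : null;
}

// Made-up sample cell: "Monday, periods 1,2, weeks 1 to 16".
var_dump(locateCell('周一第1,2节{第1-16周}'));
```

With a plain if (strpos(...)) check, a cell that begins with the weekday name, like the sample above, would be skipped.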
6, Querying empty classrooms

    // Empty classroom query result
    public function roomResult() {
        $xh = ""; // student number
        $pwd = ""; // password for that student number
        $cookie = $this->login($xh, $pwd);
        $url = "http://jw.hzau.edu.cn/xs_main.aspx?xh={$xh}";
        $result = curl_request($url, '', $cookie); // use the saved cookie
        $url = "http://jw.hzau.edu.cn/xxjsjy.aspx?xh={$xh}";
        // The Chinese values below are assumed to be the text the form expects
        $post['Button2'] = iconv('utf-8', 'gb2312', '空教室查询'); // "empty classroom query"
        $post['__EVENTARGUMENT'] = '';
        $post['__EVENTTARGET'] = '';
        $post['__VIEWSTATE'] = $this->getViewJs($cookie, $xh);
        $post['ddldsz'] = iconv('utf-8', 'gb2312', '单'); // "single"
        $post['ddlsyxn'] = '2014-2015'; // academic year
        $post['ddlsyxq'] = '1';
        $post['jslb'] = '';
        $post['xiaoq'] = '';
        $post['kssj'] = $_GET['start']; // submitted start time of the query
        $post['sjd'] = $_GET['class']; // submitted class periods
        $post['xn'] = '2014-2015'; // school year
        $post['xq'] = '2'; // semester
        $post['xqj'] = '6'; // day of the week
        $post['dpdatagrid1:txtpagesize'] = 90; // number of rows per page
        $result = curl_request($url, $post, $cookie, 0);
        // Assumed patterns: the opening <span and <table parts are
        // reconstructions, adjust them to the real page markup
        preg_match_all('/<span[^>]+>[^>]+span>/', $result, $out);
        $tip = iconv('gb2312', 'utf-8', $out[0][3]); // prompt text near the top of the page
        preg_match_all('/<table[\w\W]*?>([\w\W]*?)<\/table>/', $result, $out);
        $table = iconv('gb2312', 'utf-8', $out[0][0]); // the query result list
        $this->load->view("classroom", array('tip' => $tip, 'table' => $table));
    }
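The function above hands the raw table HTML to the view. If you would rather work with the rows as data, something like the following sketch could be used on the converted table string. tableToRows is a made-up helper name and the sample table is invented; real pages with nested tables would need a proper HTML parser.

```php
<?php
// Hypothetical helper (not from the original article): turn the rows of a
// simple HTML table into arrays of plain cell text.
function tableToRows(string $tableHtml): array
{
    $rows = [];
    // "s" lets . match newlines, "i" ignores tag case.
    preg_match_all('/<tr\b[^>]*>(.*?)<\/tr>/is', $tableHtml, $trs);
    foreach ($trs[1] as $tr) {
        preg_match_all('/<t[dh]\b[^>]*>(.*?)<\/t[dh]>/is', $tr, $cells);
        // Drop inner tags, decode entities and trim whitespace in each cell.
        $rows[] = array_map(
            static fn(string $c): string => trim(html_entity_decode(strip_tags($c))),
            $cells[1]
        );
    }
    return $rows;
}

// Invented sample table, for illustration only.
$table = '<table><tr><th>Room</th><th>Seats</th></tr>'
       . '<tr><td>A101</td><td> 60 </td></tr></table>';

print_r(tableToRows($table));
```

Each row comes back as a plain array, which is easier to filter or to encode as JSON than a block of HTML.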
 
   

  
 

That sums it all up. Every school's educational system is a little different, so use Firebug in Firefox to capture the requests and see exactly what gets submitted. If your login fails, check whether you are posting everything you should. If it still fails, well ... you can contact me at imzhongshan@126.com
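One way to do that comparison is to paste the POST body captured in the browser next to the array your script sends and let PHP list the differences. This is only a sketch: compareFields is a made-up helper name and the captured body below is invented.

```php
<?php
// Hypothetical helper (not from the original article): list the field names
// the browser sent that the script does not send, and the other way round.
function compareFields(array $browser, array $script): array
{
    return [
        'missing' => array_keys(array_diff_key($browser, $script)),
        'extra'   => array_keys(array_diff_key($script, $browser)),
    ];
}

// Invented POST body as it might be copied from the browser's network panel.
parse_str('__VIEWSTATE=abc&txtusername=1&TextBox2=2&Button1=x', $browser);

// The fields the script currently posts.
$script = ['__VIEWSTATE' => 'abc', 'txtusername' => '1', 'TextBox2' => '2'];

print_r(compareFields($browser, $script));
```

Anything listed under missing is a field the form expects but the script never posts, which is the usual reason a simulated login fails.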

That's all, go ahead and try it!
