Simulating a Login to the Zhengfang Educational Administration System with PHP to Crawl the Timetable

Source: Internet
Author: User

This article shows how to use PHP to simulate a login to the Zhengfang educational administration system and crawl the timetable. Readers who need this can use it as a reference.

College students are presumably familiar with the apps Course Lattice and Super Curriculum: enter your student number and your educational system password, and they import your timetable so that you can view it on your phone anytime, anywhere.

In fact, with a little PHP, we can build a similar web application ourselves.

1. Getting past the verification code

This is really a small bug in Zhengfang. When we open the login page, the browser requests a CAPTCHA image from the server, and only then does the server generate one. If we never request that image, the Zhengfang back end never generates the corresponding verification code, which gives us an opening (let me enjoy that for a moment ~). We can then leave the verification code field empty and still log in without trouble. You can check this by blocking the verification code address on your computer and then trying to log in. Of course, this only works for Zhengfang.

2. Simulating the login with PHP's cURL

Next comes the code. I believe many people are like me: we like to look at examples, and when we meet a lengthy explanation we turn around and leave... It is not a good habit, though. Enough talk!

// Simulated login
function curl_request($url, $post = '', $cookie = '', $returnCookie = 0) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)');
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
    // Fill in the login URL of your educational system here
    curl_setopt($curl, CURLOPT_REFERER, "This must be replaced by the educational system login URL");
    if ($post) {
        curl_setopt($curl, CURLOPT_POST, 1);
        curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post));
    }
    if ($cookie) {
        curl_setopt($curl, CURLOPT_COOKIE, $cookie);
    }
    curl_setopt($curl, CURLOPT_HEADER, $returnCookie);  // include response headers only when the cookie is wanted
    curl_setopt($curl, CURLOPT_TIMEOUT, 20);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($curl);
    if (curl_errno($curl)) {
        return curl_error($curl);
    }
    curl_close($curl);
    if ($returnCookie) {
        // Split the headers from the body, then take the first cookie from the headers
        list($header, $body) = explode("\r\n\r\n", $data, 2);
        preg_match_all("/Set\-Cookie:([^;]*);/i", $header, $matches);
        $info['cookie'] = substr($matches[1][0], 1);  // drop the space after the colon
        $info['content'] = $body;
        return $info;
    } else {
        return $data;
    }
}
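
The cookie handling at the end of curl_request keeps only the first cookie it finds. As a rough sketch, assuming a server might set more than one cookie, the header block could be turned into a value suitable for CURLOPT_COOKIE as follows. The helper name collect_cookies and the sample headers are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not in the original script): gather every Set-Cookie
// header from a raw header block and join them as "name=value; name2=value2".
function collect_cookies(string $header): string
{
    $pairs = [];
    // Capture each "name=value" up to the first ';' or the end of the line.
    if (preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $header, $matches)) {
        foreach ($matches[1] as $pair) {
            $pairs[] = trim($pair);
        }
    }
    return implode('; ', $pairs);
}

// Made-up response headers, only to show the shape of the input.
$sampleHeader = "HTTP/1.1 302 Found\r\n"
    . "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
    . "Set-Cookie: lang=zh; path=/\r\n"
    . "Content-Length: 0";

echo collect_cookies($sampleHeader), "\n";
```

The returned string can be passed as the $cookie argument of curl_request in later requests.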

3. Hidden fields on the educational system login page

For example, the login form carries hidden fields such as __VIEWSTATE. These also have to be submitted when logging in, so here are the functions that fetch them, along with the URLs of the blogger's school ... the "Royal University of Farming" (the work is mainly done with regular expressions).

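// Hidden field on the login page
function getView() {
    $url = 'http://jw.hzau.edu.cn/default2.aspx';
    $result = curl_request($url);
    // The original pattern was stripped from this page together with the HTML.
    // This one is an assumption: it captures the value of the hidden __VIEWSTATE input.
    $pattern = '/<input[^>]*name="__VIEWSTATE"[^>]*value="([^"]*)"/is';
    preg_match_all($pattern, $result, $matches);
    $res[0] = $matches[1][0];
    return $res[0];
}

// Returns the hidden value of the classroom query page
private function getViewJs($cookie, $xh) {
    $url = "http://jw.hzau.edu.cn/xxjsjy.aspx?xh={$xh}";
    $result = curl_request($url, '', $cookie);
    // Same assumed pattern as in getView()
    $pattern = '/<input[^>]*name="__VIEWSTATE"[^>]*value="([^"]*)"/is';
    preg_match_all($pattern, $result, $matches);
    $res[0] = $matches[1][0];
    return $res[0];
}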
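
To make the regular-expression step concrete, here is a minimal sketch that pulls one hidden field out of a page. It assumes the field is rendered as an input tag with the name attribute before the value attribute; the helper name hidden_field and the sample HTML are invented for illustration, and the real login page may be laid out differently.

```php
<?php
// Hypothetical helper: return the value of the named hidden input, or null
// when the page does not contain it.
function hidden_field(string $html, string $name): ?string
{
    // Assumes name="..." appears before value="..." inside the same tag.
    $pattern = '/<input[^>]*name="' . preg_quote($name, '/') . '"[^>]*value="([^"]*)"/i';
    if (preg_match($pattern, $html, $m)) {
        return html_entity_decode($m[1]);
    }
    return null;
}

// Made-up page fragment.
$sampleHtml = '<form><input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" /></form>';

var_dump(hidden_field($sampleHtml, '__VIEWSTATE'));
var_dump(hidden_field($sampleHtml, 'missing'));
```

Returning null for a missing field lets the caller tell "not logged in" apart from an empty value, which the functions above cannot do.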

4. Obtaining the cookie

function login($xh, $pwd) {
    $url = 'http://jw.hzau.edu.cn/default2.aspx';
    $post['__VIEWSTATE'] = $this->getView();
    $post['txtusername'] = $xh;    // student number
    $post['TextBox2'] = $pwd;      // password
    $post['txtsecretcode'] = '';   // verification code, left empty
    $post['lblanguage'] = '';
    $post['Hidpdrs'] = '';
    $post['hidsc'] = '';
    $post['RadioButtonList1'] = iconv('utf-8', 'gb2312', '学生');  // "student"
    $post['Button1'] = iconv('utf-8', 'gb2312', '登录');           // "log in"
    $result = curl_request($url, $post, '', 1);
    return $result['cookie'];
}
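
The iconv calls matter because the form is submitted in GB2312 rather than UTF-8. The sketch below shows what the encoded request body looks like. The helper name build_gb2312_body is hypothetical, the Chinese role label is an assumption about what the form expects, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: convert every field value from UTF-8 to GB2312 and
// build the URL-encoded request body.
function build_gb2312_body(array $fields): string
{
    $encoded = [];
    foreach ($fields as $name => $value) {
        $encoded[$name] = iconv('UTF-8', 'GB2312', (string) $value);
    }
    return http_build_query($encoded);
}

$body = build_gb2312_body([
    'txtsecretcode'    => '',      // verification code left empty, as in section 1
    'RadioButtonList1' => '学生',  // assumed label, "student"
]);

echo $body, "\n";
```

Comparing this output with the body captured from a real browser login is a quick way to confirm that the encoding is right.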

5. Now let us try the timetable query. The raw format is a bit messy, so to make it easier to work with I convert the timetable into a two-dimensional associative array.

// Returns the timetable as an array, or 0 if the login failed
private function classResult($xh, $pwd) {
    date_default_timezone_set("PRC");  // time zone setting
    $classList = "";                   // declare the timetable variable
    $cookie = $this->login($xh, $pwd);
    $view = $this->getViewJs($cookie, $xh);  // verify that the password is correct
    // If the password is correct
    if (!empty($view)) {
        $url = "http://jw.hzau.edu.cn/xskbcx.aspx?xh={$xh}";
        $result = curl_request($url, '', $cookie);  // use the saved cookie
        // The opening-tag part of both patterns was stripped from this page;
        // <table[^>]*> and <td[^>]*> are assumed here
        preg_match_all('/<table[^>]*>([\w\W]*?)<\/table>/', $result, $out);
        $table = $out[0][0];  // the whole timetable
        preg_match_all('/<td[^>]*>([\w\W]*?)<\/td>/', $table, $out);
        $td = $out[1];
        $length = count($td);
        // Keep only the cells that hold a course
        for ($i = 0; $i < $length; $i++) {
            $td[$i] = str_replace("<br>", "", $td[$i]);
            $reg = "/{(.*)}/";
            if (!preg_match_all($reg, $td[$i], $matches)) {
                unset($td[$i]);
            }
        }
        $td = array_values($td);  // re-index the list of courses
        $tdLength = count($td);
        for ($i = 0; $i < $tdLength; $i++) {
            $td[$i] = iconv('GB2312', 'UTF-8', $td[$i]);
        }
        // Convert the timetable into array form
        function convertToTable($table) {
            // Day labels as they appear in the timetable cells (Sunday to Saturday)
            $week = array("sun" => "周日", "mon" => "周一", "tues" => "周二", "wed" => "周三", "thur" => "周四", "fri" => "周五", "sat" => "周六");
            $order = array('1,2', '3,4', '5,6', '7,8', '9,10');
            // $list[day][periods] starts out as an empty string for every slot
            $list = array_fill_keys(array_keys($week), array_fill_keys($order, ''));
            foreach ($table as $class) {
                $weekArrayDay = null;
                $weekArrayOrder = null;
                foreach ($week as $key => $weekDay) {
                    if (strpos($class, $weekDay) !== false) {
                        $weekArrayDay = $key;  // first-dimension key in $list
                        foreach ($order as $orderClass) {
                            if (strpos($class, $orderClass) !== false) {
                                $weekArrayOrder = $orderClass;  // which periods the course occupies
                                break;
                            }
                        }
                        break;
                    }
                }
                if ($weekArrayDay !== null && $weekArrayOrder !== null) {
                    $list[$weekArrayDay][$weekArrayOrder] = $class;
                }
            }
            return $list;
        }
        // Call the function
        return convertToTable($td);
    } else {
        return 0;
    }
}
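
The conversion step can be tried without logging in. The sketch below runs the same day-and-period lookup over two made-up cells. The helper name locate_slot, the day labels and the cell texts are assumptions for illustration; real cells may be formatted differently.

```php
<?php
// Hypothetical helper: find the day key and the period slot named in a cell,
// or return null when the cell mentions no known day or slot.
function locate_slot(string $cell, array $week, array $order): ?array
{
    foreach ($week as $dayKey => $dayName) {
        if (strpos($cell, $dayName) === false) {
            continue;
        }
        foreach ($order as $periods) {
            if (strpos($cell, $periods) !== false) {
                return [$dayKey, $periods];
            }
        }
    }
    return null;
}

$week  = ['mon' => '周一', 'tues' => '周二'];  // assumed day labels (Monday, Tuesday)
$order = ['1,2', '3,4'];
// Every day gets one empty entry per period slot.
$list  = array_fill_keys(array_keys($week), array_fill_keys($order, ''));

// Made-up timetable cells.
$cells = ['高等数学 周一第3,4节{第1-16周}', '大学英语 周二第1,2节{第1-16周}'];
foreach ($cells as $cell) {
    $slot = locate_slot($cell, $week, $order);
    if ($slot !== null) {
        [$day, $periods] = $slot;
        $list[$day][$periods] = $cell;
    }
}

print_r($list);
```

Comparing strpos against false, rather than testing its result for truthiness, keeps a match at position 0 from being treated as "not found".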

6. Next, let us try the empty classroom query.

// Empty classroom query results
public function roomResult() {
    $xh = "";   // student number
    $pwd = "";  // password for that student number
    $cookie = $this->login($xh, $pwd);
    $url = "http://jw.hzau.edu.cn/xs_main.aspx?xh={$xh}";
    $result = curl_request($url, '', $cookie);  // use the saved cookie
    $url = "http://jw.hzau.edu.cn/xxjsjy.aspx?xh={$xh}";
    $post['Button2'] = iconv('utf-8', 'gb2312', '空教室查询');  // "empty classroom query"
    $post['__EVENTARGUMENT'] = '';
    $post['__EVENTTARGET'] = '';
    $post['__VIEWSTATE'] = $this->getViewJs($cookie, $xh);
    $post['ddldsz'] = iconv('utf-8', 'gb2312', '单');  // "single"
    $post['ddlsyxn'] = '2014-2015';  // school year
    $post['ddlsyxq'] = '1';
    $post['jslb'] = '';
    $post['xiaoq'] = '';
    $post['KSSJ'] = $_GET['start'];  // submitted start time of the query
    $post['SJD'] = $_GET['class'];   // submitted class periods
    $post['xn'] = '2014-2015';       // school year
    $post['Xq'] = '2';               // semester
    $post['Xqj'] = '6';              // day of the week
    $post['dpdatagrid1:txtpagesize'] = 90;  // number of rows shown per page
    $result = curl_request($url, $post, $cookie, 0);
    preg_match_all('/<span[^>]+>[^>]+span>/', $result, $out);
    $tip = iconv('gb2312', 'utf-8', $out[0][3]);  // prompt text at the top of the page
    // The opening-tag part of this pattern was stripped from this page; <table[^>]*> is assumed
    preg_match_all('/<table[^>]*>([\w\W]*?)<\/table>/', $result, $out);
    $table = iconv('gb2312', 'utf-8', $out[0][0]);  // list of query results
    $this->load->view("classroom", array('tip' => $tip, 'table' => $table));
}

That is the whole summary. Every school's educational system is different, so use Firebug in Firefox to capture the requests and see exactly what gets submitted.
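
Alongside a packet capture, it can help to list the fields a form actually contains, since those are the candidates for the POST array. The following is only a sketch using regular expressions. The helper name form_inputs and the sample form are invented, and attribute values are assumed to be double-quoted.

```php
<?php
// Hypothetical helper: map name => value for every named <input> in a page.
function form_inputs(string $html): array
{
    $fields = [];
    preg_match_all('/<input\b[^>]*>/i', $html, $tags);
    foreach ($tags[0] as $tag) {
        if (!preg_match('/\bname="([^"]*)"/i', $tag, $name)) {
            continue;  // inputs without a name are not submitted
        }
        // A missing value attribute is treated as an empty string.
        $value = preg_match('/\bvalue="([^"]*)"/i', $tag, $v) ? $v[1] : '';
        $fields[$name[1]] = $value;
    }
    return $fields;
}

// Made-up form, only to show the shape of the input.
$sampleForm = '<form>'
    . '<input type="hidden" name="__VIEWSTATE" value="abc" />'
    . '<input type="text" name="txtusername" />'
    . '<input type="submit" value="go" />'
    . '</form>';

print_r(form_inputs($sampleForm));
```

Select boxes and radio buttons are not covered by this sketch, so the captured request remains the reference for what the server expects.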

That is all for this article. I hope you like it.
