Recursively crawling web page links in PHP
This example shows how to implement a recursive crawler for the links on a web page in PHP.
The details are as follows:
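<?php
class crawler
{
    private $_depth = 5;        // maximum recursion depth
    private $_urls = array();   // URLs collected so far

    // $curr_depth is passed down explicitly so each recursion level knows its depth.
    function extract_links($url, $curr_depth = 0)
    {
        // Stop once the depth limit is reached.
        if ($curr_depth >= $this->_depth) {
            return $this->_urls;
        }
        $data = @file_get_contents($url);
        if ($data === false) {
            return $this->_urls;    // page could not be fetched
        }
        // Absolute http/https URLs. The first quantifier was lost in the source
        // listing; "+" (one or more) is assumed here.
        $pattern = '~https?://(?:www\.)*(?:[A-Za-z0-9_-]+\.+[A-Za-z0-9_]+)+[A-Za-z0-9_/.?&:%,!;-]*~';
        if (preg_match_all($pattern, $data, $urls12)) {
            foreach ($urls12[0] as $v) {
                // Keep only links that contain the current URL and are not yet collected.
                if (strpos($v, $url) === false || in_array($v, $this->_urls, true)) {
                    continue;
                }
                // Follow the link only if it answers with status 200.
                $check = @get_headers($v, 1);
                if ($check !== false && preg_match('~^HTTP/\S+ 200\b~', $check[0])) {
                    $this->_urls[] = $v;
                    $this->extract_links($v, $curr_depth + 1);
                }
            }
        }
        return $this->_urls;
    }
}
?>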
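The depth-limited recursion used by the crawler can be tried without network access. The following is a minimal sketch, not the original author's code: the function crawl_pages and the in-memory $site array are hypothetical stand-ins for file_get_contents, the example.com URLs are made up, and the link pattern is simplified for illustration.

```php
<?php
// Hypothetical in-memory "site": URL => page body. Stands in for file_get_contents().
$site = array(
    'http://example.com/'  => 'See http://example.com/a and http://example.com/b',
    'http://example.com/a' => 'Back to http://example.com/ or on to http://example.com/c',
    'http://example.com/b' => 'No links here.',
    'http://example.com/c' => 'Deep page: http://example.com/d',
    'http://example.com/d' => 'Not reached when the depth limit is 2.',
);

/**
 * Collect URLs reachable from $url, following links at most $maxDepth levels.
 * $found is passed by reference so every recursion level shares one list.
 */
function crawl_pages(array $site, $url, $maxDepth, $depth = 0, array &$found = array())
{
    // Stop at the depth limit or when the page does not exist.
    if ($depth >= $maxDepth || !isset($site[$url])) {
        return $found;
    }
    // Simplified link pattern (an assumption, not the article's regex).
    preg_match_all('~https?://[A-Za-z0-9_./-]+~', $site[$url], $m);
    foreach ($m[0] as $link) {
        // Record each link once, then descend into it one level deeper.
        if (!in_array($link, $found, true)) {
            $found[] = $link;
            crawl_pages($site, $link, $maxDepth, $depth + 1, $found);
        }
    }
    return $found;
}

$result = crawl_pages($site, 'http://example.com/', 2);
print_r($result);
```

With a depth limit of 2, the page http://example.com/c is recorded but never opened, so http://example.com/d does not appear in the result. Passing the depth as an argument, rather than keeping it in a local variable, is what lets the limit hold across recursive calls.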