I plan to write a crawler in PHP. This first piece is a program that crawls the images on a specified page. The other parts are not fully debugged yet, so I am posting this part first.
<?php
$string = file_get_contents("http://www.baidu.com");
if ($string === false) {
    exit('Could not fetch the page');
}
echo 'Size: ' . strlen($string) . "<br>";
$length = strlen($string);
searchimg($string, $length);

// Scan the page for every occurrence of "src" and save the image it points to.
function searchimg($string, $length) {
    for ($i = 0; $i < $length - 2; $i++) {
        if (($string[$i] == 's') && ($string[$i+1] == 'r') && ($string[$i+2] == 'c')) {
            $index = $i;

            $scr = searchscr($index, $length, $string); // format: "http://.......***"
            $type = judgetype($scr);

            if ($type != "error") {
                echo 'Location: ' . $index . '<br>';
                echo 'Source: ' . $scr . '<br>';
                echo 'Type: ' . $type . "<br>";
                $filename = 'pic/' . $index . '.' . $type; // the pic/ directory must already exist

                $scrString = file_get_contents($scr);
                if ($scrString === false) {
                    continue; // skip images that cannot be downloaded
                }
                $handle = fopen($filename, "a");
                fwrite($handle, $scrString);
                fclose($handle);
            }
        }
    }
}

// Guess the image type from the last two characters of the URL.
function judgetype($scr) {
    $length = strlen($scr);
    if ($length < 2) {
        return "error";
    }

    if ((($scr[$length-1] == 'f' || $scr[$length-1] == 'F')) && (($scr[$length-2] == 'i') || ($scr[$length-2] == 'I'))) {
        return "gif";
    }
    else if ((($scr[$length-1] == 'g' || $scr[$length-1] == 'G')) && (($scr[$length-2] == 'p') || ($scr[$length-2] == 'P'))) {
        return "jpg";
    }
    else if ((($scr[$length-1] == 'g' || $scr[$length-1] == 'G')) && (($scr[$length-2] == 'n') || ($scr[$length-2] == 'N'))) {
        return "png";
    }
    else if ((($scr[$length-1] == 'g' || $scr[$length-1] == 'G')) && (($scr[$length-2] == 'e') || ($scr[$length-2] == 'E'))) {
        return "jpeg";
    }
    else {
        return "error";
    }
}

// Read the URL that follows src=" starting at $index, up to the closing quote.
function searchscr($index, $length, $string) {
    if ($index + 5 >= $length) {
        return '';
    }
    // $index + 5 skips the five characters of src=" and lands on the URL itself.
    if ($string[$index+5] === "h") {
        $scr = '';
    }
    else {
        $scr = 'http:'; // the URL does not start with "h", so prepend the scheme
    }

    for ($i = $index + 5; $i < $length; $i++) {
        if ($string[$i] === '"') {
            break;
        }
        else {
            $scr = $scr . $string[$i];
        }
    }
    return $scr;
}
?>
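As a possible alternative to scanning character by character, the src values could be pulled out with a regular expression that only looks inside img tags. This is just a rough sketch and not part of the program above: the function name extract_img_srcs and the sample HTML are made up for illustration, and a regular expression is not a full HTML parser, so unusual markup may be missed.

```php
<?php
// Hypothetical helper (not part of the original program): return the src
// value of every <img> tag found in $html, using a regular expression.
function extract_img_srcs(string $html): array
{
    $srcs = [];
    // Capture the quoted value of src inside an <img ...> tag.
    $pattern = '/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/is';
    if (preg_match_all($pattern, $html, $matches)) {
        foreach ($matches[2] as $src) {
            // Protocol-relative URLs (//host/path) get an assumed http: prefix,
            // similar to what searchscr does.
            if (strpos($src, '//') === 0) {
                $src = 'http:' . $src;
            }
            $srcs[] = $src;
        }
    }
    return $srcs;
}

// Made-up sample markup for illustration.
$html = '<p><img src="//example.com/img/logo.png" alt=""></p>'
      . "<script src='app.js'></script>"
      . "<IMG class='x' SRC='http://example.com/a.gif'>";
print_r(extract_img_srcs($html));
```

Because the pattern requires an img tag, the script src in the sample is skipped, so the type check would only have to deal with image candidates.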
Now for the main shortcomings: dynamically generated images are not captured, and images referenced from CSS are not captured either. These are the next things to improve, and the crawler will keep being refined. String handling in PHP is still pretty tiring ...
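For the CSS shortcoming, one possible starting point is to collect the url(...) references from stylesheet text. The following is only a sketch under my own assumptions, not something the current program does: the name extract_css_image_urls and the sample stylesheet are invented for illustration, and it neither resolves relative paths nor follows @import rules.

```php
<?php
// Hypothetical helper (not in the original crawler): collect image URLs
// referenced through url(...) in a block of CSS text.
function extract_css_image_urls(string $css): array
{
    $urls = [];
    // Match url(...), with optional single or double quotes around the target.
    $pattern = '/url\(\s*[\'"]?([^\'")\s]+)[\'"]?\s*\)/i';
    if (preg_match_all($pattern, $css, $matches)) {
        foreach ($matches[1] as $url) {
            // Keep only targets whose path ends in a known image extension.
            $path = parse_url($url, PHP_URL_PATH);
            $ext = strtolower(pathinfo((string) $path, PATHINFO_EXTENSION));
            if (in_array($ext, ['gif', 'jpg', 'jpeg', 'png'], true)) {
                $urls[] = $url;
            }
        }
    }
    // Drop duplicates and reindex the array from 0.
    return array_values(array_unique($urls));
}

// Made-up sample stylesheet for illustration.
$css = '.logo { background: url("http://example.com/img/logo.png") no-repeat; }
.icon { background-image: url(/static/icon.gif); }
.font { src: url(font.woff); }';

print_r(extract_css_image_urls($css));
```

The extension filter is there because url(...) is also used for fonts and other files. Relative results such as /static/icon.gif would still need to be joined with the page address before downloading.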
Writing a crawler in PHP: fetching the images on a specified web page