PHP code for fetching Baidu snapshots, Baidu indexed-page counts, and Baidu hot keywords
<?php
/*
Fetch the number of pages Baidu has indexed for a site
*/
function baidu($s) {
    $baidu = "http://www.baidu.com/s?wd=site%3A" . $s;
    $site = file_get_contents($baidu);
    $site = iconv("gb2312", "UTF-8", $site);
    // ereg() was removed in PHP 7; preg_match() is the replacement.
    // The literals below must match the text on Baidu's result page.
    preg_match("/Find related pages (.*),/", $site, $count);
    $count = str_replace("Find related pages", "", $count);
    $count = str_replace("article,", "", $count);
    $count = str_replace("About", "", $count);
    $count = str_replace(",", "", $count);
    return $count[0];
}
echo baidu("www.hzhuti.com"); // Print the number of pages Baidu has indexed for www.hzhuti.com
?>
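The counting step above can be kept separate from the download, which makes it easier to check against a saved copy of the page. The following is only a minimal sketch: the function name parseResultCount and the $label parameter are hypothetical, and the label text has to be whatever actually precedes the number on the result page.

```php
<?php
/**
 * Return the first number that follows $label in $html, or null if none is found.
 * parseResultCount and $label are illustrative names, not part of the original code.
 */
function parseResultCount(string $html, string $label): ?int
{
    // Skip any non-digit text after the label, then capture digits and thousands separators.
    $pattern = '/' . preg_quote($label, '/') . '\D*?(\d[\d,]*)/';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    // "1,230" becomes 1230.
    return (int) str_replace(',', '', $m[1]);
}
```

Returning null when nothing matches lets the caller tell "no result text found" apart from a genuine count of 0.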
Get Baidu's hot keywords
if (preg_match('/
$TEMPLATERSS = "
print_r(getbaiduhotkeyword());
The following code was found online and slightly modified. Save it to a PHP file.
Baidu indexed-page count and Baidu snapshot date
<?php
$domain = "http://www.hzhuti.com/nokia/5230/"; /* Domain name to query */
$site_url = 'http://www.baidu.com/s?wd=site%3A';
$all = $site_url . $domain; /* URL listing all indexed pages for the domain */
$today = $all . '&lm=1'; /* URL listing the pages indexed today */
$utf_pattern = "/Find the relevant result number (.*)/"; /* Must match the result-count text on Baidu's page */
$kz_pattern = "/(.*)/"; /* String that contains the snapshot date */
$times = '/\d{4}-\d{1,2}-\d{1,2}/'; /* Regular expression that matches the snapshot date, such as 2011-8-4 */
$s0 = @file_get_contents($all); /* Load the site: query result page into $s0 */
$s1 = @file_get_contents($today);
preg_match($utf_pattern, $s0, $all_num); /* Match the result-count text */
preg_match($utf_pattern, $s1, $today_num);
preg_match($kz_pattern, $s0, $temp);
preg_match($times, $temp[0], $screenshot);
if (empty($all_num[1]))
    $all_num[1] = 0;
if (empty($today_num[1]))
    $today_num[1] = 0;
if (empty($screenshot[0]))
    $screenshot[0] = "no snapshot";
?>
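The URL construction and the snapshot-date match in the script above can be sketched as two small functions. This is a hedged illustration only: the names buildSiteQueryUrls and extractSnapshotDate are hypothetical, and the lm=1 parameter is taken from the script as-is without further verification.

```php
<?php
/**
 * Build the "all indexed pages" and "indexed today" query URLs for a domain.
 * Function name and array keys are illustrative assumptions.
 */
function buildSiteQueryUrls(string $domain): array
{
    // urlencode() turns "site:" into "site%3A", matching the hand-written URL in the script.
    $all = 'http://www.baidu.com/s?wd=' . urlencode('site:' . $domain);
    return ['all' => $all, 'today' => $all . '&lm=1'];
}

/**
 * Return the first date of the form 2011-8-4 found in $html, or "no snapshot".
 */
function extractSnapshotDate(string $html): string
{
    // The backslashes are required: \d matches a digit, a bare "d" matches the letter d.
    if (preg_match('/\d{4}-\d{1,2}-\d{1,2}/', $html, $m)) {
        return $m[0];
    }
    return 'no snapshot';
}
```

Encoding the domain with urlencode() also keeps the query valid when the value contains characters such as slashes.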
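The rest of the file is an HTML page titled "Test" that shows the results in a table with these columns:
Date | Baidu included | Baidu today included | Baidu Snapshot Date
Below the table are three labeled entries. "Baidu included" and "Baidu today included" are links that open in a new window (target="_blank"), and "Baidu Snapshot Date" shows the matched date.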
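A results table with the four columns listed above could be rendered along these lines. This is a sketch under assumptions, not the original markup: the function name renderReport and the exact table structure are hypothetical.

```php
<?php
/**
 * Render one result row as an HTML table.
 * $row holds, in order: date, indexed count, indexed-today count, snapshot date.
 * renderReport is an illustrative name, not taken from the original file.
 */
function renderReport(array $row): string
{
    $headers = ['Date', 'Baidu included', 'Baidu today included', 'Baidu Snapshot Date'];
    // Escape every value so scraped text cannot inject markup into the page.
    $escape = function ($value) {
        return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    };
    $head = '<th>' . implode('</th><th>', array_map($escape, $headers)) . '</th>';
    $body = '<td>' . implode('</td><td>', array_map($escape, $row)) . '</td>';
    return "<table>\n<tr>$head</tr>\n<tr>$body</tr>\n</table>";
}
```

With the variables from the script, a call might look like renderReport([date('Y-m-d'), $all_num[1], $today_num[1], $screenshot[0]]).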
The method above is not very robust: if the server does not support file_get_contents() for remote URLs, it fails. In that case you can use cURL instead, which also makes it easier to imitate a real user's request.
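A cURL-based fetch could look roughly like this. It is a minimal sketch, not a definitive implementation: the function name fetchPage, the 10-second timeout, and the User-Agent string are all assumptions chosen for illustration. It prefers cURL and only falls back to file_get_contents() when the cURL extension is not loaded.

```php
<?php
/**
 * Download a URL and return its body, or null on failure.
 * fetchPage, the timeout, and the User-Agent value are illustrative assumptions.
 */
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    if (!function_exists('curl_init')) {
        // cURL extension missing: fall back to the approach used earlier in the article.
        $body = @file_get_contents($url);
        return $body === false ? null : $body;
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

The result of fetchPage($all) can then be passed to the same preg_match() calls in place of the string returned by file_get_contents().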