PHP code for fetching Baidu snapshot dates, Baidu indexed-page counts, and Baidu hot keywords

Source: Internet
Author: User
If you look closely, you will notice that the programs below, whether they fetch Baidu's indexed-page count, the snapshot date, or the hot keywords, all rely on one function: file_get_contents(), the function PHP scripts most commonly use to fetch web pages.
The code is as follows:


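<?php
/*
 Fetch the number of pages Baidu has indexed for a site
*/
function baidu($s) {
    $baidu = "http://www.baidu.com/s?wd=site%3A" . urlencode($s);
    $site = file_get_contents($baidu);
    if ($site === false) {
        return 0; // the request failed
    }
    $site = iconv("gb2312", "UTF-8", $site);
    // ereg() was removed in PHP 7, so preg_match() is used instead.
    // The marker strings below must match the text Baidu prints on its results page.
    if (!preg_match("/Find related pages (.*),/", $site, $count)) {
        return 0;
    }
    $count = str_replace("Find related pages", "", $count);
    $count = str_replace("article,", "", $count);
    $count = str_replace("About", "", $count);
    $count = str_replace(",", "", $count);
    return trim($count[0]);
}

echo baidu('www.hzhuti.com'); // Get the number of pages Baidu has indexed for www.hzhuti.com

?>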
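
The chain of str_replace() calls above removes the marker text and the thousands separators around the matched number. As a minimal sketch of the same clean-up in one step, the helper below pulls the first number out of a phrase. The name parse_count and the sample phrase are hypothetical and not part of the original code; the wording on the real results page may differ.

```php
<?php
// Hypothetical helper (not from the original article): return the first
// number found in $text as an integer, ignoring thousands separators.
function parse_count($text)
{
    // Match a digit followed by any run of digits and commas, e.g. "12,300".
    if (preg_match('/\d[\d,]*/', $text, $m)) {
        return (int) str_replace(',', '', $m[0]);
    }
    return 0; // no number present
}

// Sample phrase invented for illustration only.
echo parse_count('About 12,300 results'), "\n"; // prints 12300
```

Returning an integer rather than a cleaned string also makes the 0 fallback consistent with the normal return value.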

Get Baidu's hot keywords

The code is as follows:

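<?php
/**
 * @user Little Jay
 * @return array the hot keywords from Baidu
 */
function getBaiduHotKeyword()
{
    $keyArray = array();
    $templateRss = file_get_contents('http://top.baidu.com/rss_xml.php?p=top10');
    // Keep only the table markup, which the loop below walks via tbody/tr/td/a
    if (preg_match('/<table>(.*)<\/table>/is', $templateRss, $_description)) {
        $templateRss = $_description[0];
        // Escape bare ampersands so the fragment is valid XML
        $templateRss = str_replace("&", "&amp;", $templateRss);
    }
    // Prepend an XML declaration so SimpleXML can decode the text (the GBK encoding is an assumption)
    $templateRss = '<?xml version="1.0" encoding="GBK"?>' . $templateRss;
    $xml = simplexml_load_string($templateRss);
    if ($xml === false) {
        return $keyArray; // the feed could not be parsed
    }
    foreach ($xml->tbody->tr as $temp) {
        if (!empty($temp->td->a)) {
            $keyArray[] = trim((string) $temp->td->a);
        }
    }
    return $keyArray;
}
print_r(getBaiduHotKeyword());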
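
The loop above walks tbody, tr, td and a to collect the link text of each row. To show that traversal without a network request, here is a small sketch that runs the same walk over a hand-written table. The function name extract_keywords and the sample markup are invented for illustration; the real feed may be laid out differently.

```php
<?php
// Hypothetical helper: collect the link text of every row in a
// <table><tbody><tr><td><a>...</a></td></tr></tbody></table> fragment.
function extract_keywords($tableXml)
{
    $keywords = array();
    $xml = simplexml_load_string($tableXml);
    if ($xml === false || !isset($xml->tbody)) {
        return $keywords; // not parseable, or no table body
    }
    foreach ($xml->tbody->tr as $row) {
        // Rows without a link in their first cell are skipped.
        if (isset($row->td->a)) {
            $keywords[] = trim((string) $row->td->a);
        }
    }
    return $keywords;
}

// Sample markup made up for this example.
$sample = '<table><tbody>'
        . '<tr><td><a href="#">first keyword</a></td></tr>'
        . '<tr><td>no link here</td></tr>'
        . '<tr><td><a href="#"> second keyword </a></td></tr>'
        . '</tbody></table>';

print_r(extract_keywords($sample));
```

The cast to string matters: without it the array would hold SimpleXMLElement objects rather than plain text.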


The next script was found online and slightly modified. Save the following code to a PHP file.
It reports Baidu's indexed-page counts and the Baidu snapshot date.

The code is as follows:

<?php
$domain = "http://www.hzhuti.com/nokia/5230/"; /* Domain name to query */
$site_url = 'http://www.baidu.com/s?wd=site%3A';
$all = $site_url . $domain; /* URL listing all indexed pages for the domain */
$today = $all . '&lm=1'; /* URL listing the pages indexed today */
/* The marker text below must match the wording on Baidu's results page */
$utf_pattern = "/Find the relevant result number (.*)/";
$kz_pattern = "/(.*)/"; /* Pattern for the string that contains the snapshot date */
$times = "/\d{4}-\d{1,2}-\d{1,2}/"; /* Regular expression that matches the snapshot date, such as 2011-8-4 */
$s0 = @file_get_contents($all); /* Load the results page of the site: query into $s0 */
$s1 = @file_get_contents($today);
preg_match($utf_pattern, (string) $s0, $all_num); /* Match "find related results" */
preg_match($utf_pattern, (string) $s1, $today_num);
preg_match($kz_pattern, (string) $s0, $temp);
preg_match($times, isset($temp[0]) ? $temp[0] : '', $screenshot);
if (empty($all_num[1]))
    $all_num[1] = 0;
if (empty($today_num[1]))
    $today_num[1] = 0;
if (empty($screenshot[0]))
    $screenshot[0] = "no snapshot";
?>
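<html>
<head>
<title>Test</title>
</head>
<body>
<table>
<tr><th>Date</th><th>Baidu included</th><th>Baidu today included</th><th>Baidu Snapshot Date</th></tr>
<tr>
<td><?php echo date('Y-m-d'); /* the date format is an assumption */ ?></td>
<td><a href="<?php echo htmlspecialchars($all); ?>" title="Baidu included: <?php echo $all_num[1]; ?>" target="_blank"><?php echo $all_num[1]; ?></a></td>
<td><a href="<?php echo htmlspecialchars($today); ?>" title="Baidu today included: <?php echo $today_num[1]; ?>" target="_blank"><?php echo $today_num[1]; ?></a></td>
<td><a href="<?php echo htmlspecialchars($all); ?>" title="Baidu Snapshot Date: <?php echo $screenshot[0]; ?>"><?php echo $screenshot[0]; ?></a></td>
</tr>
</table>
</body>
</html>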
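
To illustrate the snapshot-date step in isolation, here is a minimal sketch that applies the date pattern to a sample string. The function name extract_snapshot_date and the sample text are hypothetical; the markup around the date on the real results page is not shown in this article.

```php
<?php
// Hypothetical helper, not part of the original script: return the first
// YYYY-M-D style date found in $text, or "no snapshot" when there is none.
function extract_snapshot_date($text)
{
    // \d needs its backslash; "/d{4}-d{1,2}-d{1,2}/" would match the letter d.
    if (preg_match('/\d{4}-\d{1,2}-\d{1,2}/', $text, $m)) {
        return $m[0];
    }
    return 'no snapshot';
}

// Sample inputs made up for illustration only.
echo extract_snapshot_date('www.example.com/ 2011-8-4 snapshot'), "\n"; // prints 2011-8-4
echo extract_snapshot_date('no date here'), "\n";                       // prints no snapshot
```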



The methods above are not robust: if the server does not allow file_get_contents() to open URLs, none of them will work. In that case cURL can be used instead, which also makes it easier to imitate a real user.
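
As a rough sketch of that idea, the helper below fetches a page with cURL when the extension is loaded and falls back to file_get_contents() otherwise. The names fetch_page and baidu_site_url, the timeout, and the User-Agent string are assumptions made for this example and do not come from the original code.

```php
<?php
// Hypothetical helper: build the site: query URL used throughout this article.
function baidu_site_url($domain)
{
    return 'http://www.baidu.com/s?wd=site%3A' . urlencode($domain);
}

// Hypothetical helper: return the body of $url as a string, or false on failure.
function fetch_page($url, $timeout = 10)
{
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true,      // follow redirects
            CURLOPT_CONNECTTIMEOUT => $timeout,
            CURLOPT_TIMEOUT        => $timeout,
            // Example User-Agent; any browser-like string could be used here.
            CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
        ));
        return curl_exec($ch);                   // string on success, false on failure
    }
    if (ini_get('allow_url_fopen')) {
        return @file_get_contents($url);
    }
    return false;                                // no way to fetch remote pages
}

// Usage (performs a network request, so it is left commented out):
// $html = fetch_page(baidu_site_url('www.hzhuti.com'));
```

Any of the file_get_contents() calls in the scripts above could be swapped for fetch_page() without changing the parsing code, since both return the page body as a string or false.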
