PHP code for fetching Baidu snapshots, Baidu index counts, and Baidu buzzwords-jerrylsxu
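<?php
/*
 * Fetch the number of pages Baidu has indexed for a site
 */
function baidu($s) {
    $baidu = "http://www.baidu.com/s?wd=site%3A" . $s;
    $site = file_get_contents($baidu);
    // $site = iconv("gb2312", "UTF-8", $site);
    // ereg() was removed in PHP 7, so preg_match() is used instead.
    // The phrases below are English renderings of the Chinese text on Baidu's
    // result page; match against the original Chinese phrases on the live page.
    if (!preg_match("/find related webpage(.*),/", (string) $site, $count)) {
        return 0; // nothing matched (fetch failed or page layout changed)
    }
    $count = str_replace("find related webpage", "", $count);
    $count = str_replace("article,", "", $count);
    $count = str_replace("approx", "", $count);
    $count = str_replace(",", "", $count);
    return $count[0];
}
echo baidu("www.hzhuti.com"); // Get the number of pages Baidu has indexed
?>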
Get Baidu's buzzwords
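/**
 * @user Xiaojie
 * @return array Baidu's buzzword data
 */
function getBaiduHotKeyWord()
{
    $keyArray = array();
    $templateRss = @file_get_contents('http://top.baidu.com/rss_xml.php?p=top10');
    if ($templateRss === false) {
        return $keyArray; // fetch failed
    }
    // The markup inside this pattern was lost from the source; a <table>
    // element is assumed because the loop below walks tbody->tr.
    if (preg_match('/<table>(.*)<\/table>/is', $templateRss, $_description)) {
        $templateRss = $_description[0];
        // Escape bare ampersands so the fragment parses as XML.
        $templateRss = str_replace("&", "&amp;", $templateRss);
    }
    // The prepended string was also lost; an XML declaration is assumed,
    // and the GBK encoding is a guess.
    $templateRss = "<?xml version=\"1.0\" encoding=\"GBK\"?>\n" . $templateRss;
    $xml = @simplexml_load_string($templateRss);
    if ($xml === false) {
        return $keyArray; // fragment was not well-formed XML
    }
    foreach ($xml->tbody->tr as $temp) {
        // The child element name after td-> was lost; <a> is assumed.
        if (!empty($temp->td->a)) {
            $keyArray[] = trim((string) $temp->td->a);
        }
    }
    return $keyArray;
}
print_r(getBaiduHotKeyWord());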
This is a slightly modified version of code found on the internet. Put the following code in a PHP file.
Baidu index count and Baidu snapshot date
<?php
$domain = "http://www.hzhuti.com/nokia/5230/"; /* domain name to be queried */
$site_url = 'http://www.baidu.com/s?wd=site%3A';
$all = $site_url . $domain; /* all pages indexed for the domain */
$today = $all . '&lm=1'; /* pages of the domain indexed today */
/* English rendering of the Chinese phrase on Baidu's result page; the capture group is assumed, since index 1 of the match is read below */
$utf_pattern = "/find the number of related results(.*)/";
$kz_pattern = "/(.*)/"; /* matches the string holding the snapshot date; the markup around the group was lost from the source */
$times = "/\d{4}-\d{1,2}-\d{1,2}/"; /* regular expression for the snapshot date, for example 2011-8-4 */
$s0 = @file_get_contents($all); /* result page for all indexed pages */
$s1 = @file_get_contents($today); /* result page for pages indexed today */
preg_match($utf_pattern, (string) $s0, $all_num); /* match "find the number of related results" */
preg_match($utf_pattern, (string) $s1, $today_num);
preg_match($kz_pattern, (string) $s0, $temp);
preg_match($times, isset($temp[0]) ? $temp[0] : "", $screenshot);
/* empty() also covers a missing index, and == replaces the accidental assignment = */
if (empty($all_num[1]))
    $all_num[1] = 0;
if (empty($today_num[1]))
    $today_num[1] = 0;
if (empty($screenshot[0]))
    $screenshot[0] = "no snapshot";
?>
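<title>Test</title>
<table border="1">
<tr><th>Date</th><th>Baidu</th><th>Baidu recorded today</th><th>Baidu snapshot date</th></tr>
<tr>
<td><?php echo date('Y-m-d'); ?></td>
<td><?php echo $all_num[1]; ?></td>
<td><?php echo $today_num[1]; ?></td>
<td><?php echo $screenshot[0]; ?></td>
</tr>
</table>
<p>Baidu: <a href="<?php echo $all; ?>" target="_blank"><?php echo $all_num[1]; ?></a></p>
<p>Baidu today: <a href="<?php echo $today; ?>" target="_blank"><?php echo $today_num[1]; ?></a></p>
<p>Baidu snapshot date: <?php echo $screenshot[0]; ?></p>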
The approach above is not robust. If the server does not let file_get_contents open URLs (for example, when allow_url_fopen is disabled), it will not work. In that case you can use cURL instead, which also makes it easier to imitate a real browser.
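As a rough illustration of the cURL alternative, the sketch below wraps a fetch in a small helper. The function names `fetch_url` and `build_site_query`, the timeout, and the User-Agent string are hypothetical choices for this example and are not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL with cURL.
// Returns the response body as a string, or false on failure.
function fetch_url($url, $timeout = 10)
{
    if (!function_exists('curl_init')) {
        return false; // the cURL extension is not installed
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => $timeout, // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout, // seconds allowed for the whole request
        // Assumed User-Agent value; any browser-like string can be used.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; example-fetcher)',
    ));
    $body = curl_exec($ch);
    $ok = ($body !== false) && (curl_errno($ch) === 0);
    // The handle is released automatically when $ch goes out of scope.
    return $ok ? $body : false;
}

// Hypothetical helper: build the same site: query URL used in the code above.
function build_site_query($domain)
{
    return 'http://www.baidu.com/s?wd=' . urlencode('site:' . $domain);
}
```

With these helpers, a call such as `$s0 = fetch_url(build_site_query('www.hzhuti.com'));` could stand in for the `file_get_contents` call, and the rest of the matching code would stay the same.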