PHP code for capturing Baidu snapshots, Baidu indexed page counts, and Baidu hot keywords. If you take a closer look at the programs below, which capture Baidu's indexed page counts, snapshots, or hot words, you will notice one thing they have in common: they all call file_get_contents(), the function commonly used in PHP to fetch web pages.
The code is as follows:
<?php
/* Get the number of pages Baidu has indexed for a site */
function baidu($s) {
    $baidu = "http://www.baidu.com/s?wd=site%3A" . $s;
    $site = file_get_contents($baidu);
    // $site = iconv("gb2312", "UTF-8", $site);
    // ereg() was removed in PHP 7, so preg_match() is used here.
    // The string literals stand for the text shown on Baidu's result page.
    preg_match("/find related webpage(.*),/", $site, $count);
    $count = str_replace("find related webpage", "", $count);
    $count = str_replace("article,", "", $count);
    $count = str_replace("approx", "", $count);
    $count = str_replace(",", "", $count);
    return isset($count[0]) ? $count[0] : 0;
}
echo baidu("www.hzhuti.com"); // number of pages Baidu has indexed for the site
?>
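The function above downloads and parses in one step, which makes it hard to check the parsing on its own. A minimal sketch of the parsing half is shown here, assuming the HTML has already been fetched; the helper name parse_indexed_count and the $label parameter are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: pull the indexed-page count out of already-downloaded HTML.
// $label is whatever text precedes the number on the result page (an assumption).
function parse_indexed_count($html, $label)
{
    // Skip any non-digit filler after the label, then capture digits and commas.
    $pattern = '/' . preg_quote($label, '/') . '\D*?(\d[\d,]*)/u';
    if (!preg_match($pattern, $html, $m)) {
        return 0; // label not found, treat as "nothing indexed"
    }
    // Remove thousands separators before converting to an integer.
    return (int) str_replace(',', '', $m[1]);
}
```

It could be called as parse_indexed_count($site, "find related webpage") on the string returned by file_get_contents().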
Get Baidu's buzzwords
The code is as follows:
<?php
/**
 * @user Xiaojie
 * @return array Baidu's buzzword data
 */
function getBaiduHotKeyWord()
{
    $templateRss = file_get_contents('http://top.baidu.com/rss_xml.php?p=top10');
    // The pattern and the XML declaration below were lost in the source text.
    // A <table> match is assumed because the loop walks tbody->tr.
    if (preg_match('/<table>(.*)<\/table>/is', $templateRss, $_description)) {
        $templateRss = $_description[0];
        $templateRss = str_replace("&", "&amp;", $templateRss);
    }
    $templateRss = "<?xml version=\"1.0\" encoding=\"GBK\"?>" . $templateRss; // assumed declaration
    $xml = simplexml_load_string($templateRss);
    $keyArray = array();
    foreach ($xml->tbody->tr as $temp) {
        // The property after td-> was lost in the source; a link element (a) is assumed.
        if (!empty($temp->td->a)) {
            $keyArray[] = trim($temp->td->a);
        }
    }
    return $keyArray;
}
print_r(getBaiduHotKeyWord());
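To illustrate how simplexml walks a table once it has been extracted, here is a small sketch that works on a string rather than a live download. The helper name extract_keywords and the table layout (tbody, tr, td, a) are assumptions made for the example, not confirmed details of Baidu's page.

```php
<?php
// Hypothetical helper: collect the link text from the first cell of each table row.
// The table layout (tbody > tr > td > a) is an assumption made for this sketch.
function extract_keywords($tableHtml)
{
    // Escape bare ampersands so the fragment can be parsed as XML.
    $tableHtml = preg_replace('/&(?![a-zA-Z#][a-zA-Z0-9]*;)/', '&amp;', $tableHtml);
    $xml = @simplexml_load_string($tableHtml);
    if ($xml === false || !isset($xml->tbody)) {
        return array(); // not well-formed, or no table body to walk
    }
    $keywords = array();
    foreach ($xml->tbody->tr as $row) {
        if (isset($row->td->a)) {
            $text = trim((string) $row->td->a);
            if ($text !== '') {
                $keywords[] = $text;
            }
        }
    }
    return $keywords;
}
```

Rows without a link in the first cell are skipped, and malformed input yields an empty array instead of a fatal error.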
The following is a slightly modified version of code found on the internet. Save it in a PHP file.
Baidu indexed page count and Baidu snapshot date
The code is as follows:
<?php
$domain = "http://www.hzhuti.com/nokia/5230/"; /* domain name to be queried */
$site_url = 'http://www.baidu.com/s?wd=site%3A';
$all = $site_url . $domain; /* all URLs indexed for the domain */
$today = $all . '&lm=1'; /* URLs indexed today for the domain */
/* The literal stands for the text on Baidu's result page. The capture group lost in the source is restored so that [1] holds the count. */
$utf_pattern = "/find the number of related results\s*([\d,]+)/";
$kz_pattern = "/(.*)/"; /* string used to match the snapshot date; the HTML tags that surrounded (.*) were lost in the source */
$times = "/\d{4}-\d{1,2}-\d{1,2}/"; /* regular expression matching the snapshot date, for example 2011-8-4 */
$s0 = @file_get_contents($all); /* put the site: result page into the $s0 string */
$s1 = @file_get_contents($today);
preg_match($utf_pattern, $s0, $all_num); /* match "find the number of related results" */
preg_match($utf_pattern, $s1, $today_num);
preg_match($kz_pattern, $s0, $temp);
preg_match($times, isset($temp[0]) ? $temp[0] : "", $screenshot);
if (empty($all_num[1])) $all_num[1] = 0;
if (empty($today_num[1])) $today_num[1] = 0;
if (empty($screenshot[0])) $screenshot[0] = "no snapshot";
?>
The rest of the file is an HTML page titled "Test" that prints the results in a table with four columns: Date, Baidu indexed pages, Baidu pages indexed today, and Baidu snapshot date. The Baidu values are links that open in a new window (target="_blank").
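The snapshot date is picked out with a plain date pattern, which can be tried on any text fragment. The sketch below assumes the fragment has already been cut out of the result page; the helper name extract_snapshot_date is hypothetical.

```php
<?php
// Hypothetical helper: return the first date such as 2011-8-4 found in a fragment,
// or the same "no snapshot" fallback the script uses when nothing matches.
function extract_snapshot_date($fragment)
{
    if (preg_match('/\d{4}-\d{1,2}-\d{1,2}/', $fragment, $m)) {
        return $m[0];
    }
    return "no snapshot";
}
```

Because \d{1,2} accepts one or two digits, both 2011-8-4 and 2011-08-04 are matched.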
The methods above are not very robust. If the server does not allow file_get_contents() to open remote URLs (for example, when allow_url_fopen is disabled), they will not work at all. In that case you can use curl instead, which also makes it easier to imitate a real user.
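As a rough illustration of the curl alternative, the sketch below wraps a request in a small function. The function name fetch_with_curl, the timeout value, and the User-Agent string are all assumptions chosen for the example, and the curl extension must be enabled for it to run.

```php
<?php
// Hypothetical replacement for file_get_contents() on remote URLs.
// Returns the response body, or an empty string if the request fails.
function fetch_with_curl($url, $timeout = 10)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,  // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout,  // seconds allowed for the whole request
        // Placeholder browser-like User-Agent; an assumption, adjust as needed.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ));
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}
```

In the scripts above, a call such as $site = fetch_with_curl($baidu); could then stand in for $site = file_get_contents($baidu);.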