Crawling a Baidu search results page by keyword with PHP's cURL
What I want to do is fetch the Baidu results page for a given keyword each time. In the browser, the search results look like this, including the Baidu promotion content:
But what I get when I crawl the page with cURL is this:
In other words, none of my crawls ever picks up the Baidu promotion content. Could someone more experienced point me in the right direction? I am just getting started, so any guidance is appreciated. Thanks in advance.
The PHP fetching code is as follows:
<?php
// Search URL; the keyword "life force" is URL-encoded (space becomes "+")
$url = "http://www.baidu.com/s?wd=life+force";
// Build the headers to simulate a browser request
$header = array(
    "Host: www.baidu.com",
    "Content-Type: application/x-www-form-urlencoded", // POST request
    "Connection: keep-alive",
    "Referer: http://www.baidu.com",
    "User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; BIDUBrowser 2.6)"
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Execute the request
$content = curl_exec($ch);
if ($content === false) {
    echo "Error: " . curl_error($ch);
}
// Close the handle
curl_close($ch);
// Output the result
echo $content;
?>
------Solution----------------------
Your User-Agent header does not imitate a real browser well enough, so the promotion content is not returned.
In fact, there is no need for POST at all; a plain GET request is enough.
Modify the code as follows:
<?php
// Search URL; the keyword "life force" is URL-encoded (space becomes "+")
$url = "http://www.baidu.com/s?wd=life+force";
// Only a realistic browser User-Agent is needed for a plain GET request
$header = array(
    'User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36'
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Execute the request
$content = curl_exec($ch);
if ($content === false) {
    echo "Error: " . curl_error($ch);
}
// Close the handle
curl_close($ch);
// Output the result
echo $content;
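Since the goal is to crawl by an arbitrary keyword, the query string should be percent-encoded rather than written by hand, especially for keywords with spaces or non-ASCII characters. The following is only a minimal sketch: the function name build_search_url is hypothetical and not part of the original code, and it assumes the same http://www.baidu.com/s?wd=... URL shape used above.

```php
<?php
// Hypothetical helper (not from the original post): builds the search URL
// for a keyword, letting PHP handle the percent-encoding.
function build_search_url(string $keyword): string
{
    // http_build_query() encodes the value; with its default encoding type
    // spaces become "+" and reserved characters such as "&" become "%26".
    return 'http://www.baidu.com/s?' . http_build_query(array('wd' => $keyword));
}

echo build_search_url('life force'), "\n"; // prints http://www.baidu.com/s?wd=life+force
```

The returned string can be passed to CURLOPT_URL in place of the hard-coded $url.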