The difference between a normal user and a search engine spider is the user agent sent with the request.
Looking at the website's log files, you can see that Baidu's spider sends a user agent containing Baiduspider, while Google's contains Googlebot. We can therefore inspect the user agent to decide whether to deny access to ordinary users. The function is as follows:
function isAllowAccess($directForbidden = FALSE) {
    // Patterns identifying known search engine spiders by user agent
    $allowed = array('/baiduspider/i', '/googlebot/i');
    $user_agent = $_SERVER['HTTP_USER_AGENT'];
    $valid = FALSE;
    foreach ($allowed as $pattern) {
        if (preg_match($pattern, $user_agent)) {
            $valid = TRUE;
            break;
        }
    }
    if (!$valid && $directForbidden) {
        // Send a real 404 status code, not just the text in the body
        header('HTTP/1.1 404 Not Found');
        exit("404 Not Found");
    }
    return $valid;
}
Call this function at the top of any page you want to restrict, using either of the following two invocations:
The code is as follows:
if (!isAllowAccess()) {
    exit("404 Not Found");
}
Or:
isAllowAccess(TRUE);
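To sanity-check the patterns outside of a web request, here is a minimal standalone sketch. The helper function isSpider and the sample user-agent strings are hypothetical, added for illustration; the matching logic is the same preg_match loop as in the function above:

```php
<?php
// Same spider patterns as in isAllowAccess()
$allowed = array('/baiduspider/i', '/googlebot/i');

// isSpider: hypothetical helper that checks a user-agent string
// against the pattern list, without touching $_SERVER
function isSpider($user_agent, $allowed) {
    foreach ($allowed as $pattern) {
        if (preg_match($pattern, $user_agent)) {
            return TRUE;
        }
    }
    return FALSE;
}

// Hypothetical sample user agents: two spider strings, one desktop browser
$samples = array(
    'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0',
);

foreach ($samples as $ua) {
    echo $ua, ' => ', (isSpider($ua, $allowed) ? 'spider' : 'user'), "\n";
}
```

Note that the user agent is fully controlled by the client and can be spoofed, so this check is a convenience filter, not a security boundary.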