The difference between an ordinary user and a search-engine spider lies in the User-Agent header each one sends. Looking at the website's log file, we can see that Baidu's spider identifies itself with "baiduspider", while Google's contains "googlebot". We can therefore decide whether to deny normal users access by inspecting the User-Agent that was sent. The function is written as follows:
function isAllowAccess($directForbidden = false) {
    // Patterns that identify search-engine spiders in the User-Agent header
    $allowed = array('/baiduspider/i', '/googlebot/i');
    $user_agent = $_SERVER['HTTP_USER_AGENT'];
    $valid = false;
    foreach ($allowed as $pattern) {
        if (preg_match($pattern, $user_agent)) {
            $valid = true;
            break;
        }
    }
    if (!$valid && $directForbidden) {
        // Send a real 404 status, not just the text, then stop
        header('HTTP/1.1 404 Not Found');
        exit("404 Not Found");
    }
    return $valid;
}
Include this function near the top of any page you want to block and use it for the check. It can be called in either of the following ways:

if (!isAllowAccess()) {
    header('HTTP/1.1 404 Not Found');
    exit("404 Not Found");
}
// or let the function send the response itself
isAllowAccess(true);
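To see the matching logic in isolation, here is a minimal, self-contained sketch. The function name `isSpider` and the sample User-Agent strings are my own for illustration; it applies the same `preg_match` whitelist technique as the function above, just without touching `$_SERVER` so it can be run from the command line:

```php
<?php
// Hypothetical helper (name is illustrative): returns true when the
// given User-Agent string matches one of the spider patterns.
function isSpider($userAgent) {
    $allowed = array('/baiduspider/i', '/googlebot/i');
    foreach ($allowed as $pattern) {
        if (preg_match($pattern, $userAgent)) {
            return true;
        }
    }
    return false;
}

// Typical User-Agent strings as they appear in a web server's log.
var_dump(isSpider('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // bool(true)
var_dump(isSpider('Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)')); // bool(true)
var_dump(isSpider('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0')); // bool(false)
```

Note that the `/i` modifier makes the match case-insensitive, which matters because spiders vary the casing of their names (e.g. "Baiduspider" in the log versus "baiduspider" in the pattern). Keep in mind that the User-Agent header is trivially spoofed, so this check only filters well-behaved clients.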