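/**
 * Spider control: returns true when the request's User-Agent
 * contains the name of an allowed spider.
 */
function spiderControl() {
    // Fall back to an empty string when the header is missing.
    $user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? strtolower($_SERVER['HTTP_USER_AGENT']) : '';
    $allow_spiders = array('Baiduspider', 'Googlebot');
    foreach ($allow_spiders as $spider) {
        // Compare in lowercase so the match is case-insensitive.
        $spider = strtolower($spider);
        if (strpos($user_agent, $spider) !== false) {
            return true;
        }
    }
    return false;
}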
Baidu Spider (Baiduspider):
http://help.baidu.com/question?prod_en=master&class=498
Baidu Spider (Baiduspider) FAQ:
http://help.baidu.com/question?prod_en=master&class=498&id=1000550
360 Spider:
http://lusongsong.com/blog/post/458.html
Names of the major search engine spiders (http://www.boshan.com.cn/blog/3211.aspx):
1. Baidu spider: Baiduspider
Material found online lists the Baidu spider's name with several different capitalizations. Forget those; that information is out of date.
The Baidu spider's latest name is Baiduspider (first letter capitalized). The logs also show Baiduspider-image, another Baidu spider. Looking it up (the name alone really says it all) confirms that it crawls images.
Baidu's common spiders of the same family are: Baiduspider-mobile (crawls WAP pages), Baiduspider-image (crawls images), Baiduspider-video (crawls video), and Baiduspider-news (crawls news).
Note: of the Baidu spiders above, the two commonly seen today are Baiduspider and Baiduspider-image.
2. Google spider: Googlebot
There is less disagreement about this one, although some sources write the name with different capitalization. The Google spider's latest name is "compatible; Googlebot/2.1;". Googlebot-Mobile also shows up; judging by the name, it crawls WAP content.
3. 360 spider: 360Spider, a very "diligent" crawler.
4. Soso spider: Sosospider, another spider that earns the "diligent crawler" award.
5. Yahoo! spider: "Yahoo! Slurp China" or Yahoo!
The name contains "Slurp" and spaces. Because of the spaces, robots.txt entries reportedly use just "Slurp" or "Yahoo" to refer to it; I do not know whether that is valid.
6. Youdao spider: YoudaoBot, YodaoBot (both names appear; the second drops a "u" from the Chinese pinyin, which changes the pronunciation considerably. Was the letter left out by mistake?)
7. Sogou spider: Sogou News Spider
The Sogou spiders include: Sogou web spider, Sogou inst spider, Sogou spider2, Sogou blog, Sogou News Spider, Sogou Orion spider.
(Judging from the robots.txt files of some sites, all Sogou spider names can be covered by the single word "Sogou"; I have not been able to verify whether this works.)
Baidu's own robots.txt (http://www.baidu.com/robots.txt), the most authoritative reference, spends a lot of bytes on the Sogou spiders, which take up a large share of the file.
It currently lists six: "Sogou web spider; Sogou inst spider; Sogou spider2; Sogou blog; Sogou News Spider; Sogou Orion spider", all of them names containing spaces.
"Sogou web spider/4.0", "Sogou News Spider/4.0", and "Sogou inst spider/4.0" deserve a "king of names" award.
8. MSN spider: msnbot, msnbot-media (I only see msnbot-media crawling like crazy...)
9. Bing spider: bingbot
In live logs it appears as (compatible; bingbot/2.0;)
10. Yisou spider: YisouSpider
11. Alexa spider: ia_archiver
12. Easou spider: EasouSpider
13. Jike spider: JikeSpider
14. Etao spider: EtaoSpider
"Mozilla/5.0 (compatible; EtaoSpider/1.0; http://(omitted)/EtaoSpider)"
From the spiders above, pick a few commonly used ones to allow and block the rest through robots.txt. While your hosting traffic is still sufficient this can wait; once traffic gets tight, keep only the few common spiders and block the others to save traffic. As for which spiders' crawls actually bring value to the site, site administrators can judge that for themselves.
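The allow-a-few, block-the-rest policy described above could be written into robots.txt roughly as follows. This is a minimal sketch, not a definitive setup: `buildRobotsTxt` is a hypothetical helper introduced here for illustration, and the list of allowed spiders is only an example. An empty `Disallow:` line is the classic way to permit everything for a named spider, while the final `User-agent: *` group turns all other spiders away.

```php
<?php
/**
 * Build robots.txt content that admits only the listed spiders.
 * Hypothetical helper: the name and signature are illustrative.
 */
function buildRobotsTxt(array $allowedSpiders): string
{
    $lines = [];
    foreach ($allowedSpiders as $spider) {
        $lines[] = "User-agent: {$spider}";
        $lines[] = "Disallow:"; // empty Disallow: nothing is off limits
        $lines[] = "";
    }
    // Every spider not named above falls through to this catch-all group.
    $lines[] = "User-agent: *";
    $lines[] = "Disallow: /";
    return implode("\n", $lines) . "\n";
}

// Example allow list (an assumption; choose the spiders that matter to your site).
echo buildRobotsTxt(['Baiduspider', 'Googlebot', '360Spider']);
```

Keep in mind that robots.txt is only a request: well-behaved spiders honor it, but nothing forces a crawler to comply.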
The logs also show spiders such as YandexBot, AhrefsBot, and Ezooms.bot. These are reportedly foreign spiders that are of very little use to Chinese websites, so it is better to block them and save resources.
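For spiders that ignore robots.txt, one option is to refuse them on the server side. The sketch below assumes it runs at the top of a PHP entry script; `isBlockedSpider` is a hypothetical helper, and the token "Ezooms" is an assumed substring for the Ezooms.bot spider, so check your own logs for the exact User-Agent before relying on it.

```php
<?php
/**
 * Return true when the User-Agent contains any token from the block list.
 * Hypothetical helper; the comparison is case-insensitive.
 */
function isBlockedSpider(string $userAgent, array $blockedTokens): bool
{
    foreach ($blockedTokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// Tokens taken from the spiders named above ("Ezooms" is an assumption).
$blocked = ['YandexBot', 'AhrefsBot', 'Ezooms'];
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (isBlockedSpider($userAgent, $blocked)) {
    // Answer with 403 Forbidden and stop before any page content is generated.
    http_response_code(403);
    exit;
}
```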
How do you tell the Baidu spider, the Google spider, and the 360 spider apart?
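One simple way to tell them apart is to look for each spider's name inside the User-Agent string. The following is a minimal sketch under that assumption: `identifySpider` and its labels are hypothetical, the token table covers only a few of the spiders listed above, and using the single word "Sogou" for all Sogou spiders follows the unverified suggestion in the list.

```php
<?php
/**
 * Map a User-Agent string to a spider label, or null if none matches.
 * Hypothetical helper; tokens are matched case-insensitively.
 */
function identifySpider(string $userAgent): ?string
{
    // More specific tokens come first, so Baiduspider-image
    // is not reported as the plain Baiduspider.
    $tokens = [
        'Baiduspider-image' => 'Baidu image spider',
        'Baiduspider'       => 'Baidu spider',
        'Googlebot'         => 'Google spider',
        '360Spider'         => '360 spider',
        'Sogou'             => 'Sogou spider',
    ];
    foreach ($tokens as $token => $label) {
        if (stripos($userAgent, $token) !== false) {
            return $label;
        }
    }
    return null;
}

var_dump(identifySpider('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'));
```

A User-Agent header is whatever the client chooses to send, so a match here shows only what the visitor claims to be.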