This document records the names of the better-known search engine spiders worldwide that you may need to reference in robots.txt. To keep a directory from being indexed by search engines, refer to the settings below.
Of course, these settings go in robots.txt.
The following are the names of well-known search engine spiders:
Google's spider: Googlebot
Baidu's spider: Baiduspider
Yahoo's spider: Yahoo Slurp
MSN's spider: msnbot
AltaVista's spider: Scooter
Lycos's spider: Lycos_Spider_(T-Rex)
AllTheWeb's spider: FAST-WebCrawler/
Inktomi's spider: Slurp
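As a rough illustration of how these names end up in robots.txt, the following Python sketch writes one record per spider. The helper name `build_robots_txt`, the sample of three spiders, and the `/private/` directory are assumptions made for this example only.

```python
def build_robots_txt(agents, disallowed_paths):
    """Return robots.txt text with one record per spider name."""
    records = []
    for agent in agents:
        # Each record starts with a User-agent line naming the spider...
        lines = [f"User-agent: {agent}"]
        # ...followed by one Disallow line per path it should not crawl.
        lines += [f"Disallow: {path}" for path in disallowed_paths]
        records.append("\n".join(lines))
    # Records are separated by a blank line.
    return "\n\n".join(records) + "\n"


# Hypothetical selection of spiders and a hypothetical directory to hide.
SPIDERS = ["Googlebot", "Slurp", "Scooter"]
robots_txt = build_robots_txt(SPIDERS, ["/private/"])
print(robots_txt)
```

Saving the resulting text as robots.txt in the site root would ask these three spiders to stay out of the assumed `/private/` directory while leaving the rest of the site open to them.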
For more information, see this article:
User-agent: (spider name)
Disallow: (file or directory path)
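One way to check how such a record behaves is Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the robots.txt content, the `/private/` directory, the `example.com` host, and the helper `allowed` are assumptions, not part of the original settings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one record blocks WebStripper from the
# whole site, another keeps Googlebot out of an assumed /private/ directory.
ROBOTS_TXT = """\
User-agent: WebStripper
Disallow: /

User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if the named spider may fetch the given path."""
    return parser.can_fetch(agent, "http://example.com" + path)


checks = [
    ("WebStripper", "/index.html"),
    ("Googlebot", "/index.html"),
    ("Googlebot", "/private/page.html"),
    ("Slurp", "/private/page.html"),
]
for agent, path in checks:
    print(agent, path, "allowed" if allowed(agent, path) else "blocked")
```

A spider with no matching record, such as Slurp here, is treated as allowed, which is why a blocklist like the one that follows needs a separate record for every robot it wants to exclude.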
User-agent: Black Hole
Disallow: /
User-agent: Titan
Disallow: /
User-agent: WebStripper
Disallow: /
User-agent: NetMechanic
Disallow: /
User-agent: CherryPicker
Disallow: /
User-agent: EmailCollector
Disallow: /
User-agent: EmailSiphon
Disallow: /
User-agent: WebBandit
Disallow: /
User-agent: EmailWolf
Disallow: /
User-agent: ExtractorPro
Disallow: /
User-agent: CopyRightCheck
Disallow: /
User-agent: Crescent
Disallow: /
User-agent: NICErsPRO
Disallow: /
User-agent: Wget
Disallow: /
User-agent: SiteSnagger
Disallow: /
User-agent: ProWebWalker
Disallow: /
User-agent: CheeseBot
Disallow: /
User-agent: mozilla/4
Disallow: /
User-agent: mozilla/5
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)