You can verify that a web crawler accessing your server really is Googlebot (or another Google user agent). This is useful if you are concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot. Google does not publish a public list of IP addresses for webmasters to whitelist, because these IP address ranges can change, which would cause problems for webmasters who have hard-coded them. Instead, run DNS lookups as described below.
To verify that Googlebot is the caller, do the following:
- Use the host command to run a reverse DNS lookup on the accessing IP address from your logs.
- Verify that the domain name is in googlebot.com or google.com.
- Use the host command to run a forward DNS lookup on the domain name retrieved in step 1. Verify that it resolves to the original accessing IP address from your logs.
Example 1:
> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
Example 2:
> host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.
> host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
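If you need to check many log entries, the same three steps can be scripted instead of run by hand with host. The following is a minimal sketch, assuming Python's standard socket module; the names is_google_crawler, reverse_lookup and forward_lookup are hypothetical helpers introduced for illustration and are not part of any Google tool.

```python
import socket

# Domains named in the verification steps above.
GOOGLE_DOMAINS = ("googlebot.com", "google.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of IP addresses that hostname resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return {info[4][0] for info in infos}


def is_google_crawler(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Hypothetical helper: apply the three steps to one IP address.

    The resolver functions are parameters so that they can be
    replaced, for example with cached or fake lookups.
    """
    # Step 1: reverse DNS lookup on the accessing IP address.
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()

    # Step 2: the name must be googlebot.com, google.com, or a subdomain.
    # Matching on "." + domain rejects names such as
    # "googlebot.com.evil.example" or "notgooglebot.com".
    if not any(hostname == d or hostname.endswith("." + d)
               for d in GOOGLE_DOMAINS):
        return False

    # Step 3: forward DNS lookup must return the original IP address.
    return ip in forward(hostname)
```

Calling is_google_crawler("66.249.66.1") performs live DNS queries, so the result depends on your resolver and on the current DNS records. Because lookups can be slow, you may want to cache results per IP address when processing large logs.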
Verify the Googlebot (check for true Google bots)