As most readers probably know, Baidu has moved its entire site to HTTPS and stopped exposing search keywords in the Referer (for details, see the Webmaster's Home article "Baidu Site Property function upgrade completely cancel Referer keyword display"). So what is the "Baidu Spider Referer", and is there anything special about it? Liu Ming, SEO lead at Art Dragon, found that the Baidu Spider Referer can be used to quickly pinpoint the cause of some of a site's URL errors (4xx or 5xx).
What is the Baidu Spider Referer?
The Baidu Spider Referer is the Referer field that Baidu Spider includes in the HTTP header when it crawls a URL. Note that this has nothing to do with Baidu's recent announcement about removing keyword data from the Referer: that change concerns HTTP requests initiated by users, whereas this field belongs to requests initiated by the spider itself. For example, when Baidu Spider crawls the logo on the Baidu homepage, it requests www.baidu.com/img/bd_logo1.png and sends a Referer header that points to the www.baidu.com page.
The Referer field in this request clearly indicates that the spider found and crawled www.baidu.com/img/bd_logo1.png from the www.baidu.com page. You should be able to see corresponding records in your own server access log. At present, Baidu Spider sends the Referer field only for the resources it fetches along with a web page: images, JS, and CSS. This extra crawl volume should not count against the crawl quota Baidu assigns to the site; it is effectively "buy one, get one free".
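To make the access-log record mentioned above more concrete, here is a minimal Python sketch that pulls the Referer out of one log line. It assumes the server writes the common "combined" log format; the sample line is made up for illustration (the IP address, timestamp, response size, and the exact Referer URL are assumptions, not a real Baidu log entry).

```python
import re

# Apache/Nginx "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one combined-format log line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None


# Hypothetical sample record: every value here is an illustrative assumption.
sample = (
    '192.0.2.10 - - [01/Jan/2016:00:00:00 +0800] '
    '"GET /img/bd_logo1.png HTTP/1.1" 200 7877 '
    '"https://www.baidu.com/" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0; '
    '+http://www.baidu.com/search/spider.html)"'
)

rec = parse_line(sample)
# Show which page the spider says it came from when requesting the image.
print(rec["request"], "<-", rec["referer"])
```

The "referer" group is the field this article is about: for resource requests such as images, JS, and CSS it names the page on which the spider found the URL.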
What it means for webmasters
Suppose you find a number of URLs (images, JS, and CSS only) that return errors (4xx or 5xx), but you cannot find their entry point; in other words, you do not know where Baidu Spider discovered these broken URLs. The Referer field can help you track them down quickly.
An example
Our SEO log analysis system showed that URLs matching one particular path pattern were being crawled 60,000 to 100,000 times a day, and every one of those requests returned 404.
A month after discovering the problem, I still had not found the entry point anywhere on the site. Today, while going through the logs carefully, I remembered the Baidu Spider Referer and was able to locate the problem immediately. The 404 URLs came from a set of pages that nobody maintains or pays attention to (as is often the case), even though those pages are well indexed and bring in decent traffic. After a recent update to the company's image system, the image URLs changed, but this set of pages was never updated.
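The investigation described above can be sketched in a few lines of Python: keep only spider requests that returned 4xx or 5xx, then count them by Referer so the pages that link to the broken URLs rise to the top. This is only a sketch under stated assumptions: the log is in "combined" format, the spider is identified by the substring "Baiduspider" in the user agent, and the function name `error_referers`, the `/pic/old/` prefix, and the example.com URLs are hypothetical.

```python
import re
from collections import Counter

# Apache/Nginx "combined" log format.
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def error_referers(lines, path_prefix="", agent_token="Baiduspider"):
    """Count Referer values of spider requests that returned 4xx or 5xx."""
    counts = Counter()
    for line in lines:
        m = COMBINED.match(line)
        if not m or agent_token not in m["agent"]:
            continue
        if not m["status"].startswith(("4", "5")):
            continue
        parts = m["request"].split()  # e.g. ["GET", "/path", "HTTP/1.1"]
        if len(parts) < 2 or not parts[1].startswith(path_prefix):
            continue
        referer = m["referer"]
        if referer and referer != "-":  # "-" means no Referer was sent
            counts[referer] += 1
    return counts


# Hypothetical sample data; none of these URLs come from the article.
UA = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
      "+http://www.baidu.com/search/spider.html)")
T = "[01/Jan/2016:00:00:00 +0800]"
lines = [
    f'192.0.2.1 - - {T} "GET /pic/old/a.jpg HTTP/1.1" 404 0 "http://www.example.com/topic/1.html" "{UA}"',
    f'192.0.2.1 - - {T} "GET /pic/old/b.jpg HTTP/1.1" 404 0 "http://www.example.com/topic/1.html" "{UA}"',
    f'192.0.2.1 - - {T} "GET /pic/old/c.jpg HTTP/1.1" 404 0 "http://www.example.com/topic/2.html" "{UA}"',
    f'192.0.2.1 - - {T} "GET /pic/new/d.jpg HTTP/1.1" 200 512 "http://www.example.com/topic/3.html" "{UA}"',
    f'192.0.2.2 - - {T} "GET /pic/old/e.jpg HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
]

result = error_referers(lines, path_prefix="/pic/old/")
for referer, hits in result.most_common(10):
    print(hits, referer)
```

On a real server you would pass an open log file instead of the sample list, for example `error_referers(open("access.log"), "/pic/old/")`, where the file name and prefix are again placeholders. The pages at the top of the output are the likely entry points.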
What if the site does not log the Referer?
For IIS, tick the "cs(Referer)" field in the logging field selection.
For Apache, see:
The "Combined Log Format" section of the Apache log configuration documentation
Official link to Apache log configuration
For Nginx, see:
Nginx Log Configuration
Official link to nginx log configuration
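After changing the log configuration, a quick sanity check can confirm that the Referer is actually being recorded. The sketch below assumes the "combined" log format and simply counts how many lines parse in that format and how many of those carry a non-empty Referer; the function name `referer_coverage` and the sample lines are hypothetical.

```python
import re

# Apache/Nginx "combined" log format; lines without the trailing
# "referer" "user-agent" fields (e.g. "common" format) will not match.
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referer_coverage(lines):
    """Return (total lines, lines in combined format, lines with a Referer)."""
    total = parsed = with_referer = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        total += 1
        m = COMBINED.match(line)
        if not m:
            continue
        parsed += 1
        if m["referer"] not in ("", "-"):
            with_referer += 1
    return total, parsed, with_referer


# Hypothetical sample lines, for illustration only.
T = "[01/Jan/2016:00:00:00 +0800]"
sample_lines = [
    # "common" format: no Referer or user-agent fields at all
    f'192.0.2.1 - - {T} "GET /a.css HTTP/1.1" 200 120',
    # "combined" format with a Referer
    f'192.0.2.1 - - {T} "GET /a.css HTTP/1.1" 200 120 "http://www.example.com/" "Mozilla/5.0"',
    # "combined" format, but the client sent no Referer
    f'192.0.2.1 - - {T} "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]

coverage = referer_coverage(sample_lines)
print("total=%d combined=%d with_referer=%d" % coverage)
```

If the second number stays at zero on fresh log lines, the server is probably still writing a format without the Referer field.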
How do you use the Baidu Spider Referer to find the entry point of error pages?