How to use the Baidu Spider Referer to find the entry point of error pages

Source: Internet
Author: User
Tags: apache log

Most readers will know that Baidu has moved its entire site to HTTPS and no longer exposes search keywords in the Referer (for details, see the Webmaster's Home article "Baidu Site Property function upgrade completely cancel Referer keyword display"). So what is the "Baidu Spider Referer", and is there anything special about it? Liu Ming, SEO lead at Art Dragon, found that the Baidu Spider Referer can quickly pinpoint the cause of some of a site's URL errors (4xx or 5xx).

What is the Baidu Spider Referer?

The Baidu Spider Referer is the Referer field that Baidu's spider sends in the HTTP request header when it crawls a URL. Note that this has nothing to do with Baidu's recent announcement about removing keyword data from the Referer: that change concerns HTTP requests initiated by users, whereas this Referer belongs to requests initiated by the spider. For example, when Baidu Spider crawls the logo on the Baidu homepage, it requests www.baidu.com/img/bd_logo1.png and sets the Referer field to the www.baidu.com page.

The Referer field in that request shows that the spider found www.baidu.com/img/bd_logo1.png on the www.baidu.com page and crawled it from there. You should be able to see corresponding records in your own server access log. At present, Baidu sends the Referer field only for the resources it fetches while crawling a web page: the images, JS and CSS that the page references. These extra fetches should not count against the crawl quota Baidu assigns to the site; they are effectively "buy one, get one free".
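To make the idea concrete, here is a minimal Python sketch of how such a record could be read back out of an access log. It assumes the server writes the common "combined" log format; the sample line itself (IP address, timestamp, response size) is made up for illustration and is not taken from a real Baidu crawl.

```python
import re

# "combined" format: host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the named fields of a combined-format log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical log line: only the requested path and the Referer mirror the article's example.
sample = (
    '192.0.2.10 - - [01/Jan/2016:00:00:00 +0800] '
    '"GET /img/bd_logo1.png HTTP/1.1" 200 1234 '
    '"https://www.baidu.com/" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'
)

fields = parse_line(sample)
# Shows which page led the spider to the requested resource.
print(fields["request"], "<-", fields["referer"])
```

The `referer` group is the field this article is about: it names the page on which the spider found the requested resource.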

What this means for webmasters

Suppose you find a number of URLs (images, JS or CSS only) returning errors (4xx or 5xx) but cannot find their entry point, that is, you do not know where Baidu's spider discovered these broken URLs. The Referer field can help you locate it quickly.

For example, our SEO log analysis system showed that paths matching one particular URL pattern were being crawled 60,000 to 100,000 times a day, and every one of those requests returned 404.

A month after discovering the problem, I still had not found the entry point anywhere on the site. Today, while checking the logs more carefully, I thought of the Baidu Spider Referer and located the problem immediately. The 404 URLs came from a set of pages that nobody maintains or pays attention to (as is often the case), even though they bring in good traffic. The company's image system was recently updated and the image URLs changed, but this set of pages was never updated to match.
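The lookup described above can be sketched in a few lines of Python. This is an illustration rather than the author's actual log analysis system: it assumes combined-format logs, identifies the spider by the substring "Baiduspider" in the user agent, and uses made-up sample lines on a hypothetical www.example.com site.

```python
import re
from collections import Counter

# Combined log format, with the request split into method and path.
COMBINED = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def error_referers(lines, agent_keyword="Baiduspider"):
    """Count the Referer of every spider request that returned 4xx or 5xx."""
    counts = Counter()
    for line in lines:
        m = COMBINED.match(line)
        if m is None:
            continue  # not a combined-format line
        if agent_keyword not in m.group("agent"):
            continue  # not the spider we are interested in
        if m.group("status")[0] not in "45":
            continue  # only 4xx and 5xx responses
        referer = m.group("referer")
        if referer in ("", "-"):
            continue  # no Referer was sent
        counts[referer] += 1
    return counts

SPIDER = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

# Made-up sample lines; in practice, pass an open log file instead.
sample_log = [
    f'192.0.2.1 - - [01/Jan/2016:00:00:01 +0800] "GET /img/old/a.png HTTP/1.1" 404 0 "http://www.example.com/legacy/page1.html" "{SPIDER}"',
    f'192.0.2.1 - - [01/Jan/2016:00:00:02 +0800] "GET /img/old/b.png HTTP/1.1" 404 0 "http://www.example.com/legacy/page1.html" "{SPIDER}"',
    f'192.0.2.1 - - [01/Jan/2016:00:00:03 +0800] "GET /img/new/c.png HTTP/1.1" 200 512 "http://www.example.com/index.html" "{SPIDER}"',
    '198.51.100.7 - - [01/Jan/2016:00:00:04 +0800] "GET /img/old/a.png HTTP/1.1" 404 0 "http://www.example.com/other.html" "Mozilla/5.0"',
]

result = error_referers(sample_log)
for referer, hits in result.most_common(10):
    print(hits, referer)
```

The pages at the top of this list are the most likely entry points: they are where the spider keeps picking up the broken image, JS or CSS URLs.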

What if the site does not log the Referer?

For IIS, tick the "cs(Referer)" field in the log field selection.

For Apache, please refer to:

Apache Log Configuration "Combined Log Format" section

Official link to Apache log configuration

For Nginx, please refer to:

Nginx Log Configuration

Official link to nginx log configuration
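Before changing any configuration, it can help to check whether the existing log already records the Referer. In Apache's Combined Log Format the Referer and User-Agent are two extra quoted fields appended to the Common Log Format, and Nginx's predefined "combined" format has the same layout. The following is a rough Python sketch that tells the two layouts apart; the function name and sample lines are made up for illustration, and a custom log format on your server would need its own pattern.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
COMMON = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+$')
# Combined Log Format: the same fields followed by "referer" "user-agent"
COMBINED = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "[^"]*"')

def log_format(line):
    """Classify one access-log line as 'combined', 'common' or 'unknown'."""
    line = line.strip()
    if COMBINED.match(line):
        return "combined"  # Referer is being logged
    if COMMON.match(line):
        return "common"    # no Referer field: switch to the combined format
    return "unknown"       # custom format: inspect the configuration by hand

with_referer = (
    '192.0.2.1 - - [01/Jan/2016:00:00:01 +0800] "GET /a.css HTTP/1.1" 200 10 '
    '"http://www.example.com/" "Mozilla/5.0"'
)
without_referer = '192.0.2.1 - - [01/Jan/2016:00:00:01 +0800] "GET /a.css HTTP/1.1" 200 10'

print(log_format(with_referer))
print(log_format(without_referer))
```

If the first lines of your access log come back as "common", the Referer is not being recorded and the logging configuration needs to change.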
