Resolving 404 Crawl Errors After a Website Redesign: A Practical Case

Source: Internet
Author: User
Keywords: solution, redesign, practice, crawl


Redesigning is an important part of running a website. Every redesign should be an improvement: a company's positioning differs at each stage, so the image we want the site to project differs too, and a good-looking, polished site raises the company's standing in the eyes of potential customers. But most redesigns affect the site's SEO to some degree. We need to know the site's condition well and keep it under control, so that the redesign does not produce too many 404 pages.

Recently I launched a new site on a domain that is more than a year old. The new content is completely different from the old site's, and the site structure was also changed substantially, so the launch produced a large number of 404 crawl errors. I did not pay much attention to this at first. After two more weeks of regular updates, the snapshot still had not been updated and a few simple external links had no effect, which got my attention. My analysis follows:

1. Use a log analysis tool to find the pages that return 404 to crawlers

The commonly used Light-Years log analysis tool is sufficient for this. First, download the site's logs for the last few days over FTP; download more days if you want a longer analysis window. Then create a new task in the log analysis tool for each period to see how Baiduspider crawled the site. The focus here is on Baidu's crawling, because what prompted this analysis was a stalled snapshot and only one indexed page.

I recommend dividing the analysis into three periods:

A. Analyze the most recent day's log. It can be today's, but yesterday's is preferable because it is complete; even if you analyze today's log in the evening, part of the day is still missing.

B. Analyze the logs from after the redesign. This shows how Baiduspider judged the redesign: for example, when it began to recognize that the site had changed, and when it gave up crawling the old site's URLs.

C. Compare the crawl volume before and after the redesign to see how much the redesign affected Baiduspider's crawling.
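The per-period comparison described above can also be sketched in a few lines of Python, as a rough alternative to a GUI tool. This is a minimal sketch, assuming the server writes the common Apache/Nginx "combined" log format and that Baidu's crawler identifies itself with "Baiduspider" in the user-agent string; the function name `daily_spider_stats` is made up for illustration and is not part of any tool mentioned in this article.

```python
import re
from collections import defaultdict

# Assumed "combined" log format:
# ip - - [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def daily_spider_stats(lines, spider="Baiduspider"):
    """Count total crawls and 404 crawls per day for one spider."""
    stats = defaultdict(lambda: {"crawls": 0, "not_found": 0})
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or spider not in match.group("agent"):
            continue  # skip unparseable lines and other visitors
        # "03/Jun/2013:10:00:00 +0800" -> "03/Jun/2013"
        day = match.group("time").split(":", 1)[0]
        stats[day]["crawls"] += 1
        if match.group("status") == "404":
            stats[day]["not_found"] += 1
    return dict(stats)
```

To run it on a real log, pass an open file, for example `daily_spider_stats(open("access.log"))`, where `access.log` stands for whatever file was downloaded over FTP. Comparing the per-day numbers before and after the launch date gives the comparison in step C.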

Once the approach is clear, the log analysis itself is a one-click operation. Following this approach turns up many problems we normally overlook, such as the 404 pages listed below. Many of these I had not realized were returning 404 at all; the wp-login.php page below is one of the most typical examples:

  

404 Error Crawl Page
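A list like the one in the screenshot can also be produced directly from the raw log. The following is a minimal sketch, again assuming a log format in which the quoted request line is followed by the status code and the user-agent contains "Baiduspider"; `top_404_paths` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# e.g. "GET /wp-login.php HTTP/1.1" 404
REQUEST = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')


def top_404_paths(lines, spider="Baiduspider", limit=10):
    """Return the paths the spider most often requested and got a 404 for."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "404" and spider in line:
            counts[match.group("path")] += 1
    return counts.most_common(limit)
```

The result is a list of `(path, count)` pairs sorted by count, which makes unexpected entries such as `/wp-login.php` easy to spot.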

2. Submit dead links with the dead-link submission tool in Baidu Webmaster Tools

Lee of the Baidu Webmaster Platform team has said: a 404 status code means 'Not Found'. When the spider revisits the page, it treats the page as no longer valid and removes it from the index, and if it finds the URL again in the short term it will not crawl it. This statement should be taken as a reference only: my web logs show that two weeks later Baiduspider was still crawling these error pages. Even so, Baidu's guidance on handling 404 error pages is quite specific.

  

Baidu Webmaster platform on the 404 Page view

In particular, the dead-link submission tool accepts a dead-link sitemap, and you can submit dead links according to your own situation. My submission has not had much effect so far, which is not surprising, since changes on Baidu generally take a fairly long time to show.
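Preparing the dead-link list by hand is tedious when there are many 404 URLs. The sketch below builds the text for such a list from a set of paths. It assumes the submission tool accepts a plain-text file with one absolute URL per line; that format is an assumption here and should be checked against the current Baidu Webmaster Tools documentation. The function name `build_dead_link_list` and the domain `www.example.com` are placeholders.

```python
from urllib.parse import urljoin


def build_dead_link_list(base_url, paths):
    """Turn 404 paths into absolute URLs, one per line, without duplicates."""
    seen = set()
    urls = []
    for path in paths:
        url = urljoin(base_url, path)
        if url not in seen:  # keep first occurrence, preserve order
            seen.add(url)
            urls.append(url)
    return "\n".join(urls) + "\n"
```

For example, `build_dead_link_list("http://www.example.com/", ["/wp-login.php", "/old/a.html"])` returns two lines of text that can be saved to a file and uploaded to the site before submitting its address.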

3. Use robots.txt and the nofollow attribute to guide the spider's crawling

One of the biggest drawbacks of 404 pages is that they cause spiders to crawl the wrong URLs and waste crawl resources. We should first agree on one point: the crawl resources a spider allocates to any site are limited, with small sites getting far fewer and large sites far more. To make the spider's crawling more efficient and better targeted, the number of 404 errors caused by bad links should be kept as low as possible.

So I steered the spider away from the places where these resources were being wasted, so that it crawls the pages I want it to crawl. Column pages such as /wuchenshi/ and /gaoxiao/ are blocked from crawling, and links that play no part in the site's rankings are marked nofollow, guiding the spider to the important pages. Below is the spider's crawl activity on June 3. First, the crawled directories no longer include any that do not exist on the site:

  

Spider crawling of directories

Among the pages the spider visited, only one 404 crawl error remains, and it is for an image:

  

Improved 404 error Crawl
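The robots.txt restrictions described in this section can be checked before they go live. The sketch below uses Python's standard `urllib.robotparser` to confirm which paths a crawler would be allowed to fetch. The two `Disallow` directories come from this article; the rest of the robots.txt content, the `User-agent: *` line, and the helper name `allowed_paths` are assumptions for illustration, not the site's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking the two columns named in the article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wuchenshi/
Disallow: /gaoxiao/
"""


def allowed_paths(robots_txt, paths, agent="Baiduspider"):
    """Map each path to True if the agent may crawl it, False if blocked."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}
```

Running `allowed_paths(ROBOTS_TXT, ["/wuchenshi/a.html", "/news/1.html"])` reports the first path as blocked and the second as allowed. Note that robots.txt only controls crawling; nofollow is set separately on individual links in the page markup.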

So far the snapshot has not been updated and the number of indexed pages has not increased. In theory, though, these steps should help the site gain the search engine's recognition faster. If things recover, I will add an update to this article.

This article was published by SEO xu Zi rain of Virtual Rain Network (http://www.xuziyu.com). Reprints are welcome; please indicate the source. Thank you for your cooperation!
