Using robots.txt and 404 error pages to effectively reduce the impact of a website revision



Before getting into the topic, how do 404 error pages come about? When a website is revised or adjusted and its old directories and pages are moved or deleted, users and search engine spiders that visit those old paths find that the pages no longer exist; these are the so-called error pages. If your site has a correctly configured 404 error page, the server log will record a 404 status code for such requests. Removing large numbers of the original site's pages is very unfriendly to both visitors and search engines: if many pages that are already included in a search engine's index suddenly disappear, the site can easily be put into the sandbox or demoted.

In practice, however, adjustment and revision are hard to avoid if a site is to develop over the long term. Chou Cat recently bought an old domain name to build a skincare website, and the next day the IIS log showed spiders crawling large numbers of the original site's pages every day, even though those pages and directories no longer exist. To keep the domain from falling into the sandbox or being demoted, friends with the means to do so are advised to set up 301 permanent redirects for the removed directories and pages (a minimal sketch follows); I chose a 404 error page to reduce the impact on the new site, as described in the steps below.
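For those who go the 301 route, on an Apache host the redirects can be declared in .htaccess. The following is only a minimal sketch with hypothetical paths, not the configuration of the original site:

    # .htaccess sketch (Apache mod_alias), hypothetical paths only:
    # return a 301 (permanent) redirect from removed paths to their
    # closest equivalents on the new site.
    Redirect 301 /old-page.html http://www.example.com/new-page.html
    Redirect 301 /old-directory http://www.example.com/new-directory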

First, make a 404 error prompt page

On the 404 page, place a clear notice in an eye-catching position, something like "The page you are visiting no longer exists; you can return to the XXX homepage to find what you need", and include a link to the homepage in that notice to guide users. On one hand this reduces the loss of visitors; on the other it tells the search engine spider that the address is invalid. Many friends make the 404 error page jump straight to the homepage; Chou Cat thinks this is risky and not advisable, because the search engine may mistake it for manipulation and demote the homepage.

Second, test the 404 error page

Once the page is made, name it 404.htm (or another suffix, since requirements differ between virtual hosts; check the space documentation or ask the host) and upload it to the site's root directory, then set the custom 404 error page path in the virtual host management panel. Uploading it is not the end of the job: the more important step is to test that the 404 error page actually works. Many websites offer an "HTTP status query" tool; enter a non-existent page or directory and run the check. A returned status code of 404 means the setup is valid. If a non-existent path returns a 200 status code, your setting is invalid or the host's custom 404 handling has a problem, and you need to contact the host to resolve it. When everything is set correctly, the status query for a removed page returns 404, as in the check sketched below.
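As an alternative to a web-based status query, the same check can be scripted. The following is a minimal Python sketch; the domain and paths are hypothetical placeholders, not the original site's:

    # Minimal status-code check for removed pages (hypothetical URLs).
    from urllib.request import urlopen
    from urllib.error import HTTPError

    urls = [
        "http://www.example.com/",                     # a page that should exist
        "http://www.example.com/mynist/old-page.htm",  # a page that was removed
    ]

    for url in urls:
        try:
            status = urlopen(url).getcode()
        except HTTPError as err:
            status = err.code  # urllib raises on 4xx/5xx; the real code is on the error
        print(url, status)     # a removed page should print 404, not 200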


Third, use robots.txt to block crawling of the error pages

For a fully functional and friendly new website, a 404 error page is a must, but in my situation simply making a 404 page was not enough, and setting up 301 redirects was too much trouble. Then I thought of robots.txt, the key file through which a website talks to search engine spiders, and used it to tell spiders not to crawl the directories and pages that no longer exist. The space access logs showed that the spiders were mainly fetching files under a directory named mynist, the static directory of the previous site. Once that was clear, I added the line Disallow: /mynist/ to the robots file, which forbids crawling of any file under that directory, and then went through the logs and added the other non-existent directories and pages to the robots file one by one. Lee of the Baidu Webmaster Club has mentioned that newly added rules do not take effect immediately, normally within a week, so it is normal for spiders to keep crawling the banned pages for a few more days.
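For reference, the resulting robots.txt might look like the minimal sketch below; the /mynist/ line is the one described above, while the remaining entries are hypothetical stand-ins for the other removed paths collected from the logs:

    User-agent: *
    Disallow: /mynist/
    # hypothetical further entries gathered from the access logs
    Disallow: /old-static/
    Disallow: /deleted-page.html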

During this period I updated the site with three original articles every day and built a small number of external links daily. About ten days later the site's snapshot began updating to the previous day and the content pages had been released by Baidu. This result shows the approach was correct: after long observation, the old domain I bought has been neither demoted nor put into the sandbox. So today I am sharing the method, hoping it can serve as a reference and be of some help to friends who are revising their websites. This article was originally written by Chou Cat of Taobao crack silk women http://www.21rip.info; if you reprint it, please retain the copyright notice.
