Solving spider crawl failures caused by the server

Source: Internet
Author: User
Keywords: Server


The server is the foundation a site depends on. Whatever the cause, a server-side ban directly affects spider crawling, hurts the site's user experience, and undermines SEO work. Drawing on personal experience at Chongqing SEO and on analyses of similar problems shared by others online, this article summarizes three main causes of server-side bans:

I. Server instability

Servers are now a dime a dozen, with wide variation in price and even wider variation in quality. Webmasters often choose on price alone and ignore quality. Some hosting providers, to save resources, deliberately block spider IPs, so crawls fail and the site's pages cannot be indexed by search engines.

Solution: Choose a reputable, well-resourced hosting provider and do everything you can to keep the site stable. Keeping servers and hosting space stable takes real technical strength; providers that lack it may be unable to offer good service, and the stability of that service cannot be guaranteed. A metaphor helps: if people stand for the site's content, then the server is our home. It shelters us from wind and rain and gives us a good environment to live in, and its quality determines how well we withstand risk. Nobody wants to gamble with their life by living in an unsafe house, and the same holds for a website. If your current server is unsatisfactory and you have to move to a new one, keep the old server running for a while and set up 301 redirects to the new one, to keep the losses caused by the server change as small as possible.
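As an illustration of the 301 step, here is a minimal sketch in Python of what the old server could run during the transition. It assumes the move also changes the address the site is served from; NEW_ORIGIN, redirect_target, RedirectHandler, and serve are hypothetical names chosen for this example, and in practice the redirect is more often configured in the web server software itself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical address of the site on the new server (an assumption for this sketch).
NEW_ORIGIN = "https://www.example.com"


def redirect_target(path: str, new_origin: str = NEW_ORIGIN) -> str:
    """Map a requested path (including any query string) to the same path on the new origin."""
    if not path.startswith("/"):
        path = "/" + path
    return new_origin.rstrip("/") + path


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request with a permanent redirect."""

    def _redirect(self) -> None:
        self.send_response(301)  # 301 Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect


def serve(port: int = 8080) -> None:
    """Run the redirecting server; the port number is arbitrary."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Calling serve() on the old machine would send visitors and spiders that request /seotechnique/27.html to the same path under NEW_ORIGIN, so existing links keep working while the search engine picks up the new location.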

II. Human error in operation

Webmasters who do not know search engine spiders well enough cannot correctly judge IPs that impersonate spiders, and end up blocking genuine search engine IPs by mistake. The search engine then cannot crawl the site or fetch new pages, and it treats pages that were already crawled and indexed as invalid links. It then purges those dead links, so fewer of the site's pages remain indexed in the search engine.

Solution: Learn how search engine spiders actually behave. The IP addresses a search engine uses can change at any time, so to identify them reliably, use a reverse DNS lookup to determine whether a crawling IP really belongs to a search engine. This prevents mistaken blocks.

For example, to check Baiduspider on Linux, run the host command on the IP and see whether the reverse lookup points to Baiduspider. Baiduspider hostnames follow the format *.baidu.com or *.baidu.jp; any hostname that does not match *.baidu.com or *.baidu.jp is an impersonator.

$ host 123.125.66.120
120.66.125.123.in-addr.arpa domain name pointer baiduspider-123-125-66-120.crawl.baidu.com.

$ host 119.63.195.254
254.195.63.119.in-addr.arpa domain name pointer baidumobaider-119-63-195-254.crawl.baidu.jp.
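The same check can be automated. The following is a minimal Python sketch, not an official Baidu tool: it uses the standard library's socket.gethostbyaddr for the reverse lookup and applies the *.baidu.com / *.baidu.jp rule described above. The names BAIDU_SUFFIXES, is_baidu_hostname, and is_baiduspider_ip are hypothetical and chosen for this example.

```python
import socket

# Suffixes taken from the rule above: genuine Baiduspider hosts end in baidu.com or baidu.jp.
BAIDU_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_baidu_hostname(hostname: str) -> bool:
    """Return True if the hostname ends in one of the Baidu suffixes."""
    # Drop the trailing dot that tools such as host print, and ignore case.
    name = hostname.rstrip(".").lower()
    return name.endswith(BAIDU_SUFFIXES)


def is_baiduspider_ip(ip: str, reverse_lookup=socket.gethostbyaddr) -> bool:
    """Reverse-resolve the IP and check the hostname that comes back.

    reverse_lookup is injectable so the logic can be exercised without network access.
    """
    try:
        hostname, _aliases, _addresses = reverse_lookup(ip)
    except OSError:
        # No PTR record, or the lookup failed: treat the IP as unverified.
        return False
    return is_baidu_hostname(hostname)
```

A firewall script could call is_baiduspider_ip(ip) before adding an address to a block list. The leading dot in each suffix matters: a host such as crawl.evilbaidu.com does not pass the check.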

III. Accidental bans from load protection

1. If the site's traffic grows beyond what the server can handle, the server's load protection may trigger a ban. This kind of ban is temporary: once traffic falls back within what the server can bear, the server works normally again.

2. The other case is caused by spiders. To retrieve the target resources well, a search engine's spiders need to maintain a certain crawl volume on your site. The search engine weighs the server's capacity, site quality, update frequency, and other factors to set a reasonable crawl pressure for the site. There are exceptions, however: when that pressure is poorly controlled, the server's load protection may again trigger a ban.

Solution:

1. If the pressure comes from visitor traffic, congratulations: your site has a considerable audience, and you should upgrade the server to handle the growth.

2. If it comes from spiders, you can reduce the crawl pressure on the server in the following ways:

A. Use the robots file to block pages you do not want spiders to crawl.

B. Use nofollow tags to block links you do not want spiders to follow.

C. Move long CSS and JS code out of the page into external files.

D. Delete redundant code.

Note that in both cases the server should avoid returning 404 where possible; returning 503 ("Service Unavailable") is recommended. The spider will then retry the link later, and if the site is idle at that time, the crawl will succeed.
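To make the 503 recommendation concrete, here is a minimal Python sketch of a handler that sheds load with 503 instead of 404. The threshold MAX_ACTIVE_REQUESTS, the RETRY_AFTER_SECONDS value, and the names overload_response and LoadAwareHandler are all assumptions made for illustration; the standard Retry-After header is an optional hint about when to come back.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical threshold and retry hint, chosen only for illustration.
MAX_ACTIVE_REQUESTS = 100
RETRY_AFTER_SECONDS = 120


def overload_response(active_requests: int, limit: int = MAX_ACTIVE_REQUESTS):
    """Return (status, headers) for the current load.

    Under overload the status is 503 with a Retry-After hint rather than 404,
    so a crawler treats the page as temporarily unavailable, not as gone.
    """
    if active_requests > limit:
        return 503, {"Retry-After": str(RETRY_AFTER_SECONDS)}
    return 200, {}


class LoadAwareHandler(BaseHTTPRequestHandler):
    """Sketch of a handler that answers 503 while the server is overloaded."""

    # A real server would maintain this counter itself, with proper locking.
    active_requests = 0

    def do_GET(self) -> None:
        status, headers = overload_response(type(self).active_requests)
        body = b"Service Unavailable\n" if status == 503 else b"OK\n"
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Keeping the decision in a small pure function such as overload_response makes the policy easy to test separately from the HTTP plumbing.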

Finally, I hope webmasters keep their sites as stable as possible. For pages that you temporarily do not want search engines to crawl, use the correct return code to tell the search engine. If you do not want a page crawled or indexed at all, say so in the robots file.
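As a small illustration of the robots file, the Python sketch below parses an example robots.txt with the standard library's urllib.robotparser and asks whether a given spider may fetch a path. The file content, the blocked paths, and the helper names build_parser and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the blocked paths are assumptions for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if the rules let this user agent fetch the path."""
    return parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)
```

With these rules, is_allowed(parser, "Baiduspider", "/admin/login.html") is False, while the same call for "/seotechnique/27.html" is True. Running such a check before publishing a robots file is one way to confirm that it blocks only what was intended.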

Author: Yi Shan its

Source: Chongqing SEO

Article link: This article is from http://www.137sv.com/seotechnique/27.html. If you reproduce it, please credit the source and keep the source information intact. Thank you.
