Search Engine Spider Crawl Rules, Part Two: Do External Links Expire?

Source: Internet
Author: User

  

It has been more than 20 days since I wrote "Search Engine Spider Crawl Rules, Part One: The Secret of How Spiders Crawl Links". I had meant to keep writing, but after the first article I suddenly ran out of ideas. Today I was talking with friends about the timeliness of external links, that is, whether an external link ever stops being effective.

This article does not discuss further theory. Instead it gives some examples that support the first article, and it also addresses the timeliness of external links.

First: if the page carrying an external link has been deleted, is the external link still effective?

The answer is that an external link on a deleted page is still valid. The evidence is as follows:

  

My blog on China blog was deleted long ago (probably in 2006, because it exceeded its traffic limit), but Baidu still has a snapshot of it. The snapshot of the next page disappeared today, but the snapshot of the article page still exists. The snapshot date shows 2006, or possibly even earlier.

In other words, although the page was deleted 5 years ago, Baidu has not deleted its snapshot. So would you say the spider no longer crawls the links on it?

I believe it still crawls them. In that blog I had placed a link to domain A, which at the time simply redirected to the blog home page. Later, when I started using domain A for a blog of its own, it gained good weight immediately, and its articles were easily indexed within seconds. I believe that link from 5 years ago played a large part.

Second: if the search engine has no snapshot of the page carrying the external link, is the external link still effective?

The answer may surprise many people: an external link on a page without a snapshot can still be effective. The reason is given in the article on how spiders crawl links. When a spider crawls a page, it separates the content from the links. Each link, being a URL, is added to a URL index library, and the spider sets off on its crawls from this URL index library.
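The mechanism described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not how any real search engine is implemented: the class `UrlIndexLibrary`, the function `process_page`, and the example URLs are all hypothetical names made up for this sketch. It shows that once a link has been extracted and stored, the stored URL no longer depends on the source page continuing to exist.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags while parsing a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


class UrlIndexLibrary:
    """Hypothetical URL store: remembers every URL seen and queues new ones."""

    def __init__(self):
        self.known = {}       # url -> set of source pages that linked to it
        self.queue = deque()  # URLs waiting to be crawled

    def add(self, url, source):
        if url not in self.known:
            self.known[url] = set()
            self.queue.append(url)
        self.known[url].add(source)

    def next_url(self):
        return self.queue.popleft() if self.queue else None


def process_page(page_url, html, library):
    """Separate the links from a fetched page and feed them to the library."""
    parser = LinkExtractor()
    parser.feed(html)
    for href in parser.links:
        # Resolve relative links against the page they were found on.
        library.add(urljoin(page_url, href), source=page_url)


library = UrlIndexLibrary()
process_page(
    "http://old-blog.example/post.html",
    '<p>Hello</p><a href="http://domain-a.example/">my site</a>',
    library,
)
# The source page could now be deleted; the URL it contributed stays queued.
print(library.next_url())
```

In this sketch the spider's next crawl starts from the queue, not from the page where the link was found, which is consistent with the behaviour seen in the evidence that follows.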

First, the evidence, which comes from Google Webmaster Tools:

  

  

This screenshot is from the 404 report in the diagnostics section of Google Webmaster Tools. I once set up a BBS under the original site, and of course it was deleted N years ago. Yet for these pages that no longer exist, the source address from which Google's spider reached them is, unexpectedly, also a page that no longer exists. A Google search shows no snapshot of these pages (pictured below). Does that mean the outbound links on pages that have been returning 404 for years are still valid?

  

Third: are external links time-sensitive as far as search engines are concerned?

Obviously they should be. My guess is that an external link fails for one of two reasons: either the page carrying the link is deleted, or the link itself is deleted from the page.

1. When the page is deleted, the search engine should keep crawling the external link on that page until the page has returned 404 for a set period of time. Only then is a command sent to the search engine's URL index library to delete the external link.

2. When the page is changed, the search engine should also keep crawling the external link until every snapshot containing the link has been completely removed from the search engine. Only then is a command sent to the URL index library to delete the external link. The search engine keeps n snapshots of the page from different times, depending on the situation, which is why searching for different words sometimes shows different snapshots of the same page.
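The two guessed rules above can be written down as a small decision function. This is a sketch of the author's hypothesis only, not a documented search engine behaviour. The names `LinkRecord`, `should_delete`, and `GRACE_404` are hypothetical, and the 90-day value is an arbitrary assumption, since the article only speaks of "a set period of time".

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

# Assumed value for illustration; the real period, if any, is unknown.
GRACE_404 = timedelta(days=90)


@dataclass
class LinkRecord:
    """One external link as remembered in a hypothetical URL index library."""
    source_url: str
    target_url: str
    source_404_since: Optional[date] = None  # first day the source page returned 404
    link_on_live_page: bool = True            # False once the link is edited out
    # Dates of stored snapshots of the source page that still contain the link.
    snapshots_with_link: List[date] = field(default_factory=list)


def should_delete(record: LinkRecord, today: date) -> bool:
    """Return True when the link should be removed from the URL index library."""
    # Rule 1: page deleted -> expire only after the 404 grace period has passed.
    if record.source_404_since is not None:
        return today - record.source_404_since >= GRACE_404
    # Rule 2: link removed from page -> expire once no snapshot contains it.
    if not record.link_on_live_page:
        return len(record.snapshots_with_link) == 0
    # The link is still on a live page, so it stays.
    return False


deleted_page = LinkRecord("http://old.example/a", "http://new.example/",
                          source_404_since=date(2011, 1, 1))
print(should_delete(deleted_page, date(2011, 2, 1)))  # False: still inside the grace period
print(should_delete(deleted_page, date(2011, 6, 1)))  # True: grace period has passed

edited_page = LinkRecord("http://old.example/b", "http://new.example/",
                         link_on_live_page=False,
                         snapshots_with_link=[date(2011, 3, 1)])
print(should_delete(edited_page, date(2011, 6, 1)))   # False: one snapshot still has the link
```

Under this model a link survives for some time after its page or anchor disappears. The cases described earlier, where links on long-deleted pages still appear to count, would imply that the real retention period is much longer than the value assumed here.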

In short, external links are time-sensitive, but modifying a link or deleting a page does not make the link invalid right away. Of course, search engines perform complex internal calculations, and the process will not be as simple as I have described. If you disagree, please leave a comment under this article and we can discuss it.

When reprinting, please credit the carefree blog. This article's address: http://liboseo.com/1111.html

Please respect copyright: when reproducing this article, state the source and include the link!

Related articles: Search Engine Spider Crawl Rules, Part One: The Secret of How Spiders Crawl Links
