Understanding the relationship between index volume, inclusion, and site: results

Source: Internet
Author: User

Different people can read the same book on search engine principles and come away with different understandings. Some older SEO theories, though deeply rooted, are of little practical use in the current SEO environment; the question of inclusion is one example. To understand accurately how index volume, inclusion, and site: results relate to one another, we can start from the basic principles.

In terms of how search works, a spider first crawls the URL of a web page, then downloads and analyzes the content at that URL. Pages that meet the engine's quality standards, or that serve some purpose, are indexed and stored in the index database. Of the pages in the index, some have retrieval value for users and some have value only to the search engine itself. Indexed pages with retrieval value for users are output in results, and this is what we call inclusion. Pages with value only to the search engine may not be output: they count toward the index but never appear in results. This is why the inclusion count is often much lower than the index volume.
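The crawl, index, and output steps described above can be sketched as a toy model. This is only an illustration under stated assumptions: the `Page` fields, the `QUALITY_THRESHOLD` cutoff, and the example URLs are hypothetical, and no real search engine is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    quality: float      # hypothetical quality score in [0, 1]
    user_value: bool    # hypothetical flag: does the page have retrieval value for users?


QUALITY_THRESHOLD = 0.5  # assumed cutoff, not a real search-engine parameter


def build_index(crawled):
    """Keep only crawled pages that meet the (assumed) quality standard."""
    return [p for p in crawled if p.quality >= QUALITY_THRESHOLD]


def included_pages(index):
    """Of the indexed pages, only those with user retrieval value are output."""
    return [p for p in index if p.user_value]


crawled = [
    Page("https://example.com/a", 0.9, True),
    Page("https://example.com/b", 0.7, False),  # indexed, but useful only to the engine
    Page("https://example.com/c", 0.2, True),   # fails the quality check, never indexed
    Page("https://example.com/d", 0.6, True),
]

index = build_index(crawled)
included = included_pages(index)
print(len(crawled), len(index), len(included))  # 4 3 2
```

Because inclusion is a filter applied to the index, the included count can never exceed the index count in this model, which matches the observation that inclusion is often lower than index volume.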

From the search engine's point of view, the number of pages on a website can be larger than the number of pages the site currently has. For example, suppose a site has 100 pages; to a user or the webmaster, that is 100 pages. But these pages may have been updated or changed over time, and different versions may meet different needs (which is why the same page can often be seen with different snapshots). Seen this way, a site can hold more pages in the search engine's eyes than it currently owns, especially if it is modified frequently or its URLs are not standardized. In addition, the search engine's data consists of both historical data and updated data, so the number of related results for a site is greater than the number of site: results.
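To illustrate why non-standardized URLs can inflate the page count a crawler sees, here is a minimal sketch that normalizes a handful of URL variants and compares the raw and normalized counts. The normalization rules and the `TRACKING_PARAMS` list are assumptions chosen for the example, not the rules any particular search engine applies.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed list of query parameters that do not change page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def normalize(url):
    """Return a simplified canonical form of a URL (illustrative rules only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    path = parts.path.rstrip("/") or "/"          # treat /post/1 and /post/1/ as the same
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS)
    return urlunsplit((scheme, host, path, urlencode(kept), ""))  # drop the fragment


seen_urls = [
    "http://Example.com/post/1",
    "http://example.com/post/1/",
    "http://example.com/post/1?utm_source=feed",
    "http://example.com/post/1#comments",
    "http://example.com/post/2",
]

distinct_raw = len(set(seen_urls))
distinct_pages = len({normalize(u) for u in seen_urls})
print(distinct_raw, distinct_pages)  # 5 2
```

In this sketch the site owner has two pages, but a crawler that treats every URL string as a separate page would count five.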

Based on the above, we can sort out the relationships among the four quantities:

Index volume and inclusion count: The index is the set of all pages that have retrieval value. Some of these pages are valuable to users, and the output of these pages is what we call inclusion (different people may define it differently). Other pages have value only to the search engine itself, which is why the index volume is higher than the inclusion count.

Site: results and related results: We often see site: results like those in the following figure:

The figure shows a discrepancy: the number of related results is 215, while the site: result count is only about 40, a very large difference. Several factors may cause the gap. For example, some pages may be counted more than once, and some pages are included (they have some retrieval value) but are not of high quality. Page value and retrieval value are not the same thing: retrieval value is only one basis for page value, which is made up of many factors.

We also need to remember that a spider is, after all, a machine, and that the number of pages on many sites keeps changing as new pages are created and old ones are deleted. The value we see at any given moment is therefore approximate, not 100% accurate.

In terms of containment, the relationships among the four are generally as follows:

Index volume is greater than the inclusion count, the inclusion count is greater than the number of site: results, and the number of related results is greater than the number of site: results. In practice, however, we recommend simplifying these relationships as follows:

1. Treat Baidu index volume as equal to Baidu inclusion count, because the inclusion count cannot actually be observed, and neither site: results nor related results can stand in for it.

2. The direct site: result count is of great significance and value to SEO, and it can also be used to judge the value of some pages. For inclusion, we suggest improving two ratios: the number of site: results to the Baidu index volume, and the Baidu index volume to the total number of pages on the site. Start SEO optimization and operations from these two ratios. As for agonizing over the concepts themselves, it is fine to simply ignore that.
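The two ratios recommended above can be computed with a few lines of code. The function names and the input numbers below are hypothetical: the site: result count of 40 echoes the example earlier in the article, while the index volume of 80 and the page total of 100 are made up for illustration.

```python
def ratio(numerator, denominator):
    """Divide two counts, rejecting a non-positive denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def seo_ratios(site_results, index_volume, total_pages):
    """Return the two ratios the article suggests tracking."""
    return {
        "site_to_index": ratio(site_results, index_volume),   # site: results / index volume
        "index_to_total": ratio(index_volume, total_pages),   # index volume / total pages
    }


# Hypothetical inputs, for illustration only.
ratios = seo_ratios(site_results=40, index_volume=80, total_pages=100)
print(ratios)  # {'site_to_index': 0.5, 'index_to_total': 0.8}
```

Tracking both values over time shows whether changes to the site are raising the share of pages that get indexed and the share of indexed pages that surface in site: results.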

Origin SEO Forum http://www.wocaoseo.com/original.
