Shark Shares: Search Engine Principles (I)

Source: Internet
Author: User


A search engine generally refers to a software system on the Web that collects and discovers information on the web according to some strategy and, after processing and organizing that information, provides users with an information query service. From the user's perspective, the system offers a web interface through which he submits a word or phrase in the browser, and it quickly returns a list of information that may be relevant to the input (often a long list, such as 10,000 entries). Each entry in this list represents a web page and has at least 3 elements:

Title: The title of the page's content, obtained in some way. The simplest way is to extract it from the page's title tag. Search engines today, however, no longer rely on the title tag alone, because on some pages the content does not match the title; such a mismatch can be understood as one of the current methods of SEO cheating.

URL: The "access address" of the page. Experienced web users can often use this element to judge how authoritative the page's content is. For example, content on the site of "Legendary Return" itself is usually more authoritative than content on a site whose address merely spells out the initials of that title (although the latter may well be more interesting, or carry related content).

Abstract: A summary of the page's content, obtained in some way. The easiest way is to take the first few bytes of the page's content as the summary. Most search engines now prefer to extract it from the description tag, and fall back to the method above when the page has none.
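The three elements above can be illustrated with a minimal sketch. It assumes a page is available as an HTML string, and the names `ResultEntry` and `make_entry` are hypothetical, chosen only for this illustration. It takes the title from the title tag, prefers the description tag for the abstract, and falls back to the first characters of the visible text. Real search engines use far more elaborate rules than this.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class ResultEntry:
    """One item of a result list: the three core elements."""
    title: str
    url: str
    abstract: str


class _PageScanner(HTMLParser):
    """Collects the <title> text, the meta description, and visible text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def make_entry(url, html, abstract_len=80):
    """Build a result entry; abstract_len is an arbitrary illustrative cutoff."""
    scanner = _PageScanner()
    scanner.feed(html)
    body_text = " ".join(scanner.text_parts)
    # No title tag: fall back to the URL so the entry is never blank.
    title = scanner.title.strip() or url
    # Prefer the description tag; otherwise take the start of the page text.
    abstract = scanner.description or body_text[:abstract_len]
    return ResultEntry(title=title, url=url, abstract=abstract)
```

Note that this sketch trusts the title tag completely, which is exactly the weakness described under "Title": a page whose title does not match its content would still be listed under that title.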

By browsing these elements, the user judges whether the page really contains the information he needs. If he is fairly sure, he can click the URL to get the full text of the page. For example, suppose a user submits the query "Legendary return plug" and the system returns a list of related information. Each item in the list may contain more than the above, but the three elements are the core. If the user mainly wants to download the plug-in, the first item is often the best choice. This is why many enterprises now look for SEO services to optimize their own sites, and some simply hire an SEO expert dedicated to optimizing the corporate website.

This example points to an important fact: when a search engine provides its query service, all it has to work with is the query word. People with different intentions may submit the same query while caring about different aspects of the related information, but the search engine usually knows nothing about the user's background. So a search engine should strive not to miss any relevant information, and also to put the information "most likely to be of interest" at the front of the list. These are the basic requirements of a search engine. In addition, because search engines operate on the Web, responsiveness under a large number of concurrent user queries cannot be neglected either.
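The two requirements, missing nothing relevant and putting the likeliest items first, can be sketched as follows. This is only an illustration under simple assumptions: pages are plain strings, words are split on whitespace, and the score is a raw count of query-word occurrences. The function name `rank` is hypothetical, and real engines use much richer signals than word counts.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def rank(query, pages):
    """Return (url, score) for every page matching any query word, best first.

    pages maps a URL to the page text.
    """
    terms = set(tokenize(query))
    scored = []
    for url, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in terms)
        if score > 0:  # keep every page that matches at all
            scored.append((url, score))
    # Highest score first; ties broken by URL so the order is stable.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored
```

Because the only input is the query word, two users with different intentions who type the same query get the same ordering here, which is the limitation the paragraph above describes.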

For a basic understanding of how search engines work, two points need to be clarified first. First, when the user submits a query, the search engine does not go out and "search" the Web on the spot, find the relevant pages, and assemble the list presented to the user. Instead, it has already "collected" a batch of web pages and stored them in its own system in some form, and the search is carried out only within that system. Second, when the user decides that one of the items in the result list is likely to be what he needs and clicks the URL to get the full text, he is visiting the original source of the page. So, in theory, a search engine does not guarantee that the title and summary shown in the result list are consistent with what the user sees after clicking the URL, or even that the page still exists. This is an important difference between search engines and traditional information retrieval systems, and it stems from the basic characteristics of web information. To make up for this difference, modern search engines keep the full text of the pages they collect and provide a "snapshot" or "history page" link in the result list, so that users can see content consistent with the summary information.
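Both points can be illustrated with a small sketch: pages are collected ahead of time, a query consults only the stored index, and a stored copy serves as the "snapshot". The class `TinySearchEngine` and its method names are hypothetical, the index is a simple word-to-URL mapping, and a query here matches only pages containing all of its words. None of this describes any particular search engine's implementation.

```python
from collections import defaultdict


class TinySearchEngine:
    """Collect first, search later: queries never touch the live web."""

    def __init__(self):
        self.snapshots = {}            # url -> page text as collected
        self.index = defaultdict(set)  # word -> set of urls containing it

    def collect(self, url, text):
        """Store a copy of the page and index its words."""
        old = self.snapshots.get(url)
        if old is not None:
            # Re-collecting a page: drop the words of the old copy first.
            for term in set(old.lower().split()):
                self.index[term].discard(url)
        self.snapshots[url] = text
        for term in set(text.lower().split()):
            self.index[term].add(url)

    def search(self, query):
        """Return URLs whose stored copy contains every query word."""
        terms = query.lower().split()
        if not terms:
            return []
        # .get avoids creating empty index entries for unknown words.
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(hits)

    def snapshot(self, url):
        """Return the stored copy, which may differ from the live page."""
        return self.snapshots.get(url)
```

If the original page changes or disappears after `collect` has run, `search` still returns its URL and `snapshot` still returns the old text, which is why the result list and the live page can disagree.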

This is the first lecture, so the content is fairly general; later installments will go into each topic in detail. When reprinting, please credit www.csqyzwg.com.
