Sho Yuqiang: How search engines work, illustrated


If you do SEO without understanding how a search engine works, it is hard to do the job properly. A few days ago I covered how search engines work in my SEO course, and many students said they did not quite follow. Later I drew flow diagrams of the search engine's main work for everyone, and many students said they understood.

Let us first look at the main tasks of a search engine: page collection, page analysis, page sorting, and keyword query. The workflow is: page collection -> page analysis -> page sorting -> keyword query.

First, search engine working principle -- page collection


Search engine working principle schematic diagram -- page collection process

The ultimate purpose of page collection is to add the pages of a website to the URL list and accumulate URL resources.

Step one: The search engine's crawling program (commonly known as a spider) discovers the site and visits it. In other words, the site must first exist and must be discoverable by spiders. For example, if the Jinan SEO Sho Yuqiang blog wants to be indexed by search engines, it must first exist and have content.

Step two: The spider starts by crawling the entry page and stores the original entry page, including the crawl time, the URL, and the page's last modification time. The original page is stored so that on its next visit the spider can compare it with the live page and see whether there has been an update; spiders favor sites that update frequently.

Step three: Extract URLs. The extracted URLs are of two kinds: the domain URL and internal URLs. The domain URL is the home page address of the website, such as www.***.com; an internal URL is the address of an individual page inside the website, such as http://www.***.com/151.html. The URLs the spider extracts are continually added to the URL list.
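The three steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any real search engine is built: `collect`, `extract_urls`, and the in-memory `SITE` dictionary are hypothetical names, and `fetch` stands in for a real HTTP download so that the sketch runs without a network.

```python
from collections import deque
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_urls(page_url, html):
    """Step three: return (domain_url, internal_urls) for one page."""
    parser = LinkExtractor()
    parser.feed(html)
    site = urlparse(page_url)
    domain_url = f"{site.scheme}://{site.netloc}/"
    internal = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)
        # Keep only links that stay on the same host.
        if urlparse(absolute).netloc == site.netloc and absolute not in internal:
            internal.append(absolute)
    return domain_url, internal


def collect(entry_url, fetch):
    """Crawl from entry_url. fetch(url) returns (html, last_modified) or None."""
    url_list = [entry_url]   # the growing URL list
    store = {}               # stored original pages, keyed by URL
    queue = deque([entry_url])
    while queue:
        url = queue.popleft()
        result = fetch(url)
        if result is None:   # page does not exist or cannot be reached
            continue
        html, last_modified = result
        # Step two: keep the original page with crawl time, URL, last modified.
        store[url] = {
            "crawl_time": datetime.now(timezone.utc),
            "url": url,
            "last_modified": last_modified,
            "html": html,
        }
        _, internal = extract_urls(url, html)
        for link in internal:
            if link not in url_list:
                url_list.append(link)
                queue.append(link)
    return url_list, store


# Hypothetical two-page site used instead of real HTTP requests.
SITE = {
    "http://www.example.com/": (
        '<a href="/151.html">post</a> <a href="http://other.example.org/">x</a>',
        "2024-01-01",
    ),
    "http://www.example.com/151.html": ('<a href="/">home</a>', "2024-01-02"),
}

url_list, store = collect("http://www.example.com/", SITE.get)
print(url_list)
```

On a later visit, a spider could compare the stored `last_modified` value with the live page to decide whether the page has been updated.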

Second, search engine working principle -- page analysis

During page collection, the search engine crawled the URLs on the site. Next, the search engine analyzes the content of the pages it has crawled.


Search engine working principle schematic diagram -- page analysis process

In this diagram, "web page" appears twice. The first "web page" refers to the URL resources that the search engine has just collected. With that, the search engine's analysis of the page formally begins.

Step one: Extract the body information. The body information extracted here includes not only the content of the page but also the page's head tag information (title, keywords, description).

Step two: After extracting the information, the search engine uses mechanical (dictionary-based) segmentation and statistical segmentation to cut the body text into keywords, and these keywords make up a keyword list. When we look for content in a search engine we usually type in keywords; here the search engine's job is to split the content into words according to certain rules, so that people can search for them later.
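To make mechanical segmentation concrete, here is a minimal sketch of forward maximum matching, one common dictionary-based method. I am assuming this variant for illustration; the text above does not say which mechanical method a given search engine uses, and statistical segmentation is not shown. The dictionary `DICTIONARY` and the unspaced sample strings are made up.

```python
def forward_max_match(text, dictionary):
    """Cut text into words by always taking the longest dictionary match.

    Characters that match no dictionary word are emitted one at a time.
    """
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary; a real one would hold a very large vocabulary.
DICTIONARY = {"search", "engine", "searchengine", "blog", "optimization"}

print(forward_max_match("searchengineblog", DICTIONARY))
print(forward_max_match("seoblog", DICTIONARY))
```

In the first call the longer entry "searchengine" wins over "search", which is the point of taking the maximum match. In the second call "seo" is not in the dictionary, so it falls apart into single characters; this is one reason real systems combine dictionary matching with statistical methods.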

Step three: In the previous step the search engine cut the body text into keywords. These keywords differ in where and how often they appear, so in step three the search engine records the keywords one by one, classifies them, and builds an index. For example, we suggest that a keyword frequency of 2% to 8% is the most reasonable; when classifying keywords, the search engine will then treat the keywords whose frequency falls within 2% to 8% as the main keywords of the page, and will favor them in the later page sorting.
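The recording and classifying in step three might look roughly like the sketch below. The function name `index_keywords` and the record layout are my own assumptions; the 2% to 8% band is the rule of thumb suggested above, not a published search engine threshold.

```python
from collections import defaultdict


def index_keywords(tokens, low=0.02, high=0.08):
    """Record each keyword's positions and frequency, and flag main keywords."""
    positions = defaultdict(list)
    for pos, token in enumerate(tokens):
        positions[token].append(pos)

    total = len(tokens)
    records = {}
    for keyword, pos_list in positions.items():
        frequency = len(pos_list) / total
        records[keyword] = {
            "positions": pos_list,
            "frequency": frequency,
            # Assumed rule of thumb: 2% to 8% marks a main keyword.
            "main": low <= frequency <= high,
        }
    return records


# Hypothetical page of 100 words: "seo" 3 times, "the" 10 times, 87 fillers.
tokens = ["seo"] * 3 + ["the"] * 10 + [f"w{i}" for i in range(87)]
records = index_keywords(tokens)
print(records["seo"])
print(records["the"]["main"])
```

Here "seo" at 3% is flagged as a main keyword, while "the" at 10% is too frequent and each filler word at 1% is too rare.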

Step four: After indexing the page's keywords, the search engine regroups them and rebuilds the page as a new page made up of keywords. In this page every keyword is unique, with no repeats. For example, if a keyword appeared three times in step three, in step four it is recorded only once; after the reorganization, no keyword on the page is repeated.
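As a tiny illustration of this reorganization, the sketch below keeps each keyword once, in the order of first appearance. The name `reorganize` and the sample tokens are hypothetical.

```python
def reorganize(tokens):
    """Rebuild the page as a list of unique keywords, first appearance first."""
    # dict keys are unique and keep insertion order, so repeats drop out.
    return list(dict.fromkeys(tokens))


page_tokens = ["seo", "blog", "seo", "jinan", "seo", "blog"]
print(reorganize(page_tokens))
```

The keyword "seo" appears three times in the input but only once in the reorganized page.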

At this point the search engine's page analysis is complete. In this stage the search engine has extracted the body information, segmented it into keywords, indexed the keywords, and reorganized the web page from the search engine's point of view.

Third, search engine working principle -- page sorting

In the previous stage the search engine finished analyzing the page and reorganized it into a set of unique keywords. The next stage is page sorting, which actually needs the user's cooperation: the search engine starts sorting pages when a user enters a keyword to query. We know that entering almost any keyword turns up a great many pages in the search engine. How is the order of these pages produced? What factors affect page sorting?

In fact, many factors determine page sorting, such as keywords, page relevance, link weight, and user behavior.

1. Keywords

A. Keyword match. Note that in a full-text search engine, the results listed normally contain the keywords we entered. When we query a keyword, the search engine first checks whether the page contains that keyword; this is the basic condition.

B. Keyword frequency. Next, the search engine compares how often the keyword appears on each page. Too high or too low are both bad; the most suitable frequency is generally considered to be around 2% to 8%.

C. Keyword distribution. The location where the keyword appears on the page can also affect page sorting. It is generally considered that page weight by position, in descending order, is: upper left > upper right > left > right > lower left > lower right.

D. Keyword weight tags. Weight can be understood as importance. Weight tags such as <b>, <i>, <em>, and <h1> through <h6> set the enclosed text apart from other text, and the search engine gives it a corresponding boost in weight.
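The four keyword factors can be combined into a toy score, sketched below. Everything numeric here is an assumption made up for illustration: the boost values in `TAG_BOOST`, the penalty of 0.5 outside the 2% to 8% band, and the use of word order as a crude stand-in for position on the page. Real search engines do not publish their formulas.

```python
# Hypothetical boosts for weight tags; plain text gets 1.0.
TAG_BOOST = {"h1": 3.0, "h2": 2.0, "b": 1.5, "em": 1.5, "i": 1.2}


def keyword_score(keyword, tokens, low=0.02, high=0.08):
    """Score one page for one keyword.

    tokens is a list of (word, tag) pairs in page order; tag is None
    for plain text.
    """
    hits = [(pos, tag) for pos, (word, tag) in enumerate(tokens) if word == keyword]
    if not hits:
        return 0.0  # A: no match, the basic condition fails

    # B: frequency inside the assumed 2% to 8% band scores full marks.
    frequency = len(hits) / len(tokens)
    frequency_factor = 1.0 if low <= frequency <= high else 0.5

    # C: the earlier the first occurrence, the higher the factor (1.0 to 2.0).
    first_pos = hits[0][0]
    position_factor = 1.0 + (1.0 - first_pos / len(tokens))

    # D: take the strongest weight tag among the occurrences.
    tag_factor = max(TAG_BOOST.get(tag, 1.0) for _, tag in hits)

    return frequency_factor * position_factor * tag_factor


filler = [(f"w{i}", None) for i in range(48)]
page_a = [("seo", "h1")] + filler + [("seo", None)]   # keyword in an early <h1>
page_b = filler + [("seo", None), ("seo", None)]      # keyword only at the end

print(keyword_score("seo", page_a) > keyword_score("seo", page_b))
print(keyword_score("meat", page_a))
```

Both pages have the same keyword frequency, so the difference in score comes only from position and the weight tag.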

2. Link weight

Internal links. These are the links between pages inside the site; the home page generally carries the most weight. All else being equal, when the home page of one site is compared with an internal page of another, the home page will generally rank ahead of the internal page.

External links. These are the links between the website and other sites, popularly called "outside chains". The number, quality, and relevance of external links all affect page sorting. On relevance, Google is stricter than Baidu. For example, if your site is about IT but you exchange links with a lot of machinery and chemical industry sites, the search engine will dislike it and may even decide that you are adding external links maliciously.

Default weight assignment. The search engine uses the date a page was crawled as a reference factor. The more links a page gains per unit of time, and the higher their quality, the higher the quality of the page is judged to be.

3. User behavior

Users' clicks on search results are one of the factors for measuring page relevance, and an important supplement for refining the sort results and improving their quality.

Fourth, search engine working principle -- keyword query


Search engine working principle schematic diagram -- keyword query process

Step one: The user enters keywords to query.

Step two: The search engine receives the user's keywords and segments them again. Some students asked why. The reason is that the keywords the user enters may not match the entries in the search engine's dictionary; in that case the search engine segments the user's input again, especially for long-tail keywords. For example, a user searches for "Sho Yuqiang's blog". This phrase is not in the search engine's dictionary, so it is cut into three words, "Sho Yuqiang", "'s", and "blog", which are then matched against the web page resources.

Step three: The search engine takes the segmented keywords into the web page resources to match them and look up the corresponding content, that is, the "keyword inverted index table". If the page resources contain the keywords, the pages are analyzed and sorted according to page weight. If there are no corresponding keywords, an "empty list" is returned to the user, such as "Sorry, we did not find the content you were looking for."
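The lookup in step three can be sketched with a small inverted index. The sketch assumes the query has already been segmented into terms and that each page already has a single weight from the sorting stage; `build_inverted_index`, `query`, the sample pages, and the weights are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """pages maps URL -> list of keywords. Returns keyword -> set of URLs."""
    index = defaultdict(set)
    for url, keywords in pages.items():
        for keyword in keywords:
            index[keyword].add(url)
    return index


def query(terms, index, weights):
    """Return the URLs that contain every term, highest page weight first."""
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    matches = set.intersection(*postings)
    return sorted(matches, key=lambda url: weights.get(url, 0.0), reverse=True)


# Hypothetical pages (already reduced to unique keywords) and page weights.
pages = {
    "http://www.example.com/": ["seo", "blog", "jinan"],
    "http://www.example.com/151.html": ["seo", "blog"],
    "http://www.example.com/about.html": ["contact"],
}
weights = {
    "http://www.example.com/": 0.9,
    "http://www.example.com/151.html": 0.4,
    "http://www.example.com/about.html": 0.2,
}
index = build_inverted_index(pages)

for terms in (["seo", "blog"], ["meat"]):
    results = query(terms, index, weights)
    # An empty list becomes the "not found" message shown to the user.
    print(results if results else "Sorry, we did not find the content you were looking for.")
```

The index is called inverted because it maps each keyword to the pages that contain it, instead of each page to its keywords, so a query only touches the entries for its own terms.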

We can compare the whole working process of a search engine to cooking. Suppose we are going to make scrambled eggs with tomatoes.

Step one: we need to have the tomatoes, the eggs, and the other ingredients. This corresponds to page collection.

Step two: with the tomatoes, eggs, and other ingredients in hand, we work out the cooking order. Does the oil go in first, or the eggs and then the oil? Or the tomatoes and eggs and then the oil? This analysis corresponds to the second stage of search engine work: page analysis.

Step three: page sorting. Having worked out how to make the dish, we now do it: first wash and heat the pan, then add the oil, and so on. This is putting things in a reasonable order, what to do first and what to do after.

Step four: the dish is done and on the table. You may choose to eat the eggs first, or the tomatoes first; this corresponds to the keyword query. If you go looking for a piece of meat in the tomatoes and eggs, sorry, there is none; this is the empty list for a keyword.

The examples are for reference only; take them as appropriate. It is enough to understand how a search engine works.

My QQ: 2284939775. You are welcome to get in touch.

This article was first published on Sho Yuqiang's blog, which focuses on Jinan SEO research. Please credit the source when reprinting.
