Xiao Yuqiang: An Illustrated Guide to How Search Engines Work


If you do not understand how search engines work, it is difficult to do SEO properly. When I explained how search engines work in my SEO course a few days ago, many students said they didn't quite understand it. Later, I drew the main workflow of the search engine for everyone, and many people said they then understood it.

Let's first look at the main tasks of a search engine: page indexing, page analysis, page sorting, and keyword query. The workflow runs in that order: page indexing, then page analysis, then page sorting, then keyword query.

I. How search engines work - page indexing

[Figure: How search engines work - page indexing process]

The purpose of page indexing is to add the pages of a website to the search engine's URL list and so accumulate URL resources.

Step 1: The search engine's crawler (commonly known as a spider) discovers the website and visits it. In other words, the website must first exist and be discoverable by the spider. For example, if the blog of Jinan SEO Xiao Yuqiang is to be indexed by search engines, it must first exist and have content.

Step 2: The spider crawls the entry page and stores the original copy of it, along with the capture time, the URL, and the last modification time. The original page is stored so that, on the next visit, the spider can check whether the page has been updated. Spiders prefer websites that are updated frequently.
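The record stored in Step 2 could be sketched roughly as follows. This is only an illustration, not how any particular search engine stores pages: the `PageSnapshot` class, its field names, and the idea of comparing SHA-256 fingerprints to detect an update are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def fingerprint(html):
    """Return a stable hash of the page content (hypothetical helper)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


@dataclass
class PageSnapshot:
    """Hypothetical record of one crawled page."""
    url: str
    html: str               # the original page as captured
    captured_at: datetime   # when the spider fetched the page
    last_modified: str      # last modification time reported by the site, "" if unknown


def is_updated(previous, current_html):
    """True if the page content differs from the stored original."""
    return fingerprint(current_html) != fingerprint(previous.html)


snapshot = PageSnapshot(
    url="http://www.example.com/",
    html="<p>first version</p>",
    captured_at=datetime.now(timezone.utc),
    last_modified="",
)
print(is_updated(snapshot, "<p>first version</p>"))
print(is_updated(snapshot, "<p>second version</p>"))
```

A real crawler would more likely rely on the site's reported modification time or on a fuzzier comparison, since a byte-for-byte hash changes whenever anything on the page changes.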

Step 3: Extract the URLs. The extracted URLs include the domain URL and the internal URLs. The domain URL is the home address of the website, such as www.***.com; an internal URL is the address of a page inside the website, such as http://www.***.com/151.html. The URL resources extracted by the spider are continuously added to the URL list.
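As a rough sketch of Step 3, the snippet below pulls links out of a page and appends them to a URL list. It uses only Python's standard library; the names `LinkExtractor` and `extract_urls` and the `example.com` addresses are made up for this example, and skipping links to other sites is a simplification of this sketch, not something the article prescribes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_urls(page_url, html, url_list):
    """Append the site's domain URL and its internal URLs to url_list.

    Returns the URLs that were newly added on this call.
    """
    parser = LinkExtractor()
    parser.feed(html)
    site = urlparse(page_url)
    domain_url = f"{site.scheme}://{site.netloc}/"
    candidates = [domain_url] + [urljoin(page_url, h) for h in parser.hrefs]
    added = []
    for candidate in candidates:
        if urlparse(candidate).netloc != site.netloc:
            continue  # link to another site: ignored in this sketch
        if candidate not in url_list:
            url_list.append(candidate)
            added.append(candidate)
    return added


html = (
    '<a href="/151.html">post</a> '
    '<a href="about.html">about</a> '
    '<a href="http://other.example/">elsewhere</a>'
)
url_list = []
extract_urls("http://www.example.com/index.html", html, url_list)
print(url_list)
```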

II. How search engines work - page analysis

During page indexing, the search engine captured the URLs on the website. Next, it analyzes the content of the captured pages.

[Figure: How search engines work - page analysis process]

In this process, the word "webpage" appears twice. The first "webpage" refers to the original pages behind the URL resources that the search engine has just indexed; the second is the page rebuilt from keywords in Step 4. With that settled, the search engine's analysis of the page officially begins.

Step 1: Extract the body information. The extracted information includes not only the page content, but also the page's header tag information (title, keywords, description).
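A minimal sketch of this extraction step, using Python's built-in `html.parser`, might look like the following. The class name `PageInfoExtractor`, the function `extract_page_info`, and the sample page are assumptions for illustration; real engines handle far messier markup.

```python
from html.parser import HTMLParser


class PageInfoExtractor(HTMLParser):
    """Collects the title, keywords/description meta tags, and body text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.body_parts = []
        self._in_title = False
        self._in_body = False
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta":
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body and not self._skip and data.strip():
            self.body_parts.append(data.strip())


def extract_page_info(html):
    """Return the header tag information and the visible body text."""
    parser = PageInfoExtractor()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "keywords": parser.meta.get("keywords", ""),
        "description": parser.meta.get("description", ""),
        "body": " ".join(parser.body_parts),
    }


sample = """<html><head><title>SEO blog</title>
<meta name="keywords" content="seo, search engine">
<meta name="description" content="Notes on SEO">
</head><body><h1>How search engines work</h1>
<script>var x = 1;</script><p>Spiders crawl pages.</p></body></html>"""
info = extract_page_info(sample)
print(info)
```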

Step 2: After the information is extracted, the search engine splits the text into keywords using mechanical word segmentation and statistical word segmentation. These keywords make up a keyword list. When we search in a search engine, we usually enter keywords; here the search engine splits the content into words according to certain rules so that the content can be found by those words later.
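One common form of mechanical (dictionary-based) segmentation is forward maximum matching: at each position, take the longest dictionary word that fits. The sketch below shows that idea only; the article does not say which algorithm any given engine uses, and the tiny dictionary and the function name `forward_max_match` are invented for this example.

```python
def forward_max_match(text, dictionary):
    """Split text greedily into the longest dictionary words, left to right.

    Characters that start no dictionary word are emitted one at a time.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical dictionary; a real one holds hundreds of thousands of entries.
dictionary = {"搜索", "搜索引擎", "工作", "原理"}
print(forward_max_match("搜索引擎工作原理", dictionary))
```

Statistical segmentation, the other method mentioned above, instead decides word boundaries from how often characters occur together in a large corpus, and is not shown here.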

Step 3: In the previous step, the search engine split the body content into keywords. These keywords differ in position and frequency, so in this step the search engine records, classifies, and indexes them one by one. For example, a keyword frequency of 2%-8% is commonly recommended as the most reasonable; on that view, when classifying keywords the search engine treats a keyword whose frequency falls within 2%-8% as a main keyword of the page, which prepares the ground for the page sorting that follows.
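The bookkeeping described in Step 3 can be sketched as below: record where each keyword occurs, count it, and compute its share of all words on the page. The function names and the sample word list are assumptions for illustration, and the 2%-8% band is simply the rule of thumb quoted above, not a documented search engine threshold.

```python
from collections import defaultdict


def keyword_stats(words):
    """Map each keyword to its positions, count, and frequency on the page."""
    positions = defaultdict(list)
    for index, word in enumerate(words):
        positions[word].append(index)
    total = len(words)
    return {
        word: {"positions": found, "count": len(found), "frequency": len(found) / total}
        for word, found in positions.items()
    }


def main_keywords(stats, low=0.02, high=0.08):
    """Keywords whose frequency falls inside the assumed 2%-8% band."""
    return sorted(w for w, s in stats.items() if low <= s["frequency"] <= high)


# A made-up page of 100 words: "seo" 5 times, "the" 20 times, 75 one-off words.
words = ["seo"] * 5 + ["the"] * 20 + [f"w{i}" for i in range(75)]
stats = keyword_stats(words)
print(stats["seo"]["count"], stats["seo"]["frequency"])
print(main_keywords(stats))
```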

Step 4: After the search engine has indexed the page's keywords, it recombines them and rebuilds the page as a new "webpage" made of keywords. Every keyword on this rebuilt page is unique. For example, if a keyword appeared three times in Step 3, it is recorded only once in Step 4, so the restructured page contains no repeated keywords.
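A minimal sketch of this restructuring is shown below: duplicates are dropped so each keyword appears once per page, and the deduplicated pages are then turned into a keyword-to-pages table. The table is an assumption about how the result is later looked up, and the function names and sample pages are invented for this example.

```python
def restructure_page(words):
    """Return the page's keywords with duplicates removed, in first-seen order."""
    return list(dict.fromkeys(words))


def build_inverted_index(pages):
    """Map each keyword to the list of pages that contain it."""
    index = {}
    for url, words in pages.items():
        for keyword in restructure_page(words):
            index.setdefault(keyword, []).append(url)
    return index


# Hypothetical pages, already segmented into keywords.
pages = {
    "a.html": ["seo", "blog", "seo"],
    "b.html": ["blog", "jinan"],
}
print(restructure_page(pages["a.html"]))
index = build_inverted_index(pages)
print(index)
```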

At this point the search engine has finished analyzing the page. In this stage it has extracted the page's body information, segmented it into keywords, indexed the keywords, and restructured the page from the search engine's point of view.

III. How search engines work - page sorting

In the previous stage, the search engine analyzed the page and rebuilt it as a set of unique keywords. Next comes page sorting. Page sorting is actually completed together with the user: when a user enters a keyword in the search engine, the search engine starts sorting pages. We know that entering almost any keyword returns many webpages. How is the order of these webpages decided? Which factors affect page sorting?

In fact, many factors determine page sorting, such as keywords, page relevance, link weight, and user behavior.

1. Let's look at the keywords first.

A. Keyword matching. In full-text search engines, the results in the list usually contain the keyword we entered. When we enter a keyword, the search engine first checks whether that keyword exists on the webpage. This is the basic condition.

B. Keyword frequency. Next, the search engine compares how often the keyword appears on each page; a frequency that is either too high or too low counts against the page. The most appropriate frequency is generally considered to be about 2%-8%.

C. Keyword distribution. The position where the keyword appears on the page also affects page sorting. Generally, positions rank in descending order of weight as top left > top right > bottom left > bottom right.

D. Keyword weight tags. Weight can be understood as importance. Weight tags include <b>, <i>, and <em>, among others.
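To make factors A through D concrete, here is a toy scoring function. Every number in it is an arbitrary assumption chosen for illustration; real search engines do not publish their formulas. It also uses the keyword's first position in the word list as a stand-in for its position on the page layout, and takes the set of emphasized keywords as an input rather than parsing tags.

```python
def keyword_score(keyword, words, emphasized=frozenset(), low=0.02, high=0.08):
    """Toy relevance score of one page for one keyword (all weights hypothetical)."""
    if keyword not in words:
        return 0.0  # A. the keyword must appear on the page at all
    score = 1.0
    frequency = words.count(keyword) / len(words)
    if low <= frequency <= high:
        score += 1.0  # B. frequency inside the assumed 2%-8% band
    first = words.index(keyword)
    score += 1.0 - first / len(words)  # C. an earlier first occurrence scores higher
    if keyword in emphasized:
        score += 0.5  # D. keyword sits inside a weight tag such as <b> or <em>
    return score


# Two made-up pages of 25 words each; "seo" appears once on both (4%).
early = ["seo"] + [f"w{i}" for i in range(24)]
late = [f"w{i}" for i in range(24)] + ["seo"]
print(keyword_score("seo", early, emphasized={"seo"}))
print(keyword_score("seo", late))
print(keyword_score("meat", early))
```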

2. Link weight

Internal links. These are links between pages within a website. Generally, the homepage has the highest weight. All else being equal, when one site's homepage competes with another site's internal page, the homepage is usually ranked ahead of the internal page.

External links. These are links between a website and pages outside the site. The quantity, quality, and relevance of external links all affect page sorting. On relevance, Google is stricter than Baidu. For example, if your website is about IT and you have linked to many mechanical and chemical websites, search engines will dislike this and may even conclude that you have added external links maliciously.

Default weight distribution. The search engine uses the date on which the page was crawled as a reference factor. The more links a page receives per unit of time, and the higher the quality of those links, the higher the quality of the page.
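As a small illustration of the internal/external distinction, the sketch below counts how many links each page receives from its own site and from other sites. It only counts links; judging their quality and relevance, which the text says also matters, is outside this example. The function name and the `example.com` addresses are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def count_inbound_links(links):
    """Count inbound links per target page, split into internal and external.

    links is an iterable of (source_url, target_url) pairs.
    """
    internal, external = Counter(), Counter()
    for source, target in links:
        same_site = urlparse(source).netloc == urlparse(target).netloc
        (internal if same_site else external)[target] += 1
    return internal, external


post = "http://www.example.com/151.html"
links = [
    ("http://www.example.com/", post),            # internal link from the homepage
    ("http://www.example.com/about.html", post),  # internal link from another page
    ("http://other.example/review.html", post),   # external link from another site
]
internal, external = count_inbound_links(links)
print(internal[post], external[post])
```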

3. User behavior

How users click on search results is one of the factors used to measure page relevance, and an important supplement for improving the quality of the sorted results.

IV. How search engines work - keyword query

[Figure: How search engines work - keyword query]

Step 1: Enter keywords to search.

Step 2: The search engine receives the user's keywords and splits them again. Some people ask why splitting is needed. The reason is that the keyword entered by the user may not match any keyword in the search engine's dictionary, in which case the search engine splits the user's input, especially for long-tail keywords. Take the query "Xiao Yuqiang's blog": this phrase does not exist in the search engine's dictionary, so it is split into three words, including "Xiao Yuqiang" and "blog", which are then matched against the web resources.

Step 3: After splitting the keywords, the search engine looks them up in its web resources, that is, in the "inverted keyword index table", to find matching content. If a keyword exists in the web resources, the search engine analyzes the matching pages and sorts them by page weight. If no keyword matches, an "empty list" is returned, for example: "Sorry, the content you want to query is not found".
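The three query steps can be sketched in a few lines. This is a simplified illustration under several assumptions: the query is split on whitespace instead of being run through a real word segmenter, a page must contain every query word to match, and the dictionary, index, and page weights are made-up sample data.

```python
def search(query, dictionary, inverted_index, page_weight):
    """Return matching pages sorted by weight, or an empty list if none match."""
    # Step 2: keep the query whole if the dictionary knows it, otherwise split it.
    terms = [query] if query in dictionary else query.split()
    # Step 3: look each term up in the inverted index and keep pages containing all terms.
    matches = None
    for term in terms:
        pages = set(inverted_index.get(term, ()))
        matches = pages if matches is None else matches & pages
    if not matches:
        return []  # the "empty list" case
    return sorted(matches, key=lambda url: page_weight.get(url, 0.0), reverse=True)


# Hypothetical sample data.
dictionary = {"seo", "blog"}
inverted_index = {"seo": ["a.html", "b.html"], "blog": ["b.html", "c.html"]}
page_weight = {"a.html": 0.9, "b.html": 0.5, "c.html": 0.7}

print(search("seo", dictionary, inverted_index, page_weight))
print(search("seo blog", dictionary, inverted_index, page_weight))
print(search("meat", dictionary, inverted_index, page_weight))
```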

To picture the whole search engine process, take cooking as an example. Suppose you want to make scrambled eggs with tomatoes.

Step 1: Gather the tomatoes, eggs, and seasonings. This corresponds to page indexing.

Step 2: With the tomatoes, eggs, and seasonings in hand, work out the cooking sequence. Does the oil go in first, or the eggs first and then the oil? Or do the tomatoes, eggs, and oil all go in together? This analysis corresponds to the second stage of the search engine's work, page analysis.

Step 3: Page sorting. Having analyzed how the dish is made, you start cooking: first clean and heat the pan, then add the oil, and so on. This is a reasonable order: what to do first and what to do next.

Step 4: When the food is ready, you may choose to eat the eggs first or the tomatoes first. Haha, this corresponds to keyword query. If you go looking for meat in scrambled eggs with tomatoes, sorry, there is none. That is the empty keyword list.

The example is only an aid; the point is to understand how search engines work.

My QQ: 2284939775. Feel free to get in touch.

This article was first published on Xiao Yuqiang's blog (http://www.xiaoyuqiang.com/151.html), which focuses on Jinan SEO research. Please credit the source when reprinting.
