The principles of search engines and web text segmentation

Source: Internet
Author: User

For SEO staff, the main target of the work is the search engine, so a deep understanding of how search engines operate helps us optimize for them. It is like two countries at war: you must first know the other side's actual situation, then analyze your own advantages, and only then can you deploy your forces to defeat them. If you know little or nothing about the other side, failure is certain. When analyzing search engines, it is therefore very important to understand their operating mechanism and their word segmentation technology.
The first step in search engine work: extracting page text
The first step is to crawl the text of the page. In general, a search engine extracts the relevant words from the keywords, including the meta tags (keywords and description) and the alt attributes of images. The alt attribute is the alternative text for an image; some older browsers displayed it when the user hovered the mouse over the picture. The engine also extracts the page's body text. This is why many Flash sites are at a real disadvantage in search engine optimization: they contain very little text, and search engines do not crawl the Flash source code. To optimize a Flash site, a parallel set of page source is usually written so that the relevant text corresponds to the Flash content and can be recognized by search engines.
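As a rough illustration of this extraction step, the sketch below pulls the title, the meta keywords and description, the image alt text and the visible body text out of one page, using only Python's standard html.parser module. The class name PageTextExtractor, the function extract_page_text and the sample page are assumptions made for this example; this is not how any particular search engine is implemented.

```python
from html.parser import HTMLParser


class PageTextExtractor(HTMLParser):
    """Collects the text fields a crawler might index from one HTML page."""

    SKIPPED_TAGS = {"script", "style"}  # content that is not visible text

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}        # e.g. {"keywords": "...", "description": "..."}
        self.alt_texts = []   # alt attribute of each <img>
        self.body_text = []   # visible text fragments, in document order
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attrs.get("content") or ""
        elif tag == "img":
            if attrs.get("alt"):
                self.alt_texts.append(attrs["alt"])
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.body_text.append(text)


def extract_page_text(html):
    """Parse an HTML string and return the filled-in extractor."""
    parser = PageTextExtractor()
    parser.feed(html)
    parser.close()
    return parser


if __name__ == "__main__":
    # Hypothetical sample page, used only to demonstrate the extractor.
    sample = (
        '<html><head><title>Flower shop</title>'
        '<meta name="keywords" content="flowers, roses"></head>'
        '<body><img src="rose.jpg" alt="Red rose bouquet">'
        '<p>Order roses online.</p></body></html>'
    )
    page = extract_page_text(sample)
    print(page.title, page.meta, page.alt_texts, page.body_text)
```

A page built entirely in Flash would give this extractor almost nothing to collect, which is the disadvantage described above.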
The second step of search engine work: Chinese word segmentation technology
After the search engine has crawled the text, the next job is to split it into words, that is, to break a sentence down into phrases. For example, the phrase "Great Sage Monkey King" is split into two words, "Great Sage" and "Monkey King". Another example is the query "Liu Rushi Cold Moon", which shows how differently Baidu and Google segment words.
The two sets of search results differ. Google is more inclined to treat Liu Rushi as a noun (a person's name), so pages that match Liu Rushi are listed first. Baidu, by contrast, splits the query into smaller pieces and breaks the name apart, so pages about Liu Rushi do not appear on the first page. Why is there such an obvious difference? The key is that Google does not have a dedicated dictionary, so its matching method differs somewhat. We therefore need to optimize keywords for each search engine separately: keep the content closely tied to the keywords, and do not let the keywords and the content drift apart, otherwise it is very hard for the keyword ranking to rise.
The third step of search engine work: matching technology
One: Forward matching. The Liu Rushi example above is a case of forward matching, which scans the text from front to back. This method helps eliminate ambiguity and makes the search results more accurate, so that the name Liu Rushi is kept whole rather than broken apart.
Two: Reverse matching, which scans the text from back to front.
Three: Maximum matching. Take the sentence "the United States of America is free": maximum matching yields "United States of America" and "free".
Four: Minimum matching. For the same sentence, "the United States of America is free", minimum matching splits the text into much smaller units (rendered in translation as "the United States", "Lee", "United", "national", "free").
In practice, a search engine combines several of these matching methods during segmentation rather than using only one. Segmentation technology ultimately has just two goals, and optimizing with these two goals in mind can help improve a site's ranking. The first is to eliminate ambiguity in the text through the various matching techniques, so that the content returned for a search term is more accurate and complete. The second is to use the various matching methods to identify, by statistics, personal names, place names and institution names, as well as out-of-vocabulary words such as catchphrases and buzzwords, and then to match those statistical results against what users want to know in different ways, so that users get the content they are looking for.
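The forward, reverse, maximum and minimum matching methods described above can be sketched as dictionary-based segmenters. The example below is a minimal illustration only: the function names, the toy dictionary and the sample phrase 研究生命起源 ("study the origin of life", a common textbook example that is not taken from this article) are all assumptions, and real search engines rely on far larger dictionaries and statistical models.

```python
def forward_max_match(text, dictionary, max_len):
    """Scan front to back, taking the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len):
    """Scan back to front, taking the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


def forward_min_match(text, dictionary, max_len):
    """Scan front to back, taking the shortest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        limit = min(max_len, len(text) - i)
        for size in range(1, limit + 1):
            piece = text[i:i + size]
            if piece in dictionary:
                break
        else:
            piece = text[i]  # no dictionary word starts here: emit one character
        words.append(piece)
        i += len(piece)
    return words


# Toy dictionary (an assumption for this example).
TOY_DICTIONARY = {"研究", "研究生", "生命", "命", "起源"}

if __name__ == "__main__":
    phrase = "研究生命起源"
    print(forward_max_match(phrase, TOY_DICTIONARY, 3))   # ['研究生', '命', '起源']
    print(backward_max_match(phrase, TOY_DICTIONARY, 3))  # ['研究', '生命', '起源']
    print(forward_min_match(phrase, TOY_DICTIONARY, 3))   # ['研究', '生命', '起源']
```

Here forward maximum matching grabs 研究生 ("graduate student") first and leaves an awkward remainder, while the reverse scan recovers the intended reading. Disagreements like this are one reason an engine would combine several matching methods instead of relying on one.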
