How latent semantic indexing (LSI) works in search engines

Source: Internet
Author: User
Keywords: Website optimization, SEO


There are a few questions we keep coming back to. How does a search engine judge whether an article is original? How does it detect keyword stuffing in an article? How does it judge the relevance between an article and a keyword? Many people are also puzzled by another problem: Google's ranking algorithm, PageRank (PR), no longer seems to work. Many SEOers have told me that for some keywords, sites with a high PR rank below sites with a very low PR. So most people conclude that PR no longer matters.

Many people are puzzled by these questions: what principle or mechanism does a search engine use to make these judgments? The principle we will discuss today is called LSI (latent semantic indexing).

How latent semantic indexing works:

After the spider crawls and downloads a site's pages, latent semantic indexing (abbreviated LSI below) makes a list of all the words on each downloaded page, filters out the words that carry no meaning (stop words, filter words, and so on), and repeats this for every page on the site. These lists can then be combined into one giant matrix, with pages (documents) along the x axis and words along the y axis. If a word appears on a page, the cell for that word and page is marked 1; otherwise it is marked 0. The matrix makes it clear which words appear on which pages across the whole site.
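The matrix construction described above can be sketched in a few lines of Python. This is only an illustration: the stop-word list, the tokenizer, and the sample pages are assumptions made up for the example, not what any real search engine uses.

```python
# Hypothetical stop-word list and sample pages, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]

def build_matrix(pages):
    """Return (vocabulary, matrix). Rows are words, columns are pages;
    matrix[i][j] is 1 if vocabulary[i] occurs on pages[j], else 0."""
    page_terms = [set(tokenize(p)) for p in pages]
    vocabulary = sorted(set().union(*page_terms))
    matrix = [[1 if term in terms else 0 for terms in page_terms]
              for term in vocabulary]
    return vocabulary, matrix

pages = [
    "The price of steel pipe",
    "Steel pipe suppliers and steel pipe price",
    "A guide to search engine ranking",
]
vocabulary, matrix = build_matrix(pages)
for term, row in zip(vocabulary, matrix):
    print(f"{term:10s} {row}")
```

Each printed row shows one word and, for each page, whether the word occurs there. Note that a 0/1 cell records presence only, which is why the weighting step that follows is needed.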

Of course, this alone is not accurate enough, so LSI also introduces keyword weights. 1. The more often a keyword appears on a page, the higher its weight on that page. 2. The more often a keyword appears across the whole site, the lower its weight.
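The two weighting rules resemble the well-known TF-IDF scheme, so a TF-IDF style formula is used here as a stand-in. The article does not give an exact formula, so the function name `keyword_weights`, the logarithm, and the sample tokens are all assumptions for this sketch.

```python
import math
from collections import Counter

def keyword_weights(pages_tokens):
    """pages_tokens: one list of tokens per page.
    Returns one {term: weight} dict per page (TF-IDF style, an assumption)."""
    n_pages = len(pages_tokens)
    # Number of pages on which each term occurs at least once.
    page_freq = Counter(term for tokens in pages_tokens for term in set(tokens))
    weights = []
    for tokens in pages_tokens:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Rule 1: frequent on this page -> higher weight.
            # Rule 2: present on many pages of the site -> lower weight.
            term: (count / total) * math.log(n_pages / page_freq[term])
            for term, count in counts.items()
        })
    return weights

pages_tokens = [
    ["steel", "pipe", "steel", "price"],
    ["steel", "supplier"],
    ["steel", "ranking", "ranking"],
]
weights = keyword_weights(pages_tokens)
for page_weights in weights:
    print({term: round(w, 3) for term, w in page_weights.items()})
```

In this sample, "steel" occurs on every page, so its weight drops to zero everywhere, while "ranking", which is frequent on one page only, gets the highest weight.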

The most important thing LSI does is calculate how often the keywords related to a given keyword appear on the site's other pages. The advantage is that even if a page does not contain the keyword you searched for, it can still be returned as a relevant result. This is also why, if you target a single keyword and every one of your backlinks uses that keyword as its anchor text, the quality of those backlinks declines: that is LSI at work. Conversely, related long-tail keywords on your site reinforce one another's competitiveness because of their relevance. So if you still optimize the old way, without long-tail keywords and without attention to relevance, it will be hard for your site's keywords to rank well.
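To illustrate how a page can match a query it does not literally contain, here is a simplified sketch. Real LSI implementations typically reduce the word-page matrix with a singular value decomposition; this example swaps that step for a plain cosine similarity between word rows, and the threshold and sample data are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def related_terms(term_rows, query, threshold=0.6):
    """term_rows: {term: [0/1 per page]}. Return the terms whose pattern of
    pages is similar to the query term's pattern."""
    target = term_rows[query]
    return sorted(t for t, row in term_rows.items()
                  if t != query and cosine(target, row) >= threshold)

def search(term_rows, query):
    """Return indexes of pages containing the query term or a related term."""
    terms = [query] + related_terms(term_rows, query)
    n_pages = len(term_rows[query])
    return [j for j in range(n_pages) if any(term_rows[t][j] for t in terms)]

# Hypothetical word-page matrix: 4 words across 4 pages.
term_rows = {
    "seo":      [1, 1, 0, 0],
    "ranking":  [1, 1, 1, 0],
    "backlink": [0, 0, 1, 0],
    "recipe":   [0, 0, 0, 1],
}
print(related_terms(term_rows, "seo"))
print(search(term_rows, "seo"))
```

Page 2 never mentions "seo", but it is returned because it contains "ranking", a word that usually appears on the same pages as "seo".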

From this description we can see why search engines appear so intelligent. LSI does not understand the meaning of a word, but it works out which keywords a page contains and compares them with the keywords contained in other pages. LSI then concludes that pages sharing many of the same keywords also have similar content. This is why search engines can identify so many pseudo-original and scraped articles. Do not assume that changing the title and a few paragraphs will fool the search engine; LSI can judge this very well.

This is the intelligence that LSI produces.
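The idea that pages sharing many keywords have similar content can be sketched with a set-overlap measure. Jaccard similarity is used here as a simple stand-in for whatever comparison a search engine actually performs; the 0.8 threshold and the sample keyword sets are assumptions.

```python
def jaccard(a, b):
    """Share of keywords two pages have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def near_duplicates(pages_terms, threshold=0.8):
    """Return (i, j, score) for every pair of pages whose keyword
    overlap reaches the threshold."""
    pairs = []
    for i in range(len(pages_terms)):
        for j in range(i + 1, len(pages_terms)):
            score = jaccard(pages_terms[i], pages_terms[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs

# Page 1 is page 0 with one extra word added, as in a lightly edited copy.
pages_terms = [
    {"steel", "pipe", "price", "supplier", "quality"},
    {"steel", "pipe", "price", "supplier", "quality", "best"},
    {"search", "engine", "ranking"},
]
pairs = near_duplicates(pages_terms)
for i, j, score in pairs:
    print(f"pages {i} and {j} overlap: {score:.2f}")
```

The lightly edited copy still shares five of six keywords with the original, so the pair is flagged, while the unrelated page is not.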

Now that we understand LSI, let us answer the questions from the beginning of the article. The first few need no further answer. As for whether PR has stopped working, my answer is no. PR is still Google's core algorithm; it has not been replaced and will only continue to be refined. Then why do low-PR sites rank above high-PR sites for some keywords? You need to understand how PR is calculated: Google computes a page's PR from all of its inbound links, including external links. But Google places more value on links that are relevant to the page, and irrelevant backlinks contribute nothing when the page's final keyword ranking is calculated. Those backlinks still count toward the PR value, however, which is how a high-PR page ends up ranked below a low-PR page for a keyword.

So how is the relevance of a backlink analyzed? With LSI, as described above.

PS: Search engines are not that unapproachable. Their goal is the same as a webmaster's: to give users a better experience. We do not study search engines to cater to their tastes, but to win users' favor together with them. So never forget your site's user experience optimization (UEO). If you reproduce this article, please credit the source: http://www.gangguanhb.com. Thank you.
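For readers unfamiliar with how a PR value follows from inbound links alone, here is a minimal sketch of the published PageRank iteration. It is not Google's production algorithm: the damping factor of 0.85, the iteration count, and the three-page link graph are assumptions, and the sketch deliberately ignores link relevance, which is the part the article attributes to LSI.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: score};
    the scores sum to 1."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical link graph: "a" and "b" both link to "c"; "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(links)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
```

Every inbound link raises the score regardless of topic, which matches the article's point: PR counts all backlinks, while the keyword ranking additionally depends on whether those links are relevant.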
