[Repost] Inverted index

Source: Internet
Author: User

Original link: http://www.cnblogs.com/surgewong/p/3351863.html

An inverted index, as the name implies, is an index in reverse.

First, consider what an index is. An index is like a book's table of contents: through it you can quickly find the chapter you want.

An inverted index works the other way round: knowing a piece of a chapter's content, you can find where that content is located.

If this analogy is not clear enough, a simple example illustrates it.

Suppose we have three sentences:

T[0] = "it is what it is"

T[1] = "what is it"

T[2] = "it is a banana"

Here, the index is a mapping between a location (position) and a word.

A regular (forward) index finds the word from its location. For example, the first word of T[0] is "it", which can be recorded as (0,0): "it"; likewise (2,1): "is".

An inverted index goes the other way and finds the locations from the word. For example, the word "it" appears at (0,0) (0,3) (1,2) (2,0),

which can be recorded as "it": {(0,0) (0,3) (1,2) (2,0)}.

Building an inverted index over the three sentences above gives:

"a": {(2,2)}

"banana": {(2,3)}

"is": {(0,1) (0,4) (1,1) (2,1)}

"it": {(0,0) (0,3) (1,2) (2,0)}

"what": {(0,2) (1,0)}
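
The construction can be sketched in a few lines of Python. This is only a minimal illustration, assuming the three sentences are "it is what it is", "what is it" and "it is a banana", that they are lowercase, and that words are separated by whitespace; the names sentences and build_inverted_index are made up for this example and do not come from the original article.

```python
from collections import defaultdict

# The three example sentences, lowercase so that "It" and "it" would match.
sentences = ["it is what it is", "what is it", "it is a banana"]


def build_inverted_index(texts):
    """Map each word to a list of (sentence number, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(texts):
        # Assumption: splitting on whitespace is enough to find the words.
        for pos, word in enumerate(text.split()):
            index[word].append((doc_id, pos))
    return dict(index)


index = build_inverted_index(sentences)
for word in sorted(index):
    print(word, index[word])
```

Because the sentences are scanned in order, each word's list comes out sorted by sentence number and then by position, for example "what" maps to [(0, 2), (1, 0)].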

With the inverted index built, retrieving sentences becomes easy.

For example, to retrieve the sentences that contain all three words "what", "is" and "it", ignore the second number in each entry of the inverted lists (the position of the word within its sentence).

This gives {0 1} ∩ {0 1 2} ∩ {0 1 2} = {0 1}, so we conclude that T[0] and T[1] satisfy the query.
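
The intersection step can be sketched as follows. This is a hedged illustration rather than a definitive implementation: it assumes the three sentences are "it is what it is", "what is it" and "it is a banana", split on whitespace, and the helper names build_inverted_index and search_all are hypothetical.

```python
from collections import defaultdict

sentences = ["it is what it is", "what is it", "it is a banana"]


def build_inverted_index(texts):
    """Map each word to a list of (sentence number, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(texts):
        for pos, word in enumerate(text.split()):
            index[word].append((doc_id, pos))
    return dict(index)


def search_all(index, words):
    """Return the sentence numbers that contain every word in `words`."""
    result = None
    for word in words:
        # Drop the position and keep only the sentence number.
        doc_ids = {doc_id for doc_id, _ in index.get(word, [])}
        result = doc_ids if result is None else result & doc_ids
    return sorted(result or [])


index = build_inverted_index(sentences)
print(search_all(index, ["what", "is", "it"]))  # prints [0, 1]
```

A word that is missing from the index contributes an empty set, so the whole query then returns an empty list.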

To retrieve the phrase "what is it", the exact positions of the words must also be taken into account: the three words have to appear one after another in the same sentence.

Only T[1] satisfies this, at positions {(1,0) (1,1) (1,2)}.
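
One way to use the positions for phrase retrieval is sketched below. It is only an illustration under the same assumptions (the sentences "it is what it is", "what is it" and "it is a banana", whitespace tokenization); search_phrase is a hypothetical helper, and it simply checks that each following word occurs at the next position in the same sentence.

```python
from collections import defaultdict

sentences = ["it is what it is", "what is it", "it is a banana"]


def build_inverted_index(texts):
    """Map each word to a list of (sentence number, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(texts):
        for pos, word in enumerate(text.split()):
            index[word].append((doc_id, pos))
    return dict(index)


def search_phrase(index, phrase):
    """Return (sentence number, start position) for each phrase occurrence."""
    words = phrase.split()
    if not words:
        return []
    # Candidate starts: every place where the first word occurs.
    starts = set(index.get(words[0], []))
    for offset, word in enumerate(words[1:], start=1):
        postings = set(index.get(word, []))
        # Keep a start only if this word sits `offset` positions after it.
        starts = {(d, p) for d, p in starts if (d, p + offset) in postings}
    return sorted(starts)


index = build_inverted_index(sentences)
print(search_phrase(index, "what is it"))  # prints [(1, 0)]
```

T[0] also contains all three words, but "what" at (0,2) is followed by "it" rather than "is", so it is filtered out.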

In conclusion, once the inverted index has been built, retrieving words or sentences can be turned into set operations,

instead of scanning the text sentence by sentence and word by word. This makes retrieval much more efficient, which is why inverted indexes are so important in the search field.

One problem remains: building an inverted index is time-consuming. Fortunately, this process can be done offline.

For more information, please refer to Baidu, Wikipedia, Baidu Encyclopedia, and related papers.
