.NET Interview Question Series (13): Lucene Underlying Principles

Source: Internet
Author: User
Indexing principle

Full-text search technology has a long history. The vast majority of implementations are built on an inverted index, although other approaches, such as file fingerprints, have also been tried. As the name implies, an inverted index reverses the usual view of which words an article contains: it starts from the word and records which documents that word appears in. It consists of two parts: a dictionary and an inverted list (also called a postings list).
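To make the two parts concrete, here is a minimal sketch in Python. It is not Lucene's implementation: the function name `build_inverted_index`, the sample documents, and the whitespace tokenizer are all assumptions chosen for illustration. The keys of the resulting mapping play the role of the dictionary, and each value is the inverted list for that word.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents that contain it.

    `docs` is a dict of {doc_id: text}. Tokenization here is a naive
    lowercase + whitespace split (an assumption, not Lucene's analyzer).
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Sort each postings set so lists can be merged or intersected in order.
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical sample documents.
docs = {
    1: "Lucene is a search library",
    2: "an inverted index maps each word to documents",
    3: "Lucene builds an inverted index",
}

index = build_inverted_index(docs)
print(index["lucene"])    # documents containing "lucene"
print(index["inverted"])  # documents containing "inverted"
```

A query for one word is then a single dictionary lookup, and a query for several words becomes an intersection of their inverted lists.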

The structure of the dictionary is particularly important. There are many candidate structures, each with its own advantages and disadvantages. The simplest is a sorted array, which is searched with binary search; a hash table is faster; and for lookups on disk there are B-trees and B+ trees. However, an inverted index that can support terabytes of data needs a dictionary that strikes a balance between time and space. The pros and cons of some common dictionary structures are listed below:

FST: the index structure that Lucene uses now
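The trade-off between the two simplest dictionary structures mentioned above, the sorted array and the hash table, can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `lookup_sorted` and `prefix_range`, the sample terms, and the postings are hypothetical, and the prefix helper assumes that no term contains the character `\uffff`. It does not show how an FST works.

```python
from bisect import bisect_left


def lookup_sorted(terms, postings, term):
    """Binary-search a sorted term array; return its postings list or None."""
    i = bisect_left(terms, term)
    if i < len(terms) and terms[i] == term:
        return postings[i]
    return None


def prefix_range(terms, prefix):
    """Return all terms that start with `prefix`, using two binary searches.

    Assumes no term contains U+FFFF, which serves as an upper-bound sentinel.
    """
    lo = bisect_left(terms, prefix)
    hi = bisect_left(terms, prefix + "\uffff")
    return terms[lo:hi]


# Hypothetical dictionary: terms kept sorted, postings stored in parallel.
terms = ["index", "inverted", "lucene", "search"]
postings = [[2, 3], [2, 3], [1, 3], [1]]

# The same data as a hash table: average O(1) exact lookup, but no ordering.
hash_dict = dict(zip(terms, postings))

print(lookup_sorted(terms, postings, "lucene"))  # O(log n) exact lookup
print(hash_dict.get("lucene"))                   # average O(1) exact lookup
print(prefix_range(terms, "in"))                 # ordered scan, sorted array only
```

The hash table answers exact lookups faster, but only the sorted array keeps terms in order, which is what makes prefix and range queries cheap. Neither structure compresses the terms themselves, which matters once the dictionary has to fit in memory.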
