Lucene's underlying data structures: how filter bitsets work, and compressing time series data by packing many points into one row


How do I combine multiple indexes in a query?

Given the query filter age=18, the process is: use the term index to locate the approximate position of 18 in the term dictionary, find the exact term 18 in the term dictionary, and from there obtain a posting list (or a pointer to the posting list's position on disk). Querying gender=female works the same way. Finally, age=18 AND gender=female comes down to "AND"-merging the two posting lists.
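As a toy sketch (hypothetical doc IDs and values; plain Python dicts and sets stand in for Lucene's structures, and the set intersection hides exactly the cost the rest of this section is about):

    # Toy inverted index: (field, value) -> sorted posting list of doc IDs.
    # Real Lucene interposes a term index and term dictionary before
    # reaching the posting list; here a dict lookup plays both roles.
    postings = {
        ("age", 18): [2, 5, 9, 13, 20],
        ("gender", "female"): [1, 5, 8, 13, 21],
    }

    def filter_and(conditions):
        """AND-merge the posting lists of several filter conditions."""
        lists = [set(postings[c]) for c in conditions]
        return sorted(set.intersection(*lists))

    print(filter_and([("age", 18), ("gender", "female")]))  # [5, 13]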

This "AND" merge is not a trivial operation. MySQL, for example, even when both age and gender are indexed, will typically use only the more selective index and then filter on the other condition in memory after fetching each row. So how can two indexes be used together? There are two ways:

    • Use the skip list data structure: traverse the gender and age posting lists simultaneously, leapfrogging each other;
    • Use the bitset data structure: compute a bitset for each of the gender and age filters, then AND the two bitsets.

PostgreSQL has supported combining two indexes through a bitmap (a bitset) since version 8.4, and some commercial relational databases offer similar combined-index capabilities. Elasticsearch supports both methods: if the query's filters are cached in memory (in bitset form), the merge is a bitwise AND of the two bitsets; if a filter is not cached, skip lists are used to traverse the two posting lists on disk.

Merging with skip lists

Suppose we have three posting lists and need to AND-merge them into their intersection. First select the shortest posting list, then traverse it from smallest to largest. The traversal can skip elements in the other lists: for example, once the green list reaches 13, everything up to 3 in the blue list can be skipped, because 3 is smaller than 13.

The whole process looks like this (Next steps the lead list forward; Advance(target) leapfrogs a list to its first entry >= target):

    Next         -> 2
    Advance(2)   -> 13
    Advance(13)  -> 13    already on 13: MATCH!!!
    Next         -> 17
    Advance(17)  -> 22
    Advance(22)  -> 98
    Advance(98)  -> 98    MATCH!!!

The final intersection is [13, 98], which is much faster to compute than a full traversal of all three posting lists. The prerequisite is that each list supports this advance operation, that is, quickly jumping to a given position. What kind of list can advance forward in frog jumps like this? The skip list:

Conceptually, for a very long posting list, for example:

[1,3,13,101,105,108,255,256,257]

We can divide this list into three blocks:

[1,3,13] [101,105,108] [255,256,257]

You can then build the second layer of the skip list:

[1,101,255]

where 1, 101, and 255 each point to their corresponding block. This second layer makes it possible to jump quickly from block to block.
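A minimal sketch of this in Python (the block size, list contents, and leapfrog driver are all illustrative choices, not Lucene's actual implementation):

    import bisect

    class PostingList:
        """Sorted posting list with one skip layer (block size 3 here)."""
        def __init__(self, ids, block_size=3):
            self.ids = sorted(ids)
            # Skip layer: (first doc ID of block, index of block start).
            self.skip = [(self.ids[i], i)
                         for i in range(0, len(self.ids), block_size)]
            self.pos = 0

        def current(self):
            return self.ids[self.pos] if self.pos < len(self.ids) else None

        def advance(self, target):
            """Jump to the first doc ID >= target, skipping whole blocks."""
            keys = [k for k, _ in self.skip]
            block = bisect.bisect_right(keys, target) - 1
            if block >= 0 and self.skip[block][1] > self.pos:
                self.pos = self.skip[block][1]   # frog-jump over blocks
            while self.pos < len(self.ids) and self.ids[self.pos] < target:
                self.pos += 1                    # short linear scan in-block
            return self.current()

    def intersect(lists):
        """Drive with the shortest list; advance the others to match."""
        lists = sorted(lists, key=lambda l: len(l.ids))
        lead, others = lists[0], lists[1:]
        result = []
        doc = lead.current()
        while doc is not None:
            if all(o.advance(doc) == doc for o in others):
                result.append(doc)
            lead.pos += 1
            doc = lead.current()
        return result

    a = PostingList([2, 13, 17, 98])
    b = PostingList([13, 22, 98])
    c = PostingList([3, 13, 25, 98])
    print(intersect([a, b, c]))  # [13, 98]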

Lucene additionally compresses each block, using an encoding called Frame of Reference. An example:
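The example figure is missing from this copy; the commonly cited example from the Elastic blog linked later in this article goes like this: the sorted posting list [73, 300, 302, 332, 343, 372] is first delta-encoded into [73, 227, 2, 30, 11, 29], then split into blocks (Lucene uses 256 values per block; 3 here for illustration), and each block is packed with just enough bits for its largest delta. A minimal sketch:

    def frame_of_reference(postings, block_size=3):
        """Delta-encode a sorted posting list and record, per block,
        the number of bits needed per value (the 'frame')."""
        deltas = [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]
        blocks = [deltas[i:i + block_size]
                  for i in range(0, len(deltas), block_size)]
        return [(max(d.bit_length() for d in block), block) for block in blocks]

    print(frame_of_reference([73, 300, 302, 332, 343, 372]))
    # [(8, [73, 227, 2]), (5, [30, 11, 29])]  -> 8 and 5 bits per value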

Consider a frequently occurring term (a so-called low-cardinality value), such as male or female under gender. With 1 million documents, the posting list for gender=male holds 500,000 int values; compressing it with Frame of Reference encoding dramatically reduces disk usage, which matters a great deal for keeping the index small. But the encoding comes with a decompression cost, so the skip list saves not only traversal steps but also the work of decompressing the compressed blocks it skips over, saving CPU.

Merging with bitsets

A bitset is a very intuitive data structure. A posting list such as:

[1,3,4,7,10]

The corresponding Bitset is:

[1,0,1,1,0,0,1,0,0,1]

Each document corresponds to one bit, ordered by document ID. The bitset is inherently compact: one byte represents 8 documents, so 1 million documents need only 125,000 bytes. But with potentially billions of documents, keeping bitsets in memory is still a luxury, and every cached filter consumes its own bitset: caching age=18 takes one bitset, and a second filter such as 18<=age<25 takes another.
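In Python, big integers can stand in for bitsets, which makes the AND step easy to see (the first posting list is reused from the example above; the second filter is made up):

    def to_bitset(posting_list):
        """Set bit i for every matching doc ID i (illustration only)."""
        bits = 0
        for doc_id in posting_list:
            bits |= 1 << doc_id
        return bits

    age_18 = to_bitset([1, 3, 4, 7, 10])
    female = to_bitset([3, 7, 8, 10])       # hypothetical second filter
    both = age_18 & female                  # the whole merge is one AND
    print([i for i in range(both.bit_length()) if both >> i & 1])  # [3, 7, 10]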

So the trick is to find a data structure that:

    • can very compactly store billions of bits indicating whether each document matches the filter;
    • still supports fast logical AND and OR operations on the compressed form.

The data structure Lucene uses for this is called the Roaring bitmap.

Its compression idea is simple: instead of spending 100 bits to store 100 consecutive zeros, it is cheaper to store the value 0 once and declare that it repeats 100 times.
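A container-style sketch in the spirit of Roaring bitmaps (real Roaring splits the 32-bit ID space into 2^16-sized chunks and picks among array, bitmap, and run containers per chunk; this sketch keeps only the array/bitmap choice):

    ARRAY_LIMIT = 4096  # real Roaring switches representation at 4096 values

    def build(doc_ids):
        """Group doc IDs into chunks keyed by the high 16 bits."""
        chunks = {}
        for doc in doc_ids:
            chunks.setdefault(doc >> 16, set()).add(doc & 0xFFFF)
        out = {}
        for key, lows in chunks.items():
            if len(lows) < ARRAY_LIMIT:
                out[key] = ("array", sorted(lows))   # sparse: short sorted array
            else:
                bits = 0
                for low in lows:
                    bits |= 1 << low
                out[key] = ("bitmap", bits)          # dense: fixed 8 KB bitmap
        return out

    def as_set(container):
        tag, val = container
        if tag == "array":
            return set(val)
        return {i for i in range(1 << 16) if val >> i & 1}

    def roaring_and(a, b):
        """AND two such maps; only chunk keys present in both can match
        (normalized to sets here for brevity; real code intersects in place)."""
        return {key: ("array", sorted(as_set(a[key]) & as_set(b[key])))
                for key in a.keys() & b.keys()}

    x = build([1, 2, 3, 100_000])
    y = build([2, 50, 100_000, 100_001])
    print(roaring_and(x, y))   # docs 2 and 100_000 survive, in chunks 0 and 1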

Both of these are ways of combining indexes at query time. Elasticsearch has published a detailed performance comparison (https://www.elastic.co/blog/frame-of-reference-and-roaring-bitmaps). The short conclusion: Frame of Reference encoding is so efficient that, for simple equality filters, traversing the skip list, even though it touches disk, is not necessarily slower than a bitset cached in pure memory.

How do I reduce the number of documents?

A common way to compress time series storage is to merge multiple data points into one row. OpenTSDB supports huge data volumes in part by periodically merging many rows into one, a process called compaction. Similarly, VividCortex, which stores data in MySQL, packs a minute's worth of data points into a single MySQL row to reduce the row count. In Elasticsearch the same idea applies: many data points from a time window can be packed into a parent document as nested sub-documents. For example:

{timestamp: 12:05:01, idc: sz, value1: 10, value2: 11}
{timestamp: 12:05:02, idc: sz, value1: 9, value2: 9}
{timestamp: 12:05:02, idc: sz, value1: 18, value2: 17}

can be packed into:

{max_timestamp: 12:05:02, min_timestamp: 12:05:01, idc: sz, records: [{timestamp: 12:05:01, value1: 10, value2: 11}, {timestamp: 12:05:02, value1: 9, value2: 9}, {timestamp: 12:05:02, value1: 18, value2: 17}]}

This reduces the size of the index by moving the dimension fields the data points share (idc here) into the parent document instead of repeating them in every sub-document. If 50 data points can be nested into each parent document, the posting lists shrink to 1/50 of their previous size.
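A packing sketch (field names taken from the example above; the 50-point window and the grouping key are illustrative choices):

    from collections import defaultdict

    def compact(points, window=50):
        """Pack points sharing the same 'idc' dimension into parent docs
        with nested 'records', storing the shared field only once."""
        groups = defaultdict(list)
        for p in points:
            groups[p["idc"]].append(p)
        parents = []
        for idc, pts in groups.items():
            pts.sort(key=lambda p: p["timestamp"])
            for i in range(0, len(pts), window):
                batch = pts[i:i + window]
                parents.append({
                    "idc": idc,                              # stored once
                    "min_timestamp": batch[0]["timestamp"],
                    "max_timestamp": batch[-1]["timestamp"],
                    "records": [{k: v for k, v in p.items() if k != "idc"}
                                for p in batch],
                })
        return parents

    points = [
        {"timestamp": "12:05:01", "idc": "sz", "value1": 10, "value2": 11},
        {"timestamp": "12:05:02", "idc": "sz", "value1": 9, "value2": 9},
        {"timestamp": "12:05:02", "idc": "sz", "value1": 18, "value2": 17},
    ]
    print(compact(points))  # one parent doc wrapping three nested records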

Summary and thinking

Elasticsearch's indexing philosophy:

Move disk-resident data into memory as much as possible, reducing random disk reads (while also exploiting the disk's strength at sequential reads).

When indexing with Elasticsearch, keep in mind:

    • Fields that do not need to be indexed must be marked as such explicitly, because the default is to index every field;
    • likewise, string fields that should not be analyzed must be marked explicitly, because the default is to analyze them (see the mapping sketch after this list);
    • choosing a regular document ID matters: an ID that is too random (such as a Java UUID) hurts query performance.
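A mapping sketch for the packed documents above, written as a Python dict for illustration (the syntax assumes Elasticsearch 5+, where keyword fields replace not_analyzed strings; the debug_note field is hypothetical):

    # "index": False turns off indexing for a field that is only stored;
    # "keyword" stores a string un-analyzed, for exact matching.
    mapping = {
        "properties": {
            "idc": {"type": "keyword"},
            "min_timestamp": {"type": "date", "format": "HH:mm:ss"},
            "max_timestamp": {"type": "date", "format": "HH:mm:ss"},
            "debug_note": {"type": "text", "index": False},  # hypothetical
            "records": {
                "type": "nested",
                "properties": {
                    "timestamp": {"type": "date", "format": "HH:mm:ss"},
                    "value1": {"type": "integer"},
                    "value2": {"type": "integer"},
                },
            },
        }
    }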

On the last point, I believe several factors are at work:

One factor (perhaps not the most important): the compression algorithms described above compress the IDs in posting lists in bulk, and if the IDs are sequential, share a common prefix, or follow some other pattern, the compression ratio is much higher, as the small check below suggests.
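A quick way to see the effect (delta-encoding as in the Frame of Reference sketch earlier, re-inlined here so it runs standalone):

    import random

    def delta_bits(postings):
        """Bits per value after delta-encoding a sorted ID list."""
        deltas = [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]
        return [d.bit_length() for d in deltas]

    print(delta_bits(list(range(1000, 1006))))
    # [10, 1, 1, 1, 1, 1]  -> sequential IDs need ~1 bit per delta
    print(delta_bits(sorted(random.sample(range(10**9), 6))))
    # typically around 25-30 bits per delta for random IDs in a 10^9 space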

Another factor, probably the one that affects query performance most, is the final step of using the IDs in the posting list to fetch document data from disk. Elasticsearch stores data in segments, and how efficiently an ID range can be mapped to the right segment directly determines the performance of that last step. If IDs are regular, segments that cannot contain a given ID can be skipped quickly, reducing unnecessary disk reads; see the article "How to choose an efficient global ID scheme" (the comments there are excellent).

This article is very good: https://neway6655.github.io/elasticsearch/2015/09/11/elasticsearch-study-notes.html#section-1

