Tair LDB: proposal for optimizing prefix-key-based range lookup performance

Source: Internet
Author: User

Building on the filtering idea behind the prefix Bloom filter and on the characteristics of the data served by the get_range interface, and under my tutor's guidance, I propose the following simple scheme to optimize the range lookup in get_range: filter by prefix so that invalid disk I/O is reduced.

Interface to be optimized

int get_range(int area, const data_entry &pkey, const data_entry &start_key, const data_entry &end_key, int offset, int limit, vector<data_entry*> &values, short type = CMD_RANGE_ALL);


Proposed scheme

1. The data read by get_range is written through the prefix_put/prefix_incr interfaces, so the prefix length can be obtained from their pkey parameter. pkey is of type data_entry and has a prefix_size attribute. We merge pkey and skey into mkey (whose prefix_size is the size of pkey, already set by the client) and send it to the server together with the value.

During client-server processing, the key is wrapped in the LdbKey class and the value in the LdbItem class; LdbItem carries the key's prefix_size. Both are then converted to Slice and handed to LevelDB for storage. Note that the value contains prefix_size only in serialized form, so it cannot be read directly. When generating the filter block, we therefore parse the value in LdbItem format to extract prefix_size and build the prefix Bloom filter we need. The extraction itself can be implemented outside the LevelDB layer and called from inside LevelDB, which keeps the two decoupled.
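The key-merging step in point 1 can be illustrated with a minimal sketch. The MergedKey struct and the MergeKey and ExtractPrefix helpers are hypothetical names introduced only for this example; they stand in for Tair's data_entry and do not reproduce its actual layout or the LdbItem serialization.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for Tair's data_entry: only the fields this sketch needs.
struct MergedKey {
  std::string data;         // pkey bytes followed by skey bytes
  std::size_t prefix_size;  // length of the pkey part, set on the client side
};

// Concatenate pkey and skey, remembering where the prefix ends.
inline MergedKey MergeKey(const std::string& pkey, const std::string& skey) {
  return MergedKey{pkey + skey, pkey.size()};
}

// Recover the prefix from a merged key; this is the string the filter
// builder would add alongside the full key.
inline std::string ExtractPrefix(const MergedKey& mkey) {
  assert(mkey.prefix_size <= mkey.data.size());
  return mkey.data.substr(0, mkey.prefix_size);
}
```

For example, merging the pkey "user42" with the skey ":order:0001" gives a merged key with prefix_size 6, from which ExtractPrefix returns "user42".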

2. With prefix_size extracted, we build a prefix Bloom filter over all keys. For simplicity, the new prefix Bloom filter can live in the original filter block, stored together with the existing Bloom filter. The method is simple: while the filter block is being built (filter_block.cc), each key's prefix is also added to the filter as if it were an ordinary key, and the filtering algorithm stays the same.

Note: when the filter is created, each added key grows the filter by bits_per_key_ bits. Because many keys share the same prefix, and a shared prefix is not added to the filter repeatedly, the extra bits should be acceptable.
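A minimal sketch of point 2 and the note above, assuming keys arrive in sorted order (as they do in an SSTable) so that equal prefixes are adjacent. SimpleBloom, CollectFilterItems and BuildFilter are hypothetical names; the FNV-1a hash and the filter layout are simplified stand-ins, not LevelDB's filter_block.cc code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// FNV-1a, used only as a simple stand-in hash for this sketch.
inline std::uint32_t Fnv1a(const std::string& s) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : s) { h ^= c; h *= 16777619u; }
  return h;
}

// Simplified Bloom filter using double hashing to derive k probe positions.
class SimpleBloom {
 public:
  SimpleBloom(std::size_t n_entries, std::size_t bits_per_key)
      : bits_(std::max<std::size_t>(64, n_entries * bits_per_key), false),
        k_(std::max<std::size_t>(1, bits_per_key * 69 / 100)) {
    assert(bits_per_key > 0);
  }
  void Add(const std::string& key) {
    std::uint32_t h = Fnv1a(key);
    const std::uint32_t delta = (h >> 17) | (h << 15);
    for (std::size_t i = 0; i < k_; ++i) { bits_[h % bits_.size()] = true; h += delta; }
  }
  // May return true for absent keys (false positive), never false for added keys.
  bool MayMatch(const std::string& key) const {
    std::uint32_t h = Fnv1a(key);
    const std::uint32_t delta = (h >> 17) | (h << 15);
    for (std::size_t i = 0; i < k_; ++i) {
      if (!bits_[h % bits_.size()]) return false;
      h += delta;
    }
    return true;
  }
 private:
  std::vector<bool> bits_;
  std::size_t k_;
};

using Entry = std::pair<std::string, std::size_t>;  // (merged key, prefix_size)

// Every full key is added; a prefix is added only when it differs from the
// previous one, so a shared prefix costs bits_per_key bits once.
inline std::vector<std::string> CollectFilterItems(const std::vector<Entry>& entries) {
  std::vector<std::string> items;
  std::string last_prefix;
  bool have_prefix = false;
  for (const Entry& e : entries) {
    items.push_back(e.first);
    std::string prefix = e.first.substr(0, e.second);
    if (!have_prefix || prefix != last_prefix) {
      items.push_back(prefix);
      last_prefix = prefix;
      have_prefix = true;
    }
  }
  return items;
}

inline SimpleBloom BuildFilter(const std::vector<Entry>& entries, std::size_t bits_per_key) {
  const std::vector<std::string> items = CollectFilterItems(entries);
  SimpleBloom filter(items.size(), bits_per_key);
  for (const std::string& s : items) filter.Add(s);
  return filter;
}
```

With four keys that share two distinct prefixes, CollectFilterItems yields six items, so in this sketch the filter grows by two extra entries' worth of bits rather than four.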

3. In get_range, when the lookup reaches the SSTables (the memtable and the immutable memtable are checked first; neither involves disk I/O):

(1) First find the candidate SSTable files based on the range [pkey+skey, pkey+end].

(2) For each file, use the information in the data index block to narrow the search to the blocks that may contain the range [pkey+skey, pkey+end].

(3) Before reading each block, fetch its filter from the filter block and call the MayMatch method with the prefix to decide whether the block can contain the prefix pkey. If it cannot, skip the block. In this way the prefix Bloom filter filters out blocks and reduces unnecessary disk I/O.
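Steps (2) and (3) can be sketched as follows. BlockMeta and BlocksToRead are hypothetical names, and the per-block filter is modelled with an exact set purely to keep the example short; a real Bloom filter could also return false positives, which would only cause extra reads, never missed data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical in-memory model of one data block's index entry and filter.
struct BlockMeta {
  std::string first_key;                   // smallest key in the block
  std::string last_key;                    // largest key in the block
  std::unordered_set<std::string> filter;  // exact-set stand-in for a Bloom filter

  bool PrefixMayMatch(const std::string& prefix) const {
    return filter.count(prefix) > 0;
  }
};

// Returns the indices of blocks that still have to be read from disk for
// the range [pkey+start_key, pkey+end_key].
inline std::vector<std::size_t> BlocksToRead(
    const std::vector<BlockMeta>& blocks, const std::string& pkey,
    const std::string& start_key, const std::string& end_key) {
  const std::string lo = pkey + start_key;
  const std::string hi = pkey + end_key;
  assert(lo <= hi);
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    const BlockMeta& b = blocks[i];
    // Step (2): the index shows the block cannot overlap the range.
    if (b.last_key < lo || b.first_key > hi) continue;
    // Step (3): the filter shows no key in the block carries this prefix.
    if (!b.PrefixMayMatch(pkey)) continue;
    result.push_back(i);
  }
  return result;
}
```

The prefix check pays off when a block's key range straddles the requested range without holding any key of that prefix: the index alone cannot rule the block out, but the filter can.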


The current implementation of the Get_range interface can be referenced by:
