I read some of the leveldb source code and came away with the following impressions:
1. Be careful about resource control: in leveldb, memory and the number of files are both treated as resources. Every write operation checks whether resources are sufficient, for example whether the memtable exceeds its size limit or whether there are too many level-0 files, because sustained high-speed writing would otherwise trigger frequent I/O reads and hurt performance. Once any indicator reaches its threshold, throttling starts, for example delaying every write by 1 ms.
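The gist of that check can be sketched as a small decision function. This is a simplified illustration, not leveldb's actual code; the constant names and values here are hypothetical stand-ins for the real triggers:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical thresholds mirroring the resource checks described above
// (leveldb uses similar constants internally; names here are illustrative).
constexpr int kL0_SlowdownTrigger = 8;        // start delaying writes
constexpr int kL0_StopTrigger = 12;           // stop writes entirely
constexpr std::size_t kMemtableLimit = 4 << 20;  // 4 MB write buffer

enum class WriteAction { kProceed, kDelay1ms, kStall };

// Decide what a write should do given current resource usage:
// stall if a hard limit is hit, delay 1 ms if a soft limit is hit.
WriteAction CheckResources(std::size_t memtable_bytes, int level0_files) {
  if (level0_files >= kL0_StopTrigger) return WriteAction::kStall;
  if (memtable_bytes >= kMemtableLimit) return WriteAction::kStall;
  if (level0_files >= kL0_SlowdownTrigger) return WriteAction::kDelay1ms;
  return WriteAction::kProceed;
}
```

The soft limit smooths out bursts by slowing writers gradually instead of blocking them all at once when the hard limit is reached.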
2. The criterion for triggering a file compaction is given as a quantitative cost model:
// We arrange to automatically compact this file after
// a certain number of seeks. Let's assume:
//   (1) One seek costs 10ms
//   (2) Writing or reading 1MB costs 10ms (100MB/s)
//   (3) A compaction of 1MB does 25MB of IO:
//         1MB read from this level
//         10-12MB read from next level (boundaries may be misaligned)
//         10-12MB written to next level
// This implies that 25 seeks cost the same as the compaction
// of 1MB of data.  I.e., one seek costs approximately the
// same as the compaction of 40KB of data.  We are a little
// conservative and allow approximately one seek for every 16KB
// of data before triggering a compaction.
The author estimates that one seek costs roughly the same as compacting 16 KB of data, so for a file of size X KB, X/16 seeks cost as much as compacting the whole file. Once a file has absorbed that many seeks, compacting it pays for itself, and the compaction can be started.
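The resulting seek budget can be written as a one-line calculation. This is a sketch of the arithmetic from the comment above; leveldb additionally enforces a minimum budget (100 seeks in the source), which is included here:

```cpp
#include <cstdint>

// Cost model from the comment above: one seek ~ compacting 16KB,
// so a file of X bytes "earns" a compaction after X / 16KB seeks.
// A floor keeps tiny files from being compacted after a handful of seeks.
int AllowedSeeks(std::uint64_t file_size_bytes) {
  int allowed = static_cast<int>(file_size_bytes / 16384);
  if (allowed < 100) allowed = 100;
  return allowed;
}
```

Each read that has to consult the file decrements this budget; when it reaches zero, the file becomes a compaction candidate.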
3. The memtable's reference count is incremented on each access, which avoids the cost of copying data out of memory. However, if the read QPS is high, a memtable may never be released: reader references keep it alive, which can cause excessive memory usage and eventually block writes. This does not seem very reasonable.
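The hazard can be shown with a minimal refcount sketch, assuming a leveldb-style Ref/Unref pair (the struct and helper below are illustrative, not leveldb's actual types):

```cpp
// Minimal sketch of memtable-style reference counting.
struct MemTable {
  int refs = 0;
  void Ref() { ++refs; }
  // Returns true when the last reference is dropped and the
  // caller should free the table.
  bool Unref() { return --refs == 0; }
};

// Simulate the problem: readers hold refs while the DB, having
// flushed the memtable to disk, drops its own reference.
int RemainingRefs(int readers) {
  MemTable m;
  m.Ref();                                    // owned by the DB
  for (int i = 0; i < readers; ++i) m.Ref();  // concurrent readers
  m.Unref();                                  // DB drops it after the flush
  return m.refs;  // memory is freed only when this reaches 0
}
```

With even one long-lived reader outstanding, the flushed memtable's memory cannot be reclaimed, which is exactly the concern above.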
4. The multi-way merge in the iterator uses a simple linear scan over an array rather than a priority queue. Maybe the author assumed this DB was not meant for large-scale data access?
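The linear-scan approach can be sketched as follows. This is a toy model (int keys, simple structs), not leveldb's MergingIterator, but it shows the O(k)-per-step scan that a min-heap would replace with O(log k):

```cpp
#include <cstddef>
#include <vector>

// Toy child iterator over a sorted run of int keys.
struct Child {
  std::vector<int> keys;
  std::size_t pos = 0;
  bool Valid() const { return pos < keys.size(); }
  int Key() const { return keys[pos]; }
  void Next() { ++pos; }
};

// Merge k sorted runs by scanning all children for the smallest key
// at every step, with no heap. Fine when k is small (one iterator per
// level plus a few level-0 files), wasteful when k is large.
std::vector<int> MergeAll(std::vector<Child> children) {
  std::vector<int> out;
  for (;;) {
    Child* smallest = nullptr;
    for (auto& c : children)  // linear scan over all children
      if (c.Valid() && (smallest == nullptr || c.Key() < smallest->Key()))
        smallest = &c;
    if (smallest == nullptr) break;  // all children exhausted
    out.push_back(smallest->Key());
    smallest->Next();
  }
  return out;
}
```

Since k is bounded by the number of levels plus the level-0 file count, the linear scan is arguably a deliberate simplicity trade-off rather than an oversight.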
5. Its merge (compaction) strategy is also quite interesting: files are divided into multiple levels, and except for level 0, files within a level do not overlap (http://leveldb.googlecode.com/svn/trunk/doc/impl.html). I think this adds complexity, and it is unclear how much performance it actually buys.
6. Block index keys need not be actual keys from the data. For example, if the last key of one block is "i am raymond" and the first key of the next block is "i see youha", the block index can store "i b", because "i b" > "i am raymond" and "i b" < "i see youha". This reduces the storage space of the block index.
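The idea behind this shortening (leveldb exposes it as the comparator's FindShortestSeparator) can be sketched like this. The function below is a simplified illustration of the technique, not leveldb's exact implementation:

```cpp
#include <algorithm>
#include <string>

// Find a short string sep with a <= sep < b, so sep can stand in for
// the block's last key in the index. Sketch of the idea behind
// leveldb's FindShortestSeparator, not its exact code.
std::string ShortSeparator(const std::string& a, const std::string& b) {
  std::size_t n = std::min(a.size(), b.size());
  std::size_t i = 0;
  while (i < n && a[i] == b[i]) ++i;  // length of the shared prefix
  // If the first differing byte of a can be bumped without reaching
  // b's byte, the bumped prefix is a valid, shorter separator.
  if (i < n && static_cast<unsigned char>(a[i]) + 1 <
                   static_cast<unsigned char>(b[i]))
    return a.substr(0, i) + static_cast<char>(a[i] + 1);
  return a;  // cannot shorten safely; fall back to the full key
}
```

For the example above, the shared prefix is "i " and bumping 'a' to 'b' gives "i b", which sorts between the two keys.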
7. Variable-length encoding and compression are used wherever possible; this should be Google's strength.
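The variable-length encoding in question is the varint format used throughout leveldb's on-disk structures: 7 payload bits per byte, with the high bit marking continuation. A minimal encoder sketch:

```cpp
#include <cstdint>
#include <string>

// Varint32 encoding as used in leveldb's file formats: each byte
// carries 7 bits of the value, low bits first; the high bit is set
// on every byte except the last. Small values take a single byte.
std::string EncodeVarint32(std::uint32_t v) {
  std::string out;
  while (v >= 0x80) {
    out.push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<char>(v));
  return out;
}
```

Since most lengths and offsets in a block are small, they usually fit in one or two bytes instead of a fixed four.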
8. The block cache records which file each block came from, so when a file is removed, its cached blocks are removed as well. This does not affect performance, but if memory is very limited, adding and deleting files at a steady rate may be meaningful so the cache is not evicted in large bursts.
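The coupling between files and cached blocks can be sketched with a toy cache keyed by (file number, block offset). The structure below is hypothetical and only illustrates why deleting a file naturally evicts all of its blocks:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy block cache keyed by (file number, block offset). Because the
// file number leads the key, all of one file's blocks are contiguous
// in the map and can be dropped with a single range erase.
using BlockKey = std::pair<std::uint64_t, std::uint64_t>;

struct BlockCache {
  std::map<BlockKey, std::string> blocks;

  void Insert(std::uint64_t file, std::uint64_t off, std::string data) {
    blocks[{file, off}] = std::move(data);
  }

  // Evict every cached block belonging to a deleted file.
  void EvictFile(std::uint64_t file) {
    blocks.erase(blocks.lower_bound({file, 0}),
                 blocks.lower_bound({file + 1, 0}));
  }
};
```

Deleting many files at once therefore empties a correspondingly large slice of the cache in one go, which is the burst the note above worries about.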
Overall, leveldb is suited to small-scale storage and can serve in roles similar to BerkeleyDB's. It should be a good fit for mobile development platforms, and probably not suitable for building large-scale data platforms.
In addition, I feel some of leveldb's implementation is overly fine-grained; perhaps Google has run into problems that the rest of us rarely encounter.