Hamsterdb vs. LevelDB

Last Update: 2018-06-03 Source: Internet

Author: User

Tags valgrind value store


[Editor's note] Although it has been around for nine years, Hamsterdb is still far less popular than MongoDB and is regarded as a non-mainstream database. Hamsterdb is an open-source key-value database. Unlike other NoSQL databases, however, Hamsterdb is single-threaded and non-distributed, and its feature set resembles that of a column-store database. It also supports ACID transactions at the read-committed isolation level. What are the advantages of Hamsterdb compared with LevelDB? Christoph Rupp, one of the project's developers, shares his view.

The following is a translation:

In this article, I would like to introduce Hamsterdb, an embedded analytical key-value database released under the Apache 2 license, similar to Google's LevelDB and Oracle's BerkeleyDB.

Hamsterdb is not a new competitor. In fact, Hamsterdb has been around for nine years. In that time it has grown considerably, and its focus has moved toward analytical functionality on top of key-value storage, similar to column-store databases.

Hamsterdb is single-threaded and non-distributed, and it is linked directly into user applications. Hamsterdb provides a unique transaction implementation and features similar to column-store databases, which makes it well suited to analytical workloads. It is written in C/C++ and has bindings for languages such as Erlang, Python, Java, .NET, and even Ada. It has been deployed on embedded devices and on-premises applications, as well as in the cloud (for example, for caching and indexing), with millions of deployments.

Hamsterdb has a feature that is unique among key-value stores: it is aware of schema information. While most databases neither know nor care what type of key is inserted, Hamsterdb supports two kinds of keys: binary keys (fixed-length or variable-length) and numeric keys (such as uint32, uint64, real32, and real64).

A Hamsterdb database is a B-tree index stored in a file or in memory. The B-tree is what makes Hamsterdb's analytical capabilities powerful. The B-tree index is implemented as a C++ template whose parameters depend on the key type, on the record size (fixed-length or variable-length), and on whether duplicate keys are enabled, so each B-tree node is highly optimized for its workload. With fixed-length keys there is no per-key overhead, and the keys are laid out as a simple array. At the lowest level of the index, a database with uint64 keys is backed by a plain C array of type uint64_t.
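The idea of a node type that is parameterized on the key type, with fixed-length keys packed into one plain array, can be sketched as follows. This is a minimal illustration only: `FixedKeyLeaf` and its methods are hypothetical names, not Hamsterdb's actual classes, and a `std::vector` stands in for the key area of a B-tree page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical leaf node for fixed-length keys: the keys live in one
// contiguous, sorted array with no per-key header or pointer.
template <typename Key>
class FixedKeyLeaf {
public:
    // Inserts a key at its sorted position; returns false if it already exists.
    bool insert(Key key) {
        auto it = std::lower_bound(keys_.begin(), keys_.end(), key);
        if (it != keys_.end() && *it == key) return false;
        keys_.insert(it, key);
        assert(std::is_sorted(keys_.begin(), keys_.end()));
        return true;
    }

    bool contains(Key key) const {
        return std::binary_search(keys_.begin(), keys_.end(), key);
    }

    std::size_t size() const { return keys_.size(); }

    // Bytes used for key storage: exactly size() * sizeof(Key).
    std::size_t key_bytes() const { return keys_.size() * sizeof(Key); }

    // Raw view of the key array, e.g. a const uint64_t* for uint64 keys.
    const Key* data() const { return keys_.data(); }

private:
    std::vector<Key> keys_;  // stands in for the key area of a B-tree page
};
```

Instantiating `FixedKeyLeaf<std::uint64_t>` gives a node whose key area is just an array of `uint64_t` values, which is what makes sequential scans over the keys cheap.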

This layout reduces I/O and makes more effective use of the CPU cache. Modern CPUs are optimized for processing sequential memory, and Hamsterdb takes advantage of that. For example, when a leaf node is searched, binary search is abandoned once the remaining range falls below a certain threshold, and linear search is used instead. In addition, Hamsterdb has APIs equivalent to the SQL commands COUNT, COUNT DISTINCT, SUM, and AVERAGE. Because they run directly on the B-tree, they are very fast on fixed-length keys.
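The combination of binary and linear search, and an aggregate that runs directly over the key array, might look like the sketch below. The threshold value, the function names, and the `Aggregate` struct are assumptions made for illustration; they are not taken from Hamsterdb's source or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical threshold; a real engine would tune it per key type.
constexpr std::size_t kLinearThreshold = 8;

// Returns the index of `key` in the sorted, duplicate-free array, or -1.
// Binary search narrows the range [lo, hi); once the range is small
// enough, a cache-friendly linear scan finishes the lookup.
long find_key(const std::vector<std::uint64_t>& keys, std::uint64_t key) {
    std::size_t lo = 0, hi = keys.size();
    while (hi - lo > kLinearThreshold) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] < key) lo = mid + 1;  // key can only be right of mid
        else hi = mid + 1;                  // key is at mid or left of it
    }
    for (std::size_t i = lo; i < hi; ++i) {
        if (keys[i] == key) return static_cast<long>(i);
        if (keys[i] > key) break;  // sorted input: key cannot follow
    }
    return -1;
}

struct Aggregate {
    std::size_t count;
    std::uint64_t sum;
    double average;
};

// COUNT, SUM and AVERAGE in a single sequential pass over the keys.
Aggregate aggregate(const std::vector<std::uint64_t>& keys) {
    Aggregate a{keys.size(), 0, 0.0};
    for (std::uint64_t k : keys) a.sum += k;
    if (a.count > 0) a.average = static_cast<double>(a.sum) / a.count;
    assert(a.count == keys.size());
    return a;
}
```

Both functions touch memory strictly from left to right in their final phase, which is the access pattern that hardware prefetchers handle best.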

Hamsterdb also supports variable-length keys. For these, each B-tree node has a very small index at the front that points into the node's payload. Resizing an existing key or deleting a key can leave gaps in the payload, so the node must be "vacuumized" from time to time to avoid wasting space. This operation is a performance killer and is very hard to make fast, so it is performed as rarely as possible.
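A small index in front of a payload area, and the compaction step that reclaims holes, can be sketched like this. `VarKeyNode`, its slot layout, and the method names are hypothetical and only illustrate the general technique; they do not mirror Hamsterdb's on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node layout for variable-length keys: a small slot index
// (offset + length) in front of a byte heap that holds the key data.
class VarKeyNode {
    struct Slot {
        std::size_t offset;
        std::size_t length;
    };

public:
    void append(const std::string& key) {
        slots_.push_back({heap_.size(), key.size()});
        heap_.insert(heap_.end(), key.begin(), key.end());
    }

    // Erasing only drops the slot; the key bytes stay behind as a hole.
    void erase(std::size_t slot) {
        assert(slot < slots_.size());
        slots_.erase(slots_.begin() + static_cast<std::ptrdiff_t>(slot));
    }

    std::string key(std::size_t slot) const {
        assert(slot < slots_.size());
        const Slot& s = slots_[slot];
        return std::string(heap_.data() + s.offset, s.length);
    }

    std::size_t heap_bytes() const { return heap_.size(); }

    // Copies all live keys into a fresh heap so they are contiguous again.
    // Every live key is copied, which is why this step is expensive.
    void vacuumize() {
        std::vector<char> packed;
        for (Slot& s : slots_) {
            std::size_t new_offset = packed.size();
            packed.insert(packed.end(), heap_.data() + s.offset,
                          heap_.data() + s.offset + s.length);
            s.offset = new_offset;
        }
        heap_.swap(packed);
    }

private:
    std::vector<Slot> slots_;  // the small index at the front of the node
    std::vector<char> heap_;   // the node's payload area
};
```

Deleting a key costs almost nothing, but the space comes back only after `vacuumize()` has rewritten the payload, so an engine would want to trigger it only when a node actually runs out of room.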

Hamsterdb allows duplicate keys. This means that a key may point to multiple records, and all records of a key are grouped together. They are handled by the same index structure that is used for variable-length keys. (If a key has many duplicate records, they are removed from the B-tree node and stored in a separate overflow area.)

Hamsterdb supports ACID transactions at the read-committed isolation level. Transactional updates are stored in memory as delta operations. Each database has its own transaction index, and the updates in this index take priority over the data in the B-tree. Aborting a transaction simply means dropping its updates from the transaction index; committing a transaction means flushing its updates to the B-tree.

This unusual design has strong advantages. Transactional updates stay in RAM and require no I/O. No undo logic is needed, because the updates of an aborted transaction are never written to the B-tree. Recovery uses a simple logical log. There is also a significant challenge: at runtime, the two trees must be merged. Imagine a full scan with a database cursor. Some keys exist in the B-tree and some in the transaction tree. Keys in the transaction tree can overwrite or delete keys in the B-tree, or can themselves be modified by other keys in the transaction tree. This becomes very complicated once duplicate keys are involved.
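The interplay between the B-tree and an in-memory transaction index can be modeled in a few lines. In this sketch a `std::map` stands in for the B-tree and a second map holds the pending delta operations of a single transaction; the class `Store` and all of its methods are hypothetical and much simpler than a real implementation, which would merge both trees lazily inside the cursor.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

class Store {
    struct Delta {
        bool erased;
        std::string value;
    };
    using Tree = std::map<std::uint64_t, std::string>;

public:
    void begin() { assert(!active_); active_ = true; }

    void put(std::uint64_t key, const std::string& value) {
        if (active_) pending_[key] = Delta{false, value};
        else btree_[key] = value;
    }

    void erase(std::uint64_t key) {
        if (active_) pending_[key] = Delta{true, ""};
        else btree_.erase(key);
    }

    // The transaction index is consulted first; it overrides the B-tree.
    bool get(std::uint64_t key, std::string* out) const {
        auto d = pending_.find(key);
        if (d != pending_.end()) {
            if (d->second.erased) return false;
            *out = d->second.value;
            return true;
        }
        auto it = btree_.find(key);
        if (it == btree_.end()) return false;
        *out = it->second;
        return true;
    }

    // Full scan: the merged view of both trees, in key order.
    std::vector<std::uint64_t> scan() const {
        Tree merged = btree_;
        apply_pending(merged);
        std::vector<std::uint64_t> keys;
        for (const auto& kv : merged) keys.push_back(kv.first);
        return keys;
    }

    // Commit flushes the deltas to the B-tree; abort just drops them.
    void commit() { apply_pending(btree_); pending_.clear(); active_ = false; }
    void abort() { pending_.clear(); active_ = false; }

private:
    void apply_pending(Tree& target) const {
        for (const auto& d : pending_) {
            if (d.second.erased) target.erase(d.first);
            else target[d.first] = d.second.value;
        }
    }

    Tree btree_;                              // stand-in for the on-disk B-tree
    std::map<std::uint64_t, Delta> pending_;  // transaction index (deltas)
    bool active_ = false;
};
```

Note that `abort()` never touches `btree_`, which is why no undo logic is needed in this model. The copy made in `scan()` is the naive way to merge the two trees and hints at why a real cursor implementation is considerably harder.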

The most powerful feature of Hamsterdb is its testability. The fundamental rule for a database is that it must not lose data, which matters more than performance, so critical bugs have to be found and fixed. In addition, over nine years of development, technical debt had to be paid off and designs that proved poor under specific circumstances had to be eliminated. I respond to user requests and new ideas quickly and in an agile manner, and I am constantly rewriting parts of the code or trying out new ideas. High test coverage gives me the confidence that my changes will not break anything.

The focus on testability and a high degree of automation lets me get a lot done. At the lowest level, the debug build of Hamsterdb is full of asserts and integrity checks. There are approximately 1,800 unit tests and 35,000 acceptance tests. The acceptance tests run in dozens of different configurations and are executed in parallel against BerkeleyDB. The data in both databases is continuously checked for consistency, so any new bug shows up immediately. In addition, each test produces a detailed list of metrics, including memory consumption, number of heap allocations, number of allocated pages, blobs (binary large objects), and B-tree splits and merges.
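Running the same operations against two implementations and comparing every result is often called differential testing. The sketch below shows the idea on a toy scale, assuming `std::set` as the reference in place of BerkeleyDB and a hypothetical `SortedKeySet` as the system under test; neither the names nor the operation mix come from Hamsterdb's test suite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

// System under test: a toy sorted-array key set (hypothetical stand-in).
class SortedKeySet {
public:
    bool insert(std::uint64_t k) {
        auto it = std::lower_bound(keys_.begin(), keys_.end(), k);
        if (it != keys_.end() && *it == k) return false;
        keys_.insert(it, k);
        assert(std::is_sorted(keys_.begin(), keys_.end()));  // integrity check
        return true;
    }
    bool erase(std::uint64_t k) {
        auto it = std::lower_bound(keys_.begin(), keys_.end(), k);
        if (it == keys_.end() || *it != k) return false;
        keys_.erase(it);
        return true;
    }
    bool contains(std::uint64_t k) const {
        return std::binary_search(keys_.begin(), keys_.end(), k);
    }
    std::size_t size() const { return keys_.size(); }

private:
    std::vector<std::uint64_t> keys_;
};

// Applies the same pseudo-random operations to the candidate and to the
// reference, and returns false as soon as any result differs.
bool differential_test(std::uint32_t seed, int operations) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::uint64_t> key_dist(0, 63);
    std::uniform_int_distribution<int> op_dist(0, 2);
    SortedKeySet candidate;
    std::set<std::uint64_t> reference;
    for (int i = 0; i < operations; ++i) {
        std::uint64_t key = key_dist(rng);
        switch (op_dist(rng)) {
        case 0:
            if (candidate.insert(key) != reference.insert(key).second) return false;
            break;
        case 1:
            if (candidate.erase(key) != (reference.erase(key) == 1)) return false;
            break;
        default:
            if (candidate.contains(key) != (reference.count(key) == 1)) return false;
        }
        if (candidate.size() != reference.size()) return false;
    }
    return true;
}
```

A small key range (0 to 63 here) is a deliberate choice: it forces frequent collisions between inserts, erases, and lookups, which is where disagreements between two implementations tend to surface.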

Some of the tests are run under Valgrind. Performance results are compared between runs, so that regressions can be located and fixed quickly.

In addition, tests that simulate database crashes verify Hamsterdb's recoverability. Last but not least, I use QuviQ QuickCheck, a property-based testing tool for Erlang. With QuickCheck you describe the properties of your software; it then runs pseudo-random commands and continuously checks that the properties still hold.

Static code analysis is performed with Coverity's offering for open-source projects and with clang's scan-build tool. Both have found a few minor issues.

All tests are fully automated and run under heavy load before each release. A complete release cycle usually takes several days and wears out a hard disk about every two months.

To sum up what I have learned: writing tests can be fun, and fast iterative development is not possible without reliable tests.

I would also like to introduce Hamsterdb Pro, a commercial version of Hamsterdb that provides heavyweight compression for keys, records, and logs (zlib, snappy, lzf, and lzo), as well as AES encryption and SIMD optimizations for leaf node searches. More compression algorithms (bitmap compression and prefix compression) are in progress. The webpage contains more information.

So far so good, but how does Hamsterdb perform? I used Google's benchmark to compare Hamsterdb 2.1.8 with LevelDB 1.15. Compression was disabled (Hamsterdb does not offer compression yet, but Hamsterdb Pro does). Fsyncs were also disabled; Hamsterdb's recovery feature is implemented with a write-ahead log. The tests range from small keys and records to medium-sized keys with large records, and the amount of inserted data ranges from kilobytes to megabytes. In addition, I ran two of Hamsterdb's analytical functions and their equivalents on LevelDB. The cache size across the test runs ranges from 4 MB to 1 GB. The tests were run on a machine with an HDD and on a machine with an SSD.

Hamsterdb was always configured for fixed-length keys, and the 8-byte keys were stored as uint64 numbers. Since LevelDB needs to convert numbers to strings, this is one of Hamsterdb's advantages.
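To illustrate what converting a number to a string key involves, here is one common encoding for stores that compare keys bytewise, as LevelDB's default comparator does. It is only a sketch: `encode_key` and `decode_key` are hypothetical helpers, and the benchmark itself may well have formatted its keys differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encodes a uint64 as 8 big-endian bytes, so that bytewise comparison of
// the resulting strings matches the numeric order of the integers.
std::string encode_key(std::uint64_t value) {
    std::string out(8, '\0');
    for (int i = 7; i >= 0; --i) {
        out[static_cast<std::size_t>(i)] = static_cast<char>(value & 0xFF);
        value >>= 8;
    }
    return out;
}

// Reverses encode_key; expects exactly 8 bytes.
std::uint64_t decode_key(const std::string& bytes) {
    assert(bytes.size() == 8);
    std::uint64_t value = 0;
    for (unsigned char c : bytes) value = (value << 8) | c;
    return value;
}
```

A plain decimal conversion would not preserve order ("9" sorts after "10"), so each insert and lookup pays for an explicit encoding step like this one, whereas a store with native uint64 keys can use the integer as it is.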

I also added tests with small records (size 8), because such records are typical for secondary indexes, where the record contains the primary key. The two machines use different drives: one has an HDD (Core i7-950, 8 cores, 8 MB cache) and the other has an SSD (Core i5-3550, 4 cores, 8 MB cache). Some of the benchmark results are listed below; for more information, see here.

Figure: Sequential write; key size: 16; record size: 100 (HDD, 1 GB cache)

Figure: Sequential read; key size: 16; record size: 100 (HDD, 1 GB cache)

Figure: Random write; key size: 16; record size: 100 (HDD, 1 GB cache)

Figure: Random read; key size: 16; record size: 100 (HDD, 1 GB cache)

Figure: Calculating the sum of all keys (HDD, 4 MB cache)

Figure: Counting all keys that end in "77" (SSD, 1 GB cache)

For random reads, Hamsterdb performs better than LevelDB. For random writes, Hamsterdb is faster than LevelDB as long as the data volume is not too large. Once the number of keys grows large enough, Hamsterdb suffers from the classic problem of B-tree databases: many non-sequential I/O operations with high disk seek latency.

Even so, the benchmark demonstrates Hamsterdb's analytical capabilities. The sum and count operations in particular scale well. Sequential inserts and scans are also strengths of Hamsterdb and are very fast regardless of the amount of data.

Future work

The benchmark revealed several areas for improvement, above all optimizing random reads and writes and making Hamsterdb concurrent. This will be a major part of my work; I have drafted a design and started refactoring ahead of the release.

Hamsterdb: An Analytical Embedded Key-Value Store
