hamsterdb vs. LevelDB: the confession and counterattack of a non-mainstream database


[Editor's note] Although hamsterdb has been around for 9 years, it still lacks the popularity of MongoDB and is regarded as a non-mainstream database. hamsterdb is an open source key-value database. Unlike other NoSQL databases, however, hamsterdb is single-threaded and not distributed. Its design is closer to that of a column store database, and it also supports ACID transactions at the read-committed isolation level. So what advantages does hamsterdb have over LevelDB? Here we turn to Christoph Rupp, one of the project's contributors.

The following is the translation:

In this article, I would like to introduce hamsterdb, an Apache 2-licensed, embedded, analytical key-value database similar to Google's LevelDB and Oracle's BerkeleyDB.

hamsterdb is not a new competitor. In fact, it has been around for 9 years. In that time it has grown considerably, and its focus has shifted from plain key-value storage to analytical database techniques similar to those of a column store database.

hamsterdb is single-threaded and not distributed, and it is linked directly into user applications. It provides a unique transaction implementation, along with features similar to those of a column store database, which makes it well suited to analytical workloads. It can be used natively from C/C++ and also has bindings for Erlang, Python, Java, .NET, and even Ada. It has tens of millions of deployments in embedded devices and on-premise applications, and it also serves in the cloud, for example for caching and indexing.

hamsterdb has a feature that is unique among key-value stores: it understands schema information. While most databases neither know nor care what type of key is inserted, hamsterdb supports two kinds of key types: binary keys (fixed length or variable length) and numerical keys (such as UInt32, UInt64, Real32, and Real64).

A hamsterdb database is a Btree index stored in a file or in memory. The Btree implementation is what gives hamsterdb its analytical power. The Btree index is implemented with C++ templates whose parameters depend on the key type, the size of the records (fixed length or variable length), and whether duplicate keys are enabled, so that the layout of each Btree node is optimized for its workload. For fixed-length keys there is zero overhead per key, and the keys are laid out as a simple array. Under the hood, a database with UInt64 keys is a plain C array of type uint64_t.

This representation reduces I/O and uses the CPU cache more efficiently. Modern CPUs are optimized for processing sequential memory, and hamsterdb takes advantage of this. For example, when searching a leaf node, binary search is abandoned once the remaining range falls below a certain threshold and is replaced by linear search. hamsterdb also has APIs equivalent to the SQL commands COUNT, COUNT DISTINCT, SUM, and AVERAGE. These are very fast on fixed-length keys, because they run directly on the Btree data.
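As a rough illustration of these two ideas (not hamsterdb's actual code), the sketch below stores fixed-length keys in a packed array of 64-bit integers, switches from binary to linear search once the range is small, and runs aggregates directly on the array. The names `find_key`, `leaf_sum`, `leaf_count`, `leaf_average`, and the threshold value are hypothetical.

```python
from array import array

LINEAR_THRESHOLD = 8  # hypothetical cut-over point, not hamsterdb's real value


def find_key(keys, target):
    """Return the index of target in the sorted array keys, or -1."""
    lo, hi = 0, len(keys)
    # Binary search while the remaining range [lo, hi) is still large.
    while hi - lo > LINEAR_THRESHOLD:
        mid = (lo + hi) // 2
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid + 1  # keep mid inside the range, it may be the target
    # Linear scan over the small remaining range (sequential memory access).
    for i in range(lo, hi):
        if keys[i] == target:
            return i
        if keys[i] > target:
            break
    return -1


def leaf_sum(keys):
    """SUM over all keys, computed directly on the packed array."""
    return sum(keys)


def leaf_count(keys, predicate):
    """COUNT of the keys for which predicate(key) is true."""
    return sum(1 for k in keys if predicate(k))


def leaf_average(keys):
    """AVERAGE over all keys; assumes the array is not empty."""
    return sum(keys) / len(keys)


# A "leaf" of 1000 sorted uint64 keys: 0, 2, 4, ..., 1998.
leaf = array("Q", range(0, 2000, 2))
```

Because the keys sit next to each other with no per-key header, both the final linear scan and the aggregate functions touch memory strictly in order.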

hamsterdb also supports variable-length keys. For these, each Btree node has a very small index up front that points into the node's payload. This can lead to fragmentation when existing keys are resized or deleted, so the node has to be "vacuumized" to avoid wasting space. The vacuumize operation turned out to be a performance killer. Making it fast was a major challenge, and it is used as sparingly as possible.
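The following toy node sketches how such a layout can behave. It is an assumption-laden illustration, not hamsterdb's implementation: the class `VarKeyNode` and its methods are hypothetical, and a real node would live in a fixed-size page rather than a growing `bytearray`.

```python
class VarKeyNode:
    """Toy node: a small slot index up front, key bytes in a payload area."""

    def __init__(self):
        self.slots = []             # (offset, length) per key, in key order
        self.payload = bytearray()  # raw key bytes, append-only until vacuumized

    def key(self, i):
        offset, length = self.slots[i]
        return bytes(self.payload[offset:offset + length])

    def keys(self):
        return [self.key(i) for i in range(len(self.slots))]

    def insert(self, key):
        # Append the bytes, then place the slot so that the slots stay sorted.
        offset = len(self.payload)
        self.payload += key
        pos = 0
        while pos < len(self.slots) and self.key(pos) < key:
            pos += 1
        self.slots.insert(pos, (offset, len(key)))

    def erase(self, key):
        # Only the slot is dropped; the key bytes stay behind as a hole.
        for i in range(len(self.slots)):
            if self.key(i) == key:
                del self.slots[i]
                return True
        return False

    def wasted(self):
        """Number of payload bytes no longer referenced by any slot."""
        return len(self.payload) - sum(length for _, length in self.slots)

    def vacuumize(self):
        # Rewrite the payload with only the live keys and fix up the offsets.
        fresh = bytearray()
        new_slots = []
        for offset, length in self.slots:
            new_slots.append((len(fresh), length))
            fresh += self.payload[offset:offset + length]
        self.payload, self.slots = fresh, new_slots


node = VarKeyNode()
for k in (b"pear", b"apple", b"fig"):
    node.insert(k)
node.erase(b"fig")          # leaves a 3-byte hole in the payload
holes_before = node.wasted()
node.vacuumize()            # copies every live key, hence the cost
```

Vacuumizing copies every live key in the node, which is why running it on each erase would be expensive and why it pays to trigger it only when the wasted space becomes significant.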

hamsterdb allows duplicate keys, which means that a key may point to more than one record. All records of a key are grouped together. They are handled with an index structure like the one used for variable-length keys. (If a key has too many duplicate records, they are removed from the Btree node and stored in a separate overflow area.)

hamsterdb supports ACID transactions at the read-committed isolation level. Transactional updates are kept in memory as delta operations. Each database has its own transaction index, and the updates in this index take priority over the contents of the Btree. Aborting a transaction simply means discarding its updates from the transaction index. Committing a transaction means flushing its updates to the Btree.

This unique design choice brings strong advantages. Transactional updates stay in RAM and do not cause I/O. No undo logic is required when a transaction is aborted, because the updates of an aborted transaction are never persisted. The recovery logic uses a simple logical log. But there is also a significant challenge: at runtime, the two trees must be merged. Imagine a full scan with a database cursor, and things quickly become complicated. Some keys exist in the Btree, others in the transaction tree. Keys in the Btree can be overwritten or deleted in the transaction tree, or even modified by other transactions. Things get even more difficult when duplicate keys are involved.
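A heavily simplified sketch of such a merged scan is shown below. It assumes a single transaction and no duplicate keys, uses a sorted list as a stand-in for the Btree and a dictionary as a stand-in for the transaction index, and all names (`ERASED`, `merged_scan`, `commit`) are hypothetical.

```python
ERASED = object()  # marker for a key that the transaction deleted


def merged_scan(btree_items, txn_ops):
    """Yield (key, record) pairs in key order, txn_ops overriding btree_items.

    btree_items: sorted list of (key, record), standing in for the Btree.
    txn_ops: dict mapping key to a record or ERASED, standing in for the
             transaction index.
    """
    txn_keys = sorted(txn_ops)
    i = j = 0
    while i < len(btree_items) or j < len(txn_keys):
        take_txn = j < len(txn_keys) and (
            i == len(btree_items) or txn_keys[j] <= btree_items[i][0])
        if take_txn:
            key = txn_keys[j]
            # Same key in both trees: the transaction's version wins.
            if i < len(btree_items) and btree_items[i][0] == key:
                i += 1
            j += 1
            if txn_ops[key] is not ERASED:
                yield key, txn_ops[key]
        else:
            yield btree_items[i]
            i += 1


def commit(btree_items, txn_ops):
    """Committing flushes the delta operations into the (toy) Btree."""
    return list(merged_scan(btree_items, txn_ops))


btree = [(1, "a"), (3, "c"), (5, "e")]
txn = {2: "B", 3: ERASED, 5: "E", 7: "G"}
visible = list(merged_scan(btree, txn))
# Aborting needs no undo step: the dictionary is dropped and btree is untouched.
```

Even in this reduced form the cursor has to track two positions and resolve conflicts at every step, which hints at why several open transactions and duplicate keys make the real problem much harder.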

One of hamsterdb's most important features is testability. The fundamental rule for a database is that it must not lose data, which is even more important than performance. Critical bugs have to be ruled out. In addition, over nine years of development I have had to deal with technical debt, removing design decisions that turned out to be poor under specific circumstances. To stay agile and respond quickly to user needs and new ideas, I keep rewriting parts of the code and trying out new approaches. High test coverage gives me confidence that my changes will not break anything.

Focusing on testability and a high degree of automation is what allows me to do all this. At the lowest level, hamsterdb's debug build is full of asserts and integrity checks. There are approximately 1,800 unit tests and 35,000 acceptance tests. The acceptance tests run in dozens of different configurations and are executed in parallel against BerkeleyDB. The data in the two databases is continuously checked for equality, so any new bug shows up immediately. In addition, each test reports a detailed list of metrics, including memory consumption, number of heap allocations, number of allocated pages, blobs (binary large objects), Btree splits and merges, and so on.

Some tests are run under Valgrind. The performance results are compared with those of earlier runs, so that regressions are found quickly and can be fixed.

In addition, hamsterdb's recovery is tested by simulating database crashes. Last but not least, I use Quviq's QuickCheck, a property-based testing tool for Erlang. QuickCheck lets you describe the properties of your software and then runs pseudo-random instructions, constantly verifying that the properties hold.
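The combination of pseudo-random instructions and a reference implementation can be sketched in a few lines. This is only an illustration of the technique: `SortedStore` is a hypothetical stand-in for the database under test, and a plain dictionary plays the role that BerkeleyDB plays in the acceptance tests described above.

```python
import bisect
import random


class SortedStore:
    """Toy store under test: parallel sorted lists of keys and records."""

    def __init__(self):
        self.keys, self.records = [], []

    def put(self, key, record):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.records[i] = record       # overwrite an existing key
        else:
            self.keys.insert(i, key)
            self.records.insert(i, record)

    def erase(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            del self.keys[i]
            del self.records[i]

    def items(self):
        return list(zip(self.keys, self.records))


def run_differential_test(seed, steps=1000):
    """Apply random operations to both stores and compare after each one."""
    rng = random.Random(seed)              # seeded, so failures are repeatable
    store, reference = SortedStore(), {}
    for step in range(steps):
        key = rng.randrange(50)            # small key space forces collisions
        if rng.random() < 0.7:
            record = rng.randrange(1_000_000)
            store.put(key, record)
            reference[key] = record
        else:
            store.erase(key)
            reference.pop(key, None)
        # Check equality after every instruction so a bug shows up at once.
        assert store.items() == sorted(reference.items()), (seed, step)
    return len(reference)


for seed in range(5):
    run_differential_test(seed, steps=300)
```

The failing seed and step number are attached to the assertion, so a mismatch can be replayed exactly.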

Static code analysis is done with Coverity's open source offering and with clang's scan-build tool. Both found a few minor issues.

All tests are fully automated and are run under high load before each release. A full release cycle usually takes a few days, and I wear out a hard drive every two months.

To sum up what I have learned: writing tests can be a very enjoyable thing, and iterative development is not feasible without reliable tests.

Let me also introduce the commercial version of hamsterdb, hamsterdb Pro, which provides heavyweight compression (zlib, snappy, LZF, and LZO) for keys, records, and logs, as well as AES encryption and SIMD optimizations for leaf-node lookups. More compression algorithms (bitmap compression and prefix compression) are in progress or planned. More information is available on the web page.

So far so good, but how does hamsterdb perform? I used Google's benchmark to compare hamsterdb 2.1.8 with LevelDB 1.15. Compression was disabled (hamsterdb does not offer compression, but hamsterdb Pro does). Fsyncs were disabled as well, and so was hamsterdb's recovery feature (which is implemented with a write-ahead log). The tests range from small keys and records to medium-sized keys and larger records, and the amount of inserted data ranges from 100K to 100M items. In addition, I ran two of hamsterdb's analytical functions, sum and count, against LevelDB as well. All tests were run with cache sizes ranging from 4 MB to 1 GB, on one machine with an HDD and one with an SSD.

The hamsterdb configuration was always based on fixed-length keys, and for 8-byte keys hamsterdb stored uint64 numbers. This gives hamsterdb an advantage, since LevelDB needs to convert the number to a string.
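The difference between the two key representations can be illustrated as follows. This sketch assumes keys are ordered by a byte-wise comparator, which is LevelDB's default; the helper names and the padding width of 16 are hypothetical choices for the example, not taken from the benchmark code.

```python
import struct


def fixed_key(n):
    """8-byte big-endian encoding of an unsigned 64-bit number."""
    return struct.pack(">Q", n)


def string_key(n):
    """Plain decimal string encoding of the number."""
    return str(n).encode("ascii")


def padded_string_key(n, width=16):
    """Zero-padded decimal string, so byte order matches numeric order."""
    return str(n).zfill(width).encode("ascii")


numbers = [9, 10, 100, 2]

# Sort the numbers by the byte-wise order of each encoding.
by_fixed = sorted(numbers, key=fixed_key)
by_string = sorted(numbers, key=string_key)
by_padded = sorted(numbers, key=padded_string_key)
```

Here `by_fixed` and `by_padded` come out in numeric order, while `by_string` does not, because `b"10"` sorts before `b"2"`. A string-keyed store therefore pays for a conversion on every operation and needs padding to keep numeric order, whereas a store with native uint64 keys can compare the numbers directly.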

I also added tests for small records (size 8), because these are typically used for secondary indexes, where the record contains the primary key. Two machines were used: one with an HDD (Core i7-950, 8 cores, 8 MB cache) and one with an SSD (Core i5-3550, 4 cores, 8 MB cache). Below are some of the benchmark results; the full results can be seen here.

[Figure: Sequential writes; key size: 16; HDD, 1 GB cache]

[Figure: Sequential reads; key size: 16; HDD, 1 GB cache]

[Figure: Random writes; key size: 16; HDD, 1 GB cache]

[Figure: Random reads; key size: 16; HDD, 1 GB cache]

[Figure: Calculating the sum of all keys; HDD, 4 MB cache]

[Figure: Counting the keys that evaluate to "77"; SSD, 1 GB cache]

For random reads, hamsterdb performs better than LevelDB. For random writes, hamsterdb is faster than LevelDB as long as the amount of data is not too large. From 10 million keys onward, hamsterdb suffers from the traditional problem of a Btree database: a large amount of non-sequential I/O with high disk seek latency.

That said, the tests are a good demonstration of hamsterdb's analytical power. The sum and count operations in particular scale very well. Sequential inserts and scans are another strong point of hamsterdb; they are very fast no matter how large the amount of data.

Future work

This benchmark revealed an area for improvement: random reads and writes, which I plan to optimize by making hamsterdb concurrent. This is going to be a major part of my upcoming work. I have already sketched out a design and will refactor before the release.

Original link: hamsterdb: An Analytical Embedded Key-value Store (translated by Dongyang; edited by Zhonghao)
