Comparison of three basic storage engines

1. Hash Storage Engine

Representative databases: Redis, Memcached, etc.

Hash indexing is also commonly used to speed up lookups inside other storage engines. Because of the particular structure of a hash index, retrieval is very efficient: a lookup locates the data in a single step, whereas a B-Tree index must walk from the root node through branch nodes down to a leaf node, incurring multiple I/O accesses. A hash index query is therefore much faster than a B-Tree index query. Despite this efficiency, the same structure imposes many restrictions and drawbacks on hash indexes.

The main disadvantages:

(1) Hash indexes only support "=", "IN" and "<=>" queries; they cannot serve range queries.
(2) Hash indexes cannot be used to avoid sorting the data.
(3) Hash indexes cannot be used with a partial index key (for example, the leftmost prefix of a composite key).
(4) Hash indexes cannot always avoid a table scan.
A hash collision means a chain scan: because different index keys can produce the same hash value, a query cannot be answered from the hash index alone. Even after the matching hash bucket is found, the engine must still fetch the rows from the table and compare the actual key values to obtain the correct results.
(5) When many keys share the same hash value, a hash index is not necessarily faster than a B-Tree index.
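The equality-only limitation above can be sketched with a toy hash index in Python (the class and names are illustrative, not any real engine's API). Note how the collision chain forces a final comparison against the stored keys, and how key order is destroyed by hashing:

```python
# Toy hash index: keys are hashed into buckets, so only "=" lookups are
# possible; a range scan cannot use the index because bucket order is
# unrelated to key order. Illustrative sketch only.

class HashIndex:
    def __init__(self):
        self.buckets = {}  # hash(key) -> list of (key, row) pairs

    def put(self, key, row):
        self.buckets.setdefault(hash(key), []).append((key, row))

    def get(self, key):
        # One hash computation locates the bucket; collisions still
        # require comparing stored keys (the "chain scan").
        return [row for k, row in self.buckets.get(hash(key), []) if k == key]

idx = HashIndex()
idx.put("alice", {"age": 30})
idx.put("bob", {"age": 25})
print(idx.get("alice"))  # equality lookup works in O(1): [{'age': 30}]
# A query like "all keys between 'a' and 'c'" cannot use this index at
# all: the buckets hold hashed keys in arbitrary order.
```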

2. B-Tree Storage Engine

Representative databases: MongoDB and MySQL (and most relational databases)

There is also a variant of the B-tree storage engine: the COLA tree (Cache-Oblivious B-Tree).

Representative database: TokuDB

To make B-trees more effective, researchers proposed cache-oblivious algorithms, which work efficiently without explicitly knowing the data-transfer sizes between levels of the memory hierarchy. For more information, see http://en.wikipedia.org/wiki/Cache-oblivious_algorithm.

A familiar example is TokuMX: a version of the popular NoSQL database MongoDB whose underlying storage layer is replaced with the same fractal-tree engine used by TokuDB, with very good results.

3. LSM Tree (Log-Structured Merge Tree) Storage Engine

Representative databases: nessDB, LevelDB, HBase, etc.

The core idea is to give up some read performance in exchange for maximum write performance. LSM Tree stands for Log-Structured Merge Tree. The idea is simple: assuming memory is large enough, there is no need to write data to disk on every update. Instead, the latest data is first held in memory; once enough updates have accumulated, the in-memory data is sorted and appended sequentially to the end of the file on disk (because all the sorted runs are ordered, they can be combined quickly with a merge sort).
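The buffer-then-flush idea can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions (the "disk" is just a Python list, and the class name and sizes are invented for the example), not a real engine:

```python
import bisect

# Minimal LSM sketch: updates go to an in-memory buffer (memtable);
# when it fills, the buffer is sorted once and "flushed" as one
# sequential, already-sorted run. Reads check memory first, then the
# runs from newest to oldest.

class TinyLSM:
    def __init__(self, memtable_limit=3):
        self.memtable = {}   # latest updates, kept in memory
        self.runs = []       # flushed sorted runs, newest first
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # Sort once in memory, then append to disk sequentially.
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:          # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

db = TinyLSM()
for k, v in [("b", 1), ("a", 2), ("c", 3), ("a", 9)]:
    db.put(k, v)
print(db.get("a"))  # 9: the newest value is found first
```

Note that a write never seeks into the middle of an existing run; the cost of random writes has been traded for extra work at read time.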

The log-structured merge tree (LSM-tree) is a disk-based data structure. Compared with a B-tree, it significantly reduces disk-arm (seek) overhead and sustains high-speed insertions and deletions over long periods. However, LSM-tree performance is poor in some situations, especially when queries require a fast response. In general, the LSM-tree suits applications where index insertions are much more frequent than lookups. Bigtable, for example, stores its logs and SSTables in GFS when serving tablets; GFS is designed for appending new data rather than overwriting old data. The LSM-tree defers and batches index updates through rolling merges and multi-page blocks, making full use of memory to hold recent or frequently used data (reducing lookup cost) and of disk to hold infrequently used data (reducing storage cost).

Technical characteristics of a disk: a disk is used most efficiently by reading or writing a fixed-size block of data in one sequential pass, and by minimizing the number of random seeks.

Comparing the LSM tree with the B-tree is a matter of weighing read performance against write performance: one side is sacrificed, and other mechanisms are found to compensate for it.

1. The LSM tree batches writes at the cost of some latency. When writes greatly outnumber reads, an LSM tree performs better than a B-tree, because B-tree inserts must split nodes to maintain the tree structure, which increases random disk I/O and degrades performance. The LSM tree converts many random single-page writes into one batched multi-page sequential write, amortizing the disk seek time and greatly improving efficiency.

2. The write process of a B-tree: a B-tree write is an in-place update, in two steps: first find the block that should hold the key, then write the new data into that block at its physical location on disk. Of course, with enough memory part of the B-tree can be cached, so the block lookup may complete in memory; but to keep the illustration clear, assume memory is so small that it can hold only a single B-tree block. In that model, writing one record costs two random seeks (one for the lookup, one for the in-place write), which is expensive.

3. The LSM tree gives up some disk read performance in exchange for sequential writes. At first glance, reads are the property most systems most need to guarantee, so trading reads for writes seems like a bad deal. But consider the following analysis.

A. Memory is far faster than disk, by a factor of more than a thousand. Read performance therefore depends mainly on the memory hit rate, not on the number of disk reads.

B. Writes that do not occupy disk I/O leave the disk available to reads for longer, which also improves read efficiency. For example, although LevelDB's SSTables reduce raw read performance, if the cache hit rate on the data is maintained, reads get more disk I/O opportunities; read performance is therefore essentially unchanged and may even improve, while write performance improves dramatically, roughly 5 to 10 times.

The following is a detailed example:

In an LSM tree we build many small sorted structures: for example, every m records (say 100) are sorted once in memory, then the next m records, and so on. With N records this yields N/m small sorted runs.

During a query, since we don't know which run holds the data, we binary-search the newest small sorted structure first; if the data is found, return it; otherwise continue with the next structure, until it is found (or every structure has been searched).

It is easy to see that in this mode a read costs up to (N/m) binary searches of log2(m) each, i.e. (N/m) * log2(m) in total, so read efficiency drops.
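The read path just described can be demonstrated directly (the values of N and m are arbitrary for the example): a lookup probes the runs newest-first, paying one binary search per run, so a key in the oldest run costs the full N/m probes.

```python
import bisect

# N records split into N/m small sorted runs; a lookup binary-searches
# each run from newest to oldest, up to (N/m) * log2(m) comparisons.
N, m = 16, 4
data = list(range(N))
runs = [sorted(data[i:i + m]) for i in range(0, N, m)][::-1]  # newest first

def lookup(key):
    probes = 0
    for run in runs:
        probes += 1                        # one binary search per run
        i = bisect.bisect_left(run, key)
        if i < len(run) and run[i] == key:
            return run, probes
    return None, probes

run, probes = lookup(2)
print(probes)  # 4: key 2 lives in the oldest run, so all N/m runs are probed
```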

This is the original LSM-tree idea. In this form reads are still rather slow, so something must be done to improve them. What can be done?

LSM Tree optimization methods:

A. Bloom filter: a probabilistic bitmap that can quickly tell you whether a given key might exist in a small sorted structure. Instead of a binary search, a few hash computations reveal whether the key can be in that set; if the answer is "no", that structure can be skipped entirely. This improves efficiency at very little cost.

B. Compaction: merge the small trees into a big tree. Because having many small trees hurts read performance, a background process continually merges the small trees into a big tree. Older data can then be found directly with a single log2(N) search, rather than the (N/m) * log2(m) work described above.
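Because every run is already sorted, compaction reduces to a k-way merge, which Python's standard library provides directly. A sketch with illustrative data:

```python
import heapq

# Compaction sketch: merge several small sorted runs into one big
# sorted run with a k-way merge, so later queries perform a single
# log2(N) binary search instead of one search per small run.
small_runs = [[1, 5, 9], [2, 6, 7], [0, 3, 8]]
big_run = list(heapq.merge(*small_runs))
print(big_run)  # [0, 1, 2, 3, 5, 6, 7, 8, 9]
```

heapq.merge streams the inputs rather than loading them all at once, which matches how a real compaction reads runs sequentially from disk and writes one sequential output.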
