LSM Tree Storage model

Source: Internet
Author: User

Databases are commonly built on one of three basic storage structures:

    • Hash table: supports insert, delete, update, and random read operations, but not sequential scans. The corresponding storage systems are key-value stores. For key-value inserts and lookups, a hash table has O(1) average complexity, faster than the O(log n) of tree operations, so if you do not need ordered traversal of the data, a hash table is a good choice;
    • B+ tree: supports insert, delete, read, and update of single records, and also supports sequential scans (via the pointers between B+ tree leaf nodes). The corresponding storage systems are relational databases (MySQL, etc.).
    • LSM tree (log-structured merge-tree): like the B+ tree, it supports insert, delete, read, update, and sequential scan operations, and it avoids random disk writes by writing to disk in batches. There is a trade-off: compared with the B+ tree, the LSM tree sacrifices some read performance to substantially improve write performance. Databases based on the LSM tree include LevelDB and HBase.
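
To make the difference between point lookups and sequential scans concrete, here is a small sketch. It is only an illustration: the keys, the values, and the `range_scan` helper are hypothetical, and a sorted Python list stands in for the ordered leaf level of a B+ tree or an LSM tree.

```python
import bisect

# Hash table: O(1) average point lookups, but no useful key order.
hash_index = {"k3": "c", "k1": "a", "k2": "b", "k5": "e"}

# Sorted keys: a stand-in for the ordered leaf level of a tree structure.
sorted_keys = sorted(hash_index)

def range_scan(lo, hi):
    """Return (key, value) pairs with lo <= key < hi, in key order."""
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_left(sorted_keys, hi)
    return [(k, hash_index[k]) for k in sorted_keys[start:end]]

point = hash_index["k2"]        # point lookup, no ordering needed
scan = range_scan("k2", "k5")   # ordered scan needs the sorted structure
```

The point lookup never touches the sorted list, while the range scan depends on it entirely, which is why a pure hash table cannot serve ordered traversal.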

The design idea of the LSM tree is simple: changes to the data are kept in memory, and once they reach a specified size limit they are written to disk in a batch. Reads are more involved, because they must merge the historical data on disk with the most recent changes in memory. Write performance therefore improves greatly, while a read may have to check memory first and then access one or more disk files. The LSM tree is thus suited to workloads where the volume of inserts is large, the volume of reads and updates is comparatively low, and reads mostly target the most recent data.
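
The idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular database implements it: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of in-memory lists in place of real disk files are all hypothetical.

```python
import bisect

class TinyLSM:
    """Toy LSM store: buffer writes in memory, flush sorted batches."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # assumed flush threshold
        self.memtable = {}                    # most recent changes
        self.runs = []                        # flushed batches, oldest first
                                              # (lists stand in for disk files)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the buffered changes out as one key-sorted batch."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Read memory first, then the flushed batches from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

With `memtable_limit=2`, two `put` calls trigger one flush; a later `put` of an existing key stays in memory and shadows the flushed value on reads, which is why reads must consult memory before the older batches.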

The LSM tree splits one big tree into n small trees. Data is first written to a small tree in memory; as the in-memory trees grow, they are flushed to disk, and the trees on disk can be merged into one tree periodically to optimize read performance.

Data is first inserted into the in-memory tree. When the data in the in-memory tree exceeds a certain threshold, a merge operation occurs. The merge iterates over the leaf nodes of the in-memory tree from left to right and merges them with the leaf nodes of the on-disk tree. When the amount of merged data reaches the size of a disk storage page, the merged data is persisted to disk and the parent node's pointer to the leaf node is updated.
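
The merge step can be sketched as a two-way merge of sorted entries that emits output one page at a time. This is a simplified illustration: the function name `merge_runs` and the `page_size` parameter are assumptions, plain lists stand in for leaf nodes and disk pages, and the parent-pointer update is omitted.

```python
def merge_runs(newer, older, page_size):
    """Merge two key-sorted lists of (key, value) pairs into pages.

    `newer` plays the role of the in-memory leaves and `older` the
    on-disk leaves; when both contain a key, the newer value wins.
    """
    pages, page = [], []

    def emit(entry):
        page.append(entry)
        if len(page) == page_size:   # a full page would be written to disk
            pages.append(list(page))
            page.clear()

    i = j = 0
    while i < len(newer) and j < len(older):
        if newer[i][0] < older[j][0]:
            emit(newer[i]); i += 1
        elif newer[i][0] > older[j][0]:
            emit(older[j]); j += 1
        else:                        # same key: keep the newer value
            emit(newer[i]); i += 1; j += 1
    for entry in newer[i:]:
        emit(entry)
    for entry in older[j:]:
        emit(entry)
    if page:                         # final, partially filled page
        pages.append(list(page))
    return pages
```

Because both inputs are already sorted, the merge reads each of them once from left to right and produces pages in key order, so the output can be written sequentially.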

After leaf nodes that already existed on disk have been merged, the old data is not deleted in place; instead, it is copied together with the in-memory data and written to a new location on disk in sequential order. This wastes some space, but the LSM tree provides mechanisms to reclaim it.
The non-leaf nodes of the on-disk tree are also cached in memory.
A lookup searches the in-memory tree first and falls back to the on-disk tree if no result is found.
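
The lookup path described here might look roughly like the following sketch. All names and data are hypothetical: `memtable` stands in for the in-memory tree, `disk_pages` for the leaf pages on disk, and `page_index` (the first key of each page) for the cached non-leaf nodes.

```python
import bisect

# Leaf pages of the on-disk tree, each sorted by key (stand-in for disk).
disk_pages = [
    [("a", 1), ("c", 3)],
    [("e", 5), ("g", 7)],
    [("i", 9)],
]

# Cached in memory: the first key of every page, playing the role of
# the non-leaf nodes that point at the leaves.
page_index = [page[0][0] for page in disk_pages]

# The in-memory tree with the most recent changes.
memtable = {"c": 30, "z": 26}

def lookup(key):
    # 1. Search the in-memory tree first.
    if key in memtable:
        return memtable[key]
    # 2. Use the cached index to pick the single page that could hold the key.
    pos = bisect.bisect_right(page_index, key) - 1
    if pos < 0:
        return None
    # 3. Read that one page (one disk access in a real system).
    for k, v in disk_pages[pos]:
        if k == key:
            return v
    return None
```

Keeping the index in memory means a miss in the in-memory tree costs at most one page read in this sketch, rather than a walk down the on-disk tree.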

One obvious problem is that when the amount of data is very large, the on-disk tree becomes correspondingly large, which slows down merging. One solution is to build a hierarchy of trees in which each lower-level tree holds a larger dataset than the tree in the level above. Let the in-memory tree be C0 and the on-disk trees, in order of level, be C1, C2, C3, ..., Ck-1, Ck. Merging then proceeds in the order (C0, C1), (C1, C2), ..., (Ck-1, Ck).
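
A rough sketch of the cascading merge order follows. It rests on several assumptions that are not in the text above: the class name `LeveledLSM`, a fixed growth `ratio` between level capacities, a merge trigger based on entry count, and plain dicts standing in for the sorted trees at each level.

```python
class LeveledLSM:
    """Toy hierarchy C0, C1, ..., Ck with growing capacity per level."""

    def __init__(self, c0_limit=2, ratio=2, num_disk_levels=3):
        # levels[0] is the in-memory tree C0; levels[i] is on-disk tree Ci.
        self.levels = [{} for _ in range(num_disk_levels + 1)]
        # Assumed capacities: each level holds `ratio` times the one above.
        self.limits = [c0_limit * ratio ** i for i in range(num_disk_levels + 1)]

    def put(self, key, value):
        self.levels[0][key] = value
        self._compact()

    def _compact(self):
        # Merge in the order (C0, C1), (C1, C2), ..., (Ck-1, Ck).
        # The last level has no successor and simply keeps growing.
        for i in range(len(self.levels) - 1):
            if len(self.levels[i]) >= self.limits[i]:
                merged = dict(self.levels[i + 1])
                merged.update(self.levels[i])   # upper (newer) level wins
                self.levels[i + 1] = merged
                self.levels[i] = {}

    def get(self, key):
        # Upper levels hold newer data, so search from C0 downwards.
        for level in self.levels:
            if key in level:
                return level[key]
        return None
```

Each merge only involves two adjacent levels, so a full C0 is merged into the comparatively small C1 instead of into one very large on-disk tree.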

Why inserts into an LSM tree are fast:

    • An insert first goes to the in-memory tree, which is never very large, so it is quick;
    • Merge operations write one or more disk pages sequentially, which is much faster than random writes.

Reference Documentation:

Http://www.2cto.com/database/201411/350877.html
