B-Tree
The B in B-Tree is usually read as "balanced". A B-Tree is a multi-way, self-balancing search tree. It resembles an ordinary balanced binary tree; the difference is that a B-Tree allows each node to have more than two child nodes.
[Figure: simplified diagram of a B-Tree]
B-Tree has the following characteristics:
Keys are distributed across all the nodes of the tree;
Each key appears in exactly one node;
A search may end at a non-leaf node;
A lookup amounts to one search over the complete key set, and its performance approaches that of binary search.
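The characteristics above can be illustrated with a minimal lookup sketch. The `BTreeNode` class and `btree_search` function are hypothetical names chosen for this example, not part of any library, and the tree is built by hand rather than by a real insertion routine.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class BTreeNode:
    keys: List[int]                  # sorted keys held by this node
    values: List[Any]                # values[i] belongs to keys[i]
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def btree_search(node: Optional[BTreeNode], key: int) -> Any:
    """Return the value stored for key, or None if the key is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)      # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node.values[i]            # can happen at a non-leaf node
        if not node.children:
            return None                      # reached a leaf without a match
        node = node.children[i]              # descend into the i-th subtree
    return None


# Hand-built two-level tree: the root holds 20 and 40, the leaves hold the rest.
root = BTreeNode(
    keys=[20, 40],
    values=["v20", "v40"],
    children=[
        BTreeNode([5, 10], ["v5", "v10"]),
        BTreeNode([25, 30], ["v25", "v30"]),
        BTreeNode([45, 50], ["v45", "v50"]),
    ],
)
```

Here `btree_search(root, 40)` stops at the root, while `btree_search(root, 25)` has to descend to a leaf. Each key lives in exactly one node.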
B+ Tree
A B+ Tree is a variant of the B-Tree and is also a multi-way search tree. It differs from the B-Tree in the following ways:
All keys are stored in the leaf nodes; internal (non-leaf) nodes do not store the actual data, only keys used for routing.
A chain pointer is added to every leaf node, linking the leaves together.
[Figure: simplified diagram of a B+ Tree]
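As a rough sketch of the B+ Tree layout, the example below separates routing nodes from data-carrying leaves. The class names `Internal` and `Leaf`, the function `bplus_get`, and the convention that a separator key equals the smallest key of the subtree to its right are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Any, List, Optional, Union


@dataclass
class Leaf:
    keys: List[int]
    values: List[Any]                # the real data lives only in leaves
    next: Optional["Leaf"] = None    # chain pointer to the right sibling


@dataclass
class Internal:
    keys: List[int]                  # separator keys only, no row data
    children: List[Union["Internal", Leaf]]


def bplus_get(node: Union[Internal, Leaf], key: int) -> Any:
    """Look up key; the search always runs down to a leaf."""
    while isinstance(node, Internal):
        # Keys equal to a separator are stored in the subtree to its right.
        node = node.children[bisect_right(node.keys, key)]
    for k, v in zip(node.keys, node.values):
        if k == key:
            return v
    return None


# Hand-built tree: three chained leaves under one routing node.
leaf_c = Leaf([40, 45], ["r40", "r45"])
leaf_b = Leaf([20, 25, 30], ["r20", "r25", "r30"], next=leaf_c)
leaf_a = Leaf([5, 10], ["r5", "r10"], next=leaf_b)
tree = Internal(keys=[20, 40], children=[leaf_a, leaf_b, leaf_c])
```

Unlike a B-Tree lookup, `bplus_get(tree, 20)` does not stop at the routing node even though 20 appears there as a separator; the value is only found in a leaf.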
Why use B-/B+ Tree
Data structures such as red-black trees can also be used to implement indexes, but file systems and database systems generally use B-/B+ Trees as their index structure. MySQL is a disk-based database system, and an index is usually stored on disk as an index file, so every index lookup incurs disk I/O. Compared with memory access, a disk I/O access costs several orders of magnitude more, so the index structure should minimize the number of disk I/O accesses during a lookup. The choice of B-/B+ Trees is also tied to how disks are accessed.
Principle of locality and disk pre-reading
Because disk access is far slower than memory access, disk I/O should be kept to a minimum to improve efficiency. For this reason the disk is usually not read strictly on demand. Instead, every read is a read-ahead: starting from the requested position, the disk sequentially reads a certain length of data into memory. The rationale is the well-known principle of locality in computer science:
When a piece of data is used, the data near it is usually used soon afterwards. The data a program needs while it runs is usually clustered together.
Because sequential disk reads are very efficient (no seek time and very little rotational latency), read-ahead improves I/O efficiency for programs that exhibit locality. The read-ahead length is generally an integer multiple of the page size.
MySQL (which uses the InnoDB engine by default) manages records in pages, with a default page size of 16 KB (this value can be changed). The default page size on Linux is 4 KB.
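The "integer multiple of the page" rule can be sketched as simple rounding arithmetic. The function `readahead_span` and the 4096-byte `PAGE_SIZE` are assumptions for this example; it only shows how a request is widened to whole pages, not how any real kernel or disk chooses its read-ahead window.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumed page size in bytes


def readahead_span(offset: int, length: int,
                   page_size: int = PAGE_SIZE) -> Tuple[int, int]:
    """Return (start, size) of the page-aligned region covering the request."""
    start = (offset // page_size) * page_size                # round down
    end = -(-(offset + length) // page_size) * page_size     # round up
    return start, end - start


# A 100-byte request inside the second page is widened to that whole page.
span_one_page = readahead_span(5000, 100)
# A 200-byte request that straddles a page boundary pulls in both pages.
span_two_pages = readahead_span(4000, 200)
```

In both cases the size that is actually read is a whole number of pages, so neighbouring data arrives in memory along with the requested bytes.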
Performance analysis of B-/B+ Tree indexes
A practical B-Tree implementation also relies on the following technique:
Each time a new node is created, a full page of space is requested for it, so that one node is physically stored in exactly one page. Because storage allocation is page-aligned, reading a node takes only one I/O.
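The one-node-per-page idea can be sketched as follows. The on-page layout (a 4-byte key count followed by 8-byte integer keys), the 16 KB `PAGE_SIZE`, and the helpers `write_node` and `read_node` are invented for this illustration and do not reflect InnoDB's actual page format.

```python
import struct
import tempfile

PAGE_SIZE = 16 * 1024  # assumed page size: 16 KB


def write_node(f, page_id: int, keys: list) -> None:
    """Serialize a node's keys into exactly one zero-padded page."""
    payload = struct.pack("<I", len(keys)) + struct.pack(f"<{len(keys)}q", *keys)
    if len(payload) > PAGE_SIZE:
        raise ValueError("node does not fit in one page")
    f.seek(page_id * PAGE_SIZE)              # pages sit at aligned offsets
    f.write(payload.ljust(PAGE_SIZE, b"\x00"))


def read_node(f, page_id: int) -> list:
    """Fetch a whole node with a single page-sized read."""
    f.seek(page_id * PAGE_SIZE)
    page = f.read(PAGE_SIZE)
    (count,) = struct.unpack_from("<I", page, 0)
    return list(struct.unpack_from(f"<{count}q", page, 4))


with tempfile.TemporaryFile() as f:
    write_node(f, 0, [10, 20, 30])
    write_node(f, 1, [40, 50])
    node1 = read_node(f, 1)
    node0 = read_node(f, 0)
```

Because every node starts at `page_id * PAGE_SIZE` and never spills over a page boundary, visiting a node costs one page read at the application level; how that maps to physical I/O still depends on the operating system and the device.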
A single lookup in a B-Tree of height h requires at most h-1 I/Os (the root node is kept resident in memory), and the asymptotic complexity is $O(h) = O(\log_d N)$. In practice the out-degree d is a very large number, usually more than 100, so h is very small (usually no more than 3).
A red-black tree over the same data is clearly much deeper. Because logically adjacent nodes (parent and child) may be physically far apart, the tree cannot take advantage of locality. Its I/O asymptotic complexity is also $O(h)$, but with a much larger h it is significantly less efficient than a B-Tree.
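The height gap can be checked with a small calculation. The helper `tree_height` and the figure of one million keys are assumptions for this example; a fan-out of 2 stands in for a perfectly balanced binary tree, so a real red-black tree can be somewhat deeper than the number shown.

```python
def tree_height(n_keys: int, fanout: int) -> int:
    """Smallest h with fanout**h >= n_keys, using integer arithmetic only."""
    h, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        h += 1
    return h


N = 1_000_000                      # assumed number of indexed keys
btree_h = tree_height(N, 100)      # out-degree d = 100
binary_h = tree_height(N, 2)       # idealized balanced binary tree

print(btree_h, binary_h)           # prints: 3 20
```

With one node visit costing roughly one I/O, the B-Tree needs about 3 node visits where the binary structure needs about 20.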
Why use B+ trees
A B+ Tree is better suited to external storage. Because internal nodes have no data field, a single node can hold more keys, so each node indexes a larger range with finer granularity. This also means that one disk I/O carries more index information in a B+ Tree than in a B-Tree, so I/O efficiency is higher.
MySQL is a relational database, and range access is very common. The chain pointers added to the leaf nodes of a B+ Tree make range access efficient, so they can serve range queries and similar operations. In a B-Tree each node stores keys and data together and the leaves are not linked, so a range lookup cannot simply walk along the leaves and is much less efficient.
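A back-of-the-envelope estimate shows how much the missing data field matters. All sizes below (8-byte keys, 6-byte child pointers, 200-byte rows) are assumed values chosen for illustration, and the calculation ignores page headers and other bookkeeping.

```python
PAGE_SIZE = 16 * 1024   # 16 KB page, as in the InnoDB default
KEY_BYTES = 8           # assumed key size
POINTER_BYTES = 6       # assumed child-pointer size
ROW_BYTES = 200         # assumed size of the data stored with each key

# B+ Tree internal node: only keys and child pointers.
bplus_fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)

# B-Tree node: every key also carries its data.
btree_fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES + ROW_BYTES)

print(bplus_fanout, btree_fanout)   # prints: 1170 76
```

Under these assumptions one page of a B+ Tree routes to more than fifteen times as many children as one page of a B-Tree, which is why the B+ Tree stays shallower for the same data.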
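A range query over chained leaves can be sketched like this. The `Leaf` class and `range_scan` function are hypothetical, and the sketch assumes the starting leaf has already been located by a normal root-to-leaf descent.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Leaf:
    keys: List[int]
    values: List[str]
    next: Optional["Leaf"] = None    # chain pointer to the right sibling


def range_scan(start: Optional[Leaf], lo: int, hi: int) -> Iterator[Tuple[int, str]]:
    """Yield (key, value) pairs with lo <= key <= hi by walking the leaf chain."""
    leaf = start
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return               # keys are sorted, nothing further matches
            if k >= lo:
                yield k, v
        leaf = leaf.next             # follow the chain, no trip back to the root


# Three leaves linked left to right.
leaf_c = Leaf([40, 45], ["r40", "r45"])
leaf_b = Leaf([20, 25, 30], ["r20", "r25", "r30"], next=leaf_c)
leaf_a = Leaf([5, 10], ["r5", "r10"], next=leaf_b)

rows = list(range_scan(leaf_a, 10, 40))
```

After the first leaf is found, the scan only moves sideways through sibling leaves, which is what makes interval access cheap in a B+ Tree.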
MySQL index structure: a look at B-/B+ Tree