File Storage and the B+ Tree

Source: Internet
Author: User

Why file storage uses a data structure like the B+ tree

"File storage should use a data structure like the B+ tree." If I remember correctly, this is a line from Min's data structures textbook. I do not know whether I failed to read closely or she simply did not explain it, but at the time I memorized the conclusion purely for the exam.
A superficial understanding is never a good way to learn, yet I did not think about the question again until I saw this blog post: http://blog.csdn.net/v_JULY_v/article/details/6530142.

That post contains a great deal of information and deserves to be bookmarked and read slowly. Here I take a moment to write down why disk file storage chooses the B+ tree, a question I had never dug into before. After all, a good memory is no match for a written note, even though much of this article is copy and paste ...

Another interesting gain is finally learning that there is no "B minus tree". I used to read "B-tree" as "B minus tree" and assumed it was the counterpart of the B+ tree. But the reality is:

B-tree simply means B tree. The original English name is B-tree, and many people in China translate it as "B minus tree". This is a very bad literal translation that easily misleads: one might think that the B tree is one kind of tree and the B-tree is another. In fact, B-tree refers to the B tree.

The main text follows:

Structure of the disk

The disk is a flat disc (similar to a phonograph record). Its surface carries many concentric circles called tracks, and data is recorded on these tracks. A disk can be a single platter or a disk pack made of several platters, each with two surfaces. In the 6-platter pack shown in Figure 11.3, the outer surfaces of the top and bottom platters hold no data, leaving 10 surfaces available for storing information.

When the disk drive performs a read or write, the platters are mounted on a spindle and rotate around it at high speed. Data can be read or written as a track passes under the read/write head (also simply called the head).

Disks are generally divided into fixed-head disks and movable-head disks. In a fixed-head disk, every track has its own head, which stays in place and is dedicated to reading and writing the data on that track.

In a movable-head disk (as in the figure), the heads can move. There is only one head per surface (the heads are double-sided, so both the front and back surfaces of a platter can be read and written), and it can move from one track to another on that surface. All heads are mounted on the same arm, so the heads of all surfaces move together. As the platters rotate around the spindle, the tracks of equal radius on all surfaces form a cylindrical surface, which we call a cylinder. The number of cylinders therefore equals the number of tracks on one surface.

Disk read/write principle and efficiency

Every piece of data on the disk is uniquely identified by a three-dimensional address: cylinder number, surface number, and block number (the disk block on the track).

Reading or writing a given piece of data on the disk takes the following 3 steps:

(1) First, the arm moves the heads to the required cylinder according to the cylinder number. This is called positioning or seeking.

(2) In the 6-platter pack shown in Figure 11.3, all the heads are now positioned over the 10 tracks of the 10 surfaces (the heads are double-sided). The track on the specified surface is selected according to the surface number.

(3) Once the surface is selected, the platter rotates until the track segment with the specified block number passes under the head.

After these three steps, the storage location of the data has been found, and the read or write operation can begin.

Accessing a specific piece of information takes time made up of 3 parts:

Seek time Ts: the time needed to complete step (1) above. This is the most expensive part and can reach about 0.1 s at most.

Latency time Tl: the time needed to complete step (3) above. The platters spin around the spindle very fast, typically at 7200 rpm (rotational speed is a hard disk performance indicator; household drives usually run at 5400 rpm (laptops) or 7200 rpm), so one rotation takes about 0.0083 s.

Transmission time Tt: the time to transfer the data over the system bus to memory, roughly 0.02 µs = 2 × 10^(-8) s per byte.

The disk reads data in units of disk blocks, and all the data in one block can be read at once. The cost of disk I/O is dominated by the seek time Ts. Related information should therefore be kept in the same disk block or on the same track, or at least on the same cylinder or adjacent cylinders, so that reads and writes move the heads back and forth as little as possible and avoid spending too much seek time Ts.

So in large-scale data storage, where large amounts of data sit on external disks and every read or write must first locate a block on the disk, finding data on the disk efficiently requires a well-designed external-memory data structure, one that keeps the number of I/O operations during a lookup as low as possible.
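To make the three time components concrete, here is a small Python sketch that adds them up using the figures quoted above (0.1 s worst-case seek, 7200 rpm, 0.02 µs per byte). The function names and the 4096-byte block size are assumptions chosen for illustration, and the latency is taken as one full rotation, i.e. the worst case.

```python
def rotation_time(rpm):
    """Seconds for one full rotation of the platter."""
    return 60.0 / rpm

def access_time(seek_s, rpm, n_bytes, per_byte_s=2e-8):
    """Worst-case access time: seek + one full rotation + transfer."""
    return seek_s + rotation_time(rpm) + n_bytes * per_byte_s

# Figures from the text; the 4096-byte block is a hypothetical size.
total = access_time(seek_s=0.1, rpm=7200, n_bytes=4096)
print(f"rotation: {rotation_time(7200):.4f} s")  # rotation: 0.0083 s
print(f"total:    {total:.4f} s")                # total:    0.1084 s
```

With these numbers the seek accounts for more than 90% of the total, which is why the goal is to reduce the number of seeks rather than the number of bytes transferred.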

B-Tree

The B-tree is also called a balanced multiway search tree.

Each B-tree node can hold a large number of keys and branches, as the situation requires (though a node cannot exceed the size of a disk block; depending on the drive, a block is typically about 1 KB to 4 KB). This reduces the depth of the tree, so finding an element only requires reading a few nodes from the external disk into memory before the target data is reached. This is the advantage over a binary tree.

For simplicity, the example here uses a small amount of data to build a 3-way tree; in real applications a B-tree node holds many keys. In the figure, take the root node: 17 is the name of a file on disk, the red square marks where the contents of file 17 are stored on the hard disk, and P1 is a pointer to the left subtree of 17.

Below, let's simulate the process of finding file 29:

(1) Following the root node pointer, locate disk block 1, the root of the file directory, and load its contents into memory. [1st disk I/O operation]

(2) Memory now holds two file names, 17 and 35, and three pointers storing the addresses of other disk pages. Since 17 < 29 < 35, we follow pointer P2.

(3) Following pointer P2, we locate disk block 3 and load its contents into memory. [2nd disk I/O operation]

(4) Memory now holds two file names, 26 and 30, and three pointers storing the addresses of other disk pages. Since 26 < 29 < 30, we follow pointer P2.

(5) Following pointer P2, we locate disk block 8 and load its contents into memory. [3rd disk I/O operation]

(6) Memory now holds two file names, 28 and 29. We find file 29 and obtain the disk address where the file's contents are stored.

Analyzing this procedure, we see that it needs 3 disk I/O operations and 3 in-memory lookups. The in-memory lookup of a file name works on an ordered table, so binary search can be used to speed it up. The 3 disk I/O operations are the deciding factor in the overall efficiency of a B-tree lookup.

Of course, if we used a balanced binary tree as the on-disk structure for this lookup, it would take at least 4 and at most 5 disk I/O operations. And the more files there are, the fewer disk I/O operations the B-tree needs compared with the balanced binary tree, and the greater its efficiency advantage.
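The walkthrough can be replayed with a short Python sketch. The `DISK` dictionary stands in for the disk, and each dictionary access counts as one I/O. Blocks 1, 3, and 8 hold the keys named in the steps above; every other block number and key is hypothetical filler added only so that the tree is complete.

```python
from bisect import bisect_left

# Simulated disk: block number -> (sorted keys, child block numbers).
# Only blocks 1, 3 and 8 come from the walkthrough; the rest is made up.
DISK = {
    1: ([17, 35], [2, 3, 4]),
    2: ([8, 12], [5, 6, 10]),
    3: ([26, 30], [7, 8, 9]),
    4: ([65, 87], [11, 12, 13]),
    5: ([3, 5], []), 6: ([9, 10], []), 10: ([13, 15], []),
    7: ([20, 22], []), 8: ([28, 29], []), 9: ([31, 33], []),
    11: ([36, 60], []), 12: ([75, 79], []), 13: ([90, 99], []),
}

def search(key, root=1):
    """Return (block holding key or None, list of blocks read)."""
    reads = []
    block = root
    while True:
        keys, children = DISK[block]      # one simulated disk I/O
        reads.append(block)
        i = bisect_left(keys, key)        # binary search in memory
        if i < len(keys) and keys[i] == key:
            return block, reads
        if not children:                  # reached a leaf without a match
            return None, reads
        block = children[i]               # follow pointer P(i+1)

print(search(29))  # (8, [1, 3, 8])
print(search(17))  # (1, [1])
```

Searching for 29 reads blocks 1, 3, and 8, matching the 3 disk I/O operations counted above, while 17 is found in the root after a single read.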
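The gap between the two structures grows with the amount of data. The sketch below computes a lower bound on the number of levels, assuming every node is completely full, so a tree with fanout m and h levels holds m^h - 1 keys. The key count and the fanout of 100 are illustrative assumptions, and real trees (which are not perfectly packed) will be somewhat taller.

```python
def min_height(n_keys, fanout):
    """Smallest number of levels h such that fanout**h - 1 >= n_keys."""
    h, capacity = 0, 0
    while capacity < n_keys:
        h += 1
        capacity = fanout ** h - 1
    return h

n = 10_000_000                     # hypothetical number of keys
binary_levels = min_height(n, 2)   # balanced binary tree
btree_levels = min_height(n, 100)  # multiway tree, 100 children per node
print(binary_levels, btree_levels) # 24 4
```

If each level costs one disk read, that is roughly 24 reads against 4 for the same lookup.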

B+ Tree

The B+ tree is a variant of the B-tree that arose from the needs of file systems.

An order-m B+ tree differs from an order-m B-tree in the following ways:

1. A node with n subtrees contains n keys (in a B-tree, a node with n subtrees contains n-1 keys).

2. The leaf nodes together contain all the keys, along with pointers to the records holding those keys, and the leaf nodes themselves are linked in ascending key order. (The leaf nodes of a B-tree do not contain all the information to be searched.)

3. All non-leaf nodes can be regarded as the index part; they contain only the largest (or smallest) key of each of their subtrees. (The non-leaf nodes of a B-tree also contain real information to be searched.)

Why can the B+ tree meet the requirements?

1) The B+ tree has a lower disk read/write cost

The internal nodes of a B+ tree hold no pointers to the records associated with their keys, so they are smaller than B-tree internal nodes. If the keys of one internal node are stored in a single disk block, the block can hold more keys, so more of the keys being searched are brought into memory by one read, and the number of I/O reads and writes drops accordingly.

For example, suppose a disk block holds 16 bytes, a key takes 2 bytes, and a pointer to the key's record takes 2 bytes. An internal node of an order-9 B-tree (at most 8 keys per node) needs 2 disk blocks, while a B+ tree internal node needs only 1. When an internal node has to be read into memory, the B-tree spends one more block lookup (on a disk, one more rotational wait) than the B+ tree.

2) B+ tree query efficiency is more stable

A non-leaf node never points to the file contents themselves; it is only an index to the keys in the leaf nodes. Every key lookup must therefore follow a path from the root to a leaf. All key queries have the same path length, so every piece of data is found with the same query cost.
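The arithmetic behind this example can be checked with a few lines of Python. Like the example itself, this sketch counts only keys and record pointers and ignores child pointers; the function name `blocks_needed` is made up for illustration.

```python
def blocks_needed(n_keys, key_bytes, record_ptr_bytes, block_bytes):
    """Disk blocks required to store one internal node."""
    node_bytes = n_keys * (key_bytes + record_ptr_bytes)
    return -(-node_bytes // block_bytes)  # ceiling division

# Order-9 tree: at most 8 keys per node, 16-byte blocks, 2-byte keys.
b_tree = blocks_needed(8, key_bytes=2, record_ptr_bytes=2, block_bytes=16)
b_plus = blocks_needed(8, key_bytes=2, record_ptr_bytes=0, block_bytes=16)
print(b_tree, b_plus)  # 2 1
```

The B-tree node takes 8 × (2 + 2) = 32 bytes, or 2 blocks, while the B+ tree node takes 8 × 2 = 16 bytes and fits in 1 block.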
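The stable path length can be illustrated with a minimal B+ tree lookup in Python. This is only a sketch under assumptions: the classes `Leaf` and `Index` and all key values are made up, index nodes store the largest key of each child (one key per child, as in the list of differences above), and insertion and deletion are left out.

```python
from bisect import bisect_left

class Leaf:
    def __init__(self, keys):
        self.keys = keys            # sorted keys; records would hang off these

class Index:
    def __init__(self, children):
        self.children = children
        # One key per child: the largest key found in that child's subtree.
        self.keys = [child.keys[-1] for child in children]

def lookup(root, key):
    """Return (found, nodes_read). The search always descends to a leaf."""
    node, reads = root, 1
    while isinstance(node, Index):
        i = min(bisect_left(node.keys, key), len(node.keys) - 1)
        node = node.children[i]
        reads += 1
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key, reads

root = Index([
    Index([Leaf([5, 8]), Leaf([10, 15]), Leaf([16, 17])]),
    Index([Leaf([18, 20]), Leaf([26, 35])]),
])

for k in (5, 17, 35, 99):
    print(k, lookup(root, k))
```

Every lookup reads exactly 3 nodes, including the one for 17, which also appears as an index key, and the unsuccessful one for 99.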

B* Tree

The B* tree is a variant of the B+ tree that adds pointers to siblings in the non-root, non-leaf nodes of the B+ tree. The B* tree requires each non-leaf node to hold at least (2/3)*m keys, so the minimum block utilization is 2/3 (instead of the B+ tree's 1/2). A simple example is shown in the figure.

B+ tree split: when a node is full, a new node is allocated, 1/2 of the data in the original node is copied to the new node, and finally a pointer to the new node is added to the parent. A B+ tree split affects only the original node and the parent, not the sibling nodes, so no pointers to siblings are needed.

B* tree split: when a node is full, if its next sibling is not full, part of the data is moved to the sibling, the key is inserted into the original node, and finally the sibling's key in the parent is updated (because the sibling's key range has changed). If the sibling is also full, a new node is added between the original node and the sibling, each of the two copies 1/3 of its data to the new node, and finally a pointer to the new node is added to the parent.

As a result, a B* tree allocates new nodes less often than a B+ tree and achieves higher space utilization.
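The two overflow strategies can be compared with a simplified Python sketch. Nodes are modeled as plain sorted lists of keys, the overflowing node already contains the newly inserted key, and the parent's separator keys are ignored; the helper names and the capacity of 6 are assumptions for illustration, not a full implementation.

```python
def split_evenly(keys, parts):
    """Cut a sorted key list into `parts` runs of near-equal size."""
    q, r = divmod(len(keys), parts)
    out, start = [], 0
    for i in range(parts):
        size = q + (1 if i < r else 0)
        out.append(keys[start:start + size])
        start += size
    return out

def bplus_overflow(node):
    """B+ style: always allocate a new node and split in half."""
    return split_evenly(node, 2)

def bstar_overflow(node, sibling, capacity):
    """B* style: use the right sibling first, split 2 -> 3 only if it is full."""
    if len(sibling) < capacity:
        return split_evenly(node + sibling, 2)   # no new node allocated
    return split_evenly(node + sibling, 3)       # two full nodes become three

capacity = 6
overfull = [1, 2, 3, 4, 5, 6, 7]                 # 7 keys in a 6-key node
print(bplus_overflow(overfull))
print(bstar_overflow(overfull, [8, 9], capacity))
print(bstar_overflow(overfull, [8, 9, 10, 11, 12, 13], capacity))
```

In the last case the three resulting nodes hold 5, 4, and 4 keys, each at least 2/3 of the capacity, whereas the B+ split leaves nodes with 4 and 3 keys, about half full.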

Summary

From the introduction above, the B-tree, B+ tree, and B* tree can be roughly summarized as follows:

B-tree: ordered array + balanced multiway tree; data is stored on non-leaf nodes as well.

B+ tree: linked list of ordered arrays + balanced multiway tree; data is stored only on the leaves.

B* tree: a fuller B+ tree.

Liang Bin, the author of a book on search engines, gave his view on B-trees and B+ trees:

"The B+ tree has one very big advantage: it makes scanning the whole database easy. A B-tree has to use an in-order traversal to scan the database, while a B+ tree simply sweeps along the leaf nodes from one end to the other. Range queries are very convenient with a B+ tree, whereas a B-tree does not support them well. This is the main reason databases choose B+ trees.

For example, to query the range 5-10, a B+ tree goes straight to the position of 5, then straight to 10, and strings together everything in between; with a B-tree this is very troublesome. The advantage of the B-tree is that successful queries are especially cheap, because the tree is overall shorter than the B+ tree. For unsuccessful queries, the B-tree is also slightly cheaper than the B+ tree. In a B-tree like your example, looking up 17 gets the result in one step.
Many frequency-based searches choose B-trees: the more frequently a node is queried, the closer it moves to the root, provided that you keep statistics on the queries and make some changes to the keys. Also, for both B-trees and B+ trees, the root and the top few levels are queried repeatedly, so those blocks basically stay in memory and cause no disk I/O; they are generally loaded into memory proactively at startup."
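The range query described in the quote can be sketched in Python as a walk along the linked leaves. The `Leaf` class, the `link` helper, and the key values are hypothetical, and for brevity the scan starts at the first leaf, whereas a real B+ tree would first descend the index to the leaf containing the lower bound.

```python
from bisect import bisect_left

class Leaf:
    def __init__(self, keys):
        self.keys = keys     # sorted keys stored in this leaf
        self.next = None     # pointer to the next leaf in key order

def link(leaves):
    """Chain the leaves into a linked list and return the head."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]

def range_scan(leaf, lo, hi):
    """Collect every key in [lo, hi] by following the leaf chain."""
    result = []
    while leaf is not None:
        for key in leaf.keys[bisect_left(leaf.keys, lo):]:
            if key > hi:
                return result      # past the upper bound: stop scanning
            result.append(key)
        leaf = leaf.next
    return result

head = link([Leaf([1, 3]), Leaf([5, 6]), Leaf([8, 10]), Leaf([12, 15])])
print(range_scan(head, 5, 10))  # [5, 6, 8, 10]
```

No index node is visited once the scan is on the leaf level, which is what makes range queries and full scans cheap on a B+ tree.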
