B-Tree Learning Summary

Source: Internet
Author: User

1. Basic introduction to the B-tree

① Compared with binary trees and red-black trees, the B-tree is characterized by a much lower height. How is this achieved? Each node in a B-tree has a very large number of branches, that is, many child nodes. As a result, for the same number of keys, a B-tree is much shorter than those other trees.
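The height difference can be made concrete with a short calculation. The sketch below is only an illustration: it assumes the textbook definition in which a B-tree has a minimum degree `t` (every non-root node has at least `t` children), so that a B-tree of height `h` holds at least `2*t**h - 1` keys. The function names and the value `t = 500` are chosen for this example and do not come from the original text.

```python
def binary_tree_min_height(n):
    """Best-case height of a binary tree with n keys.

    A binary tree of height h holds at most 2**(h+1) - 1 keys,
    so even a perfectly balanced tree cannot be shorter than this.
    """
    h = 0
    while 2 ** (h + 1) - 1 < n:
        h += 1
    return h


def btree_max_height(n, t):
    """Worst-case height of a B-tree with n keys and minimum degree t.

    A B-tree of height h holds at least 2*t**h - 1 keys,
    so the height can never exceed the largest h satisfying that bound.
    """
    h = 0
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


n = 1_000_000
print(binary_tree_min_height(n))   # 19
print(btree_max_height(n, 500))    # 2
```

Even in its worst case, the B-tree in this example is far shorter than the best possible binary tree over the same one million keys.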

② The B-tree is a data structure designed for disk (external storage). Because main memory is limited, huge amounts of data are stored on disk, and a disk access takes far longer than a memory access. So when accessing data on disk, how can the number of disk reads and writes be reduced?

First, understand the structure of the disk. Reference: The reading and writing principle of hard disk

When a disk is actually read, each read does not fetch just one data item but one or more whole pages at a time. This is an application of the principle of locality.

That is, "the read-ahead length is generally an integer multiple of the page size. A page is the logical block in which the computer manages memory: hardware and operating systems usually divide main memory and disk storage into contiguous blocks of equal size, each called a page (in many operating systems the page size is usually 4 KB), and main memory and disk exchange data in units of pages. When the data a program wants to read is not in main memory, a page fault is triggered: the system sends a read request to the disk, the disk locates the starting position of the data and reads one or more consecutive pages into memory, then the page-fault exception returns and the program continues to run." This is the job of the computer's memory management.
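To illustrate what exchanging data "in units of pages" means, the following sketch computes which pages must be loaded to satisfy a read of a given byte range. It is a simplified model, not how any particular operating system is implemented; `PAGE_SIZE` and `pages_touched` are names invented for this example, and the 4 KB page size is the typical value mentioned above.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (4 KB)


def pages_touched(offset, length, page_size=PAGE_SIZE):
    """Return the page numbers that a read of `length` bytes at `offset` must load."""
    if length <= 0:
        return []
    first = offset // page_size                 # page holding the first byte
    last = (offset + length - 1) // page_size   # page holding the last byte
    return list(range(first, last + 1))


print(pages_touched(100, 8))     # [0]
print(pages_touched(4090, 16))   # [0, 1]
```

Reading only 8 bytes still costs a whole page, and a 16-byte read that straddles a page boundary costs two, which is why keeping related data inside one page matters.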

In B-tree applications, the amount of data to process is huge and cannot be loaded into main memory all at once. The B-tree locates the pages it needs, copies them into main memory, and later writes modified pages back to disk. For this reason, the size of a B-tree node is usually equal to the size of one disk page. Because the height of a B-tree is very small (for example, a B-tree of height 2, that is, three levels of nodes, in which every node holds 1000 keys can contain more than 1 billion keys), the number of nodes visited during a lookup, which is bounded by the tree height, is also very small.
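The "1 billion keys" figure can be checked with a few lines of arithmetic. This sketch assumes every node is completely full and that height counts edges from the root, so height 2 means three levels of nodes; `btree_max_keys` is a helper name made up for this example.

```python
def btree_max_keys(keys_per_node, height):
    """Total keys in a completely full B-tree of the given height.

    A node with k keys has k + 1 children, so depth d holds (k + 1)**d nodes.
    """
    branching = keys_per_node + 1
    nodes = sum(branching ** depth for depth in range(height + 1))
    return nodes * keys_per_node


print(btree_max_keys(1000, 2))   # 1003003000
```

With 1000 keys per node there are 1 + 1001 + 1001² = 1,003,003 nodes, which gives just over one billion keys, all reachable by reading at most three nodes.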

Each node of the B-tree corresponds to one disk page, so a lookup needs only a few disk accesses.

2. Why are B-trees/B+ trees used as database indexes?

Reference: B-Tree and B + Tree applications: Data Search and database indexing

An introduction to the index:

Fortunately, computer science offers a number of better search algorithms, such as binary search and binary tree search. On closer inspection, however, each search algorithm can only be applied to a particular data structure: binary search requires the data to be sorted, while binary tree search can only be applied to a binary search tree. The data itself cannot be organized to satisfy every such data structure at once (for example, it is theoretically impossible to keep a table physically sorted by two columns at the same time). So, in addition to the data, the database system maintains data structures that satisfy a particular search algorithm and that reference (point to) the data in some way, so that advanced search algorithms can be run on them. Such a data structure is an index.

An index is a structure that sorts the values of one or more columns of a database table. Rather than searching every row in the table, the index keeps pointers to the data values stored in the specified columns and arranges those pointers in the specified order, which helps retrieve information faster. Typically, an index is worth creating on a table only when the data in the indexed column is queried frequently. An index consumes disk space and slows down data updates. In most cases, however, the gain in retrieval speed far outweighs these costs.
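The idea of an index as "sorted pointers kept beside the data" can be sketched in a few lines. This is a toy model under simplifying assumptions, not how a real database stores indexes: the table is an in-memory list, a "pointer" is just a row position, and the index is a sorted list searched with binary search rather than a B+ tree. The sample rows and the names `build_index` and `lookup` are invented for illustration.

```python
import bisect

# The table itself is stored in arrival order, sorted by neither column.
rows = [
    {"id": 3, "name": "carol", "age": 41},
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
]


def build_index(rows, column):
    """Return a sorted list of (column value, row position) pairs."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def lookup(rows, index, value):
    """Binary-search the index, then follow the stored positions to the rows."""
    i = bisect.bisect_left(index, (value,))  # first entry whose key is >= value
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(rows[index[i][1]])
        i += 1
    return matches


# Two independent indexes over the same unsorted table.
name_index = build_index(rows, "name")
age_index = build_index(rows, "age")

print(lookup(rows, name_index, "bob"))   # [{'id': 2, 'name': 'bob', 'age': 25}]
print(lookup(rows, age_index, 41))       # [{'id': 3, 'name': 'carol', 'age': 41}]
```

The table cannot be physically sorted by both `name` and `age`, but each index gives a sorted view of one column. The cost is the extra space for the indexes and the work of keeping them in order whenever a row changes.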

In other words, the amount of data stored in a database is very large, and the data cannot all be organized into a form that satisfies the requirements of a particular search algorithm (held in memory, sorted, stored at contiguous addresses). How, then, can the data be retrieved faster? That is what the index is for.

The B+ tree is well suited to database indexing: as a rough estimate, finding a record among a large amount of data takes only a few disk accesses.
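A back-of-the-envelope estimate shows where "only a few disk accesses" comes from. The numbers below are assumptions for illustration only (a 4 KB page, 8-byte keys, 8-byte pointers, every page full, one disk read per level, and no caching); real database engines differ in page layout and fill factor. `levels_needed` is a helper name made up for this sketch.

```python
PAGE_SIZE = 4096     # assumed page size in bytes
KEY_SIZE = 8         # assumed key size in bytes
POINTER_SIZE = 8     # assumed child/row pointer size in bytes

# Number of (key, pointer) entries that fit in one page-sized node.
FANOUT = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)


def levels_needed(num_keys, fanout):
    """Smallest number of tree levels whose fanout**levels covers num_keys."""
    levels = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


print(FANOUT)                                # 256
print(levels_needed(100_000_000, FANOUT))    # 4
```

Under these assumptions, a lookup among 100 million keys reads about four pages, and fewer in practice if the upper levels of the tree stay cached in memory.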

Resources Summary:

From database storage and file structure to B-tree and hash

The reading and writing principle of hard disk

From B-tree and B* tree to R-tree

B-Tree and B + Tree applications: Data Search and database indexing

A brief introduction to algorithms and data structures (10): Balanced trees: B-tree
