2014 Alibaba intern interview question: how MySQL implements indexes

This question was asked in the second-round interview at the Beijing session of Alibaba's 2014 intern recruitment:

In MySQL, indexes are implemented at the storage engine level, and different storage engines implement them differently. The MyISAM and InnoDB storage engines are compared here.

MyISAM Index Implementation:

The MyISAM storage engine uses a B+tree as its index structure, and the data field of each leaf node holds the address of the corresponding record. MyISAM indexes are called "non-clustered" to distinguish them from InnoDB's clustered index.
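The lookup path described above can be sketched in Python. This is only an illustration under simplifying assumptions: a sorted list stands in for the leaf level of the B+tree, a list position stands in for a record address, and the names `data_file`, `index_leaves` and `myisam_lookup` are hypothetical, not MySQL internals.

```python
import bisect

# Hypothetical data file: a list stands in for the MyISAM data file,
# and a list position stands in for a record's address.
data_file = [
    {"id": 15, "name": "Bob"},
    {"id": 3, "name": "Alice"},
    {"id": 27, "name": "Carol"},
]

# Leaf level of the index: (key, address) pairs kept sorted by key.
# A real B+tree adds internal nodes on top; a sorted list is enough here.
index_leaves = sorted((row["id"], addr) for addr, row in enumerate(data_file))
index_keys = [key for key, _ in index_leaves]


def myisam_lookup(key):
    """Find the key in the index, then follow the stored address."""
    pos = bisect.bisect_left(index_keys, key)
    if pos == len(index_keys) or index_keys[pos] != key:
        return None
    _, addr = index_leaves[pos]
    return data_file[addr]


print(myisam_lookup(3))
```

Note that the records in `data_file` are not sorted by key. Only the index is ordered, which is the sense in which the index is non-clustered.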

InnoDB Index Implementation:

Although InnoDB also uses B+tree as the index structure, the implementation is very different from MyISAM.

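The first major difference is that the InnoDB data file is itself the index file: the table data is stored in a B+tree keyed on the primary key, and the leaf nodes of that tree hold the complete records.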

The second difference from MyISAM is that the data field of an InnoDB secondary index stores the primary key value of the corresponding record instead of its address. The clustered index makes searching by primary key efficient, but a search through a secondary index has to traverse two indexes: it first searches the secondary index to obtain the primary key, and then uses that primary key to find the record in the primary index.
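The two-step lookup can be sketched as follows. This is a minimal illustration, assuming plain dictionaries in place of B+trees; the table contents and the names `clustered_index`, `secondary_index_name`, `lookup_by_primary_key` and `lookup_by_name` are made up for the example.

```python
# Clustered (primary) index: primary key -> complete row.
# A dict stands in for the B+tree whose leaves hold full records.
clustered_index = {
    1: {"id": 1, "name": "Alice", "age": 30},
    2: {"id": 2, "name": "Bob", "age": 25},
    3: {"id": 3, "name": "Carol", "age": 35},
}

# Secondary index on `name`: indexed value -> primary key, not an address.
secondary_index_name = {row["name"]: pk for pk, row in clustered_index.items()}


def lookup_by_primary_key(pk):
    """One index traversal: the leaf already contains the row."""
    return clustered_index.get(pk)


def lookup_by_name(name):
    """Two traversals: secondary index first, then the clustered index."""
    pk = secondary_index_name.get(name)
    if pk is None:
        return None
    return clustered_index[pk]


print(lookup_by_name("Bob"))
```

Because the secondary index stores primary key values, every entry in it grows with the size of the primary key.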

In principle, a database index could also be implemented with other data structures, such as a red-black tree or a B-tree.

So why is the B+tree used in practice?

The answer starts with how computer storage and the operating system work. The index of a large table is often too big to stay entirely in memory, so it is stored as a file on disk, and querying it requires disk I/O. An efficient lookup is therefore one that needs as few I/O operations as possible. Following the principle of locality, a single I/O typically reads one page of data (commonly 4 KB). In a B-tree, each new node is therefore allocated as one page, so a single I/O reads a whole node containing many keys. In a red-black tree, by contrast, logically adjacent nodes are not necessarily physically adjacent, so reading the same amount of data takes many more I/Os. This is why a B-tree is more efficient than a red-black tree for an on-disk index.
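A rough calculation shows the size of the gap. The sketch below assumes that every node visited costs one I/O, that a page-sized node holds 255 keys, and that the table has one million keys; all three figures are illustrative assumptions rather than measurements. A real red-black tree can be up to about twice as tall as the perfectly balanced binary tree used here.

```python
import math


def balanced_binary_height(n_keys):
    """Height of a perfectly balanced binary tree holding n_keys keys."""
    return math.ceil(math.log2(n_keys + 1))


def btree_height(n_keys, keys_per_node):
    """Minimum number of levels when every node holds keys_per_node keys."""
    height, capacity, nodes_on_level = 0, 0, 1
    while capacity < n_keys:
        capacity += nodes_on_level * keys_per_node
        nodes_on_level *= keys_per_node + 1  # each node has keys + 1 children
        height += 1
    return height


N_KEYS = 1_000_000    # assumed table size
KEYS_PER_NODE = 255   # assumed keys that fit in one page-sized node

print(balanced_binary_height(N_KEYS))       # levels in the binary tree
print(btree_height(N_KEYS, KEYS_PER_NODE))  # levels in the B-tree
```

Under these assumptions the binary tree needs 20 levels and the B-tree needs 3, so a lookup touches far fewer pages in the B-tree.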

And why a B+tree rather than a B-tree?

Because the internal nodes of a B+tree do not carry a data field, each node has a larger out-degree (fan-out): one page-sized node can hold more keys and child pointers. The tree is therefore shallower, and a lookup needs fewer I/Os.
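The effect of dropping the data field can be estimated with simple arithmetic. In this sketch the key, pointer and data sizes are assumptions chosen only for illustration, and page headers are ignored.

```python
PAGE_SIZE = 4096      # bytes per node, following the 4 KB page in the text
KEY_SIZE = 8          # assumed key size in bytes
POINTER_SIZE = 8      # assumed child-pointer size in bytes
DATA_SIZE = 100       # assumed size of the data field stored next to a key


def entries_per_node(page_size, entry_size):
    """How many fixed-size entries fit in one page."""
    return page_size // entry_size


def levels_needed(n_keys, fanout):
    """Smallest number of levels whose fan-out product reaches n_keys."""
    levels, reachable = 0, 1
    while reachable < n_keys:
        reachable *= fanout
        levels += 1
    return levels


# B-tree node: every entry carries key + pointer + data.
btree_fanout = entries_per_node(PAGE_SIZE, KEY_SIZE + POINTER_SIZE + DATA_SIZE)
# B+tree internal node: every entry carries key + pointer only.
bplus_fanout = entries_per_node(PAGE_SIZE, KEY_SIZE + POINTER_SIZE)

print(btree_fanout, levels_needed(1_000_000, btree_fanout))
print(bplus_fanout, levels_needed(1_000_000, bplus_fanout))
```

With these assumed sizes the fan-out rises from 35 to 256, and the number of levels needed for one million keys drops from 4 to 3.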

Understanding how different storage engines implement indexes helps in using and optimizing them correctly. For example, knowing how InnoDB implements indexes makes it easy to see why overly long fields are not recommended as primary keys: every secondary index stores the primary key value, so a long primary key makes every secondary index larger. Similarly, a non-monotonic field is a poor choice of primary key in InnoDB. Because the InnoDB data file is itself a B+tree, inserting records with non-monotonic keys frequently forces pages of the data file to be split, and records to be moved, in order to maintain the B+tree properties, which is inefficient. An auto-increment field is therefore a good choice of primary key.
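The cost of non-monotonic keys can be illustrated with a toy simulation of leaf pages. This is a simplified model, not InnoDB's actual algorithm: it assumes four records per page, that an insert past the end of the last full page simply starts a new page, and that any other insert into a full page splits it and moves the upper half. The name `insert_all` is hypothetical.

```python
import bisect
import random

PAGE_CAPACITY = 4  # assumed records per leaf page, tiny on purpose


def insert_all(keys):
    """Insert keys into sorted leaf pages; return (records_moved, page_count)."""
    pages = [[]]
    moved = 0
    for key in keys:
        # Find the page whose key range should contain the new key.
        firsts = [p[0] for p in pages[1:]]
        i = bisect.bisect_right(firsts, key)
        page = pages[i]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif i == len(pages) - 1 and key > page[-1]:
            # Append-only case: start a fresh page, nothing is moved.
            pages.append([key])
        else:
            # Insert into the middle of a full page: split it and
            # move the upper half of its records to a new page.
            mid = PAGE_CAPACITY // 2
            upper = page[mid:]
            del page[mid:]
            moved += len(upper)
            pages.insert(i + 1, upper)
            target = upper if key >= upper[0] else page
            bisect.insort(target, key)
    return moved, len(pages)


sequential = list(range(1, 101))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)

print("sequential:", insert_all(sequential))
print("shuffled:  ", insert_all(shuffled))
```

In this model the sequential keys move no records and fill every page completely, while the shuffled keys trigger splits that move records and leave pages partly empty.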

Clustered and Nonclustered indexes:

InnoDB's primary index is a clustered index because the leaf nodes of its B+tree contain the complete data records. The leaf nodes of a MyISAM B+tree store only the addresses of the data records, so a MyISAM index is called a nonclustered index.

Index usage policy and optimization

MySQL optimization is mainly divided into structural optimization (schema optimization) and query optimization.

Permanent link to this article: http://www.linuxidc.com/Linux/2014-04/100597.htm
