Contents (technical article)
This article covers indexes in more depth, in the following parts:
I. Overview of indexes (what an index is, advantages and disadvantages of indexes)
II. Basic use of the index (CREATE INDEX)
III. Basic principles of the index (an interview focus)
IV. Data structures of the index (B-Tree, hash)
V. Principles for creating indexes (the most important part, and a must-ask interview topic)
VI. How to delete data at the scale of millions of rows or more
I. Overview of the Index
1) What is an index?
An index is a special kind of file (in InnoDB, a table's index is an integral part of the tablespace) that contains reference pointers to all records in the data table. Put more simply, an index is the equivalent of a directory. Imagine using the Xinhua Dictionary with its directory torn out: to find an idiom that begins with a given character, you could only page through from page 1 to page 1000. Exhausting! With the directory in hand, you can locate the entry quickly.
2) Advantages and disadvantages of the index:
Indexes can greatly speed up data retrieval, which is the main reason for creating them. They also give the query optimizer more options during a query, which improves overall system performance. However, indexes have drawbacks too: they carry additional maintenance cost. Because an index is a separate structure that must be kept in sync with the table, adding, modifying, or deleting data triggers extra operations on the index, which consume extra I/O and slow down inserts, updates, and deletes.
II. Basic use of the index
1) Creating an index (three ways)
The first way:
The second way: use the ALTER TABLE command to add an index:
ALTER TABLE can create a normal index, a unique index, or a primary key index. The general forms are ALTER TABLE table_name ADD INDEX index_name (column_list) for a normal index, ALTER TABLE table_name ADD UNIQUE index_name (column_list) for a unique index, and ALTER TABLE table_name ADD PRIMARY KEY (column_list) for a primary key.
Here table_name is the name of the table to add the index to, and column_list names the columns to index, separated by commas.
The index name index_name is optional and can be chosen freely; if it is omitted, MySQL assigns a name based on the first indexed column. In addition, ALTER TABLE allows several changes to the same table in a single statement, so you can create multiple indexes at once.
The third way: use the CREATE INDEX command
CREATE INDEX can add a normal or unique index to a table, in the form CREATE INDEX index_name ON table_name (column_list) or CREATE UNIQUE INDEX index_name ON table_name (column_list). It cannot create a primary key index.
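As a minimal sketch of the CREATE INDEX form, the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL; that substitution is an assumption made only so the example runs without a server. The table users and the index names idx_users_name and uq_users_email are hypothetical. SQLite accepts the same CREATE INDEX and CREATE UNIQUE INDEX statements, but it has no equivalent of MySQL's ALTER TABLE ... ADD INDEX, so that form is not exercised here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# Normal index: CREATE INDEX index_name ON table_name (column_list)
conn.execute("CREATE INDEX idx_users_name ON users (name)")
# Unique index: additionally rejects duplicate values in the indexed column.
conn.execute("CREATE UNIQUE INDEX uq_users_email ON users (email)")

conn.execute("INSERT INTO users (name, email) VALUES ('ann', 'ann@example.com')")


def insert_duplicate_email():
    """Try to insert a second row with an existing email; report whether it was rejected."""
    try:
        conn.execute("INSERT INTO users (name, email) VALUES ('bob', 'ann@example.com')")
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = insert_duplicate_email()

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(users)"))
print(index_names, duplicate_rejected)
```

The unique index is the only thing that rejects the second insert; the normal index on name places no constraint on the data.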
III. Basic principles of the index (kept short, without the padding other articles spend on it)
Indexes are used to quickly find records that have specific values. Without an index, a query generally has to scan the entire table.
The principle of indexing is simple: turn an unordered set of data into an ordered one that can be searched.
1. Sort the contents of the indexed column.
2. Build an inverted table from the sorted result.
3. Attach the chain of data addresses to each entry of the inverted table.
4. At query time, first find the entry in the inverted table, then take out its chain of data addresses to fetch the actual rows.
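The four steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how MySQL stores an index internally; the rows dictionary, its addresses, and the function names build_index and lookup are all made up for the example.

```python
from bisect import bisect_left
from collections import defaultdict

# Unordered "table": row address -> (name, city). All data is made up.
rows = {
    100: ("ann", "Paris"),
    101: ("bob", "Rome"),
    102: ("cat", "Paris"),
    103: ("dan", "Oslo"),
}


def build_index(table, column):
    """Steps 1-3: sort the column values and attach each value's list of row addresses."""
    postings = defaultdict(list)
    for address, row in table.items():
        postings[row[column]].append(address)
    sorted_keys = sorted(postings)
    return sorted_keys, postings


def lookup(index, table, value):
    """Step 4: binary-search the sorted keys, then follow the address chain to the rows."""
    sorted_keys, postings = index
    pos = bisect_left(sorted_keys, value)
    if pos == len(sorted_keys) or sorted_keys[pos] != value:
        return []
    return [table[address] for address in postings[value]]


city_index = build_index(rows, column=1)
print(lookup(city_index, rows, "Paris"))
```

Because the keys are sorted, the lookup is a binary search rather than a scan over every row.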
IV. Data structures of the index (B-Tree, hash)
1) B-Tree Index
MySQL reads data through its storage engine, and roughly 90% of users run InnoDB. By implementation, InnoDB currently has only two index types: the BTREE (B-Tree) index and the hash index. The B-Tree index is the most frequently used index type in MySQL, and all the basic storage engines support it. Unless stated otherwise, "index" usually means the B-Tree index (it is actually implemented as a B+ tree, but MySQL prints BTREE when you view a table's indexes, hence the short name B-Tree index).
Query method:
Primary key index area: PI (linked to the address where the row data is stored). Queries by primary key are resolved here.
Normal index area: SI (linked to the primary key ID, which is then resolved to the address through the area above). This is why looking up by primary key is the fastest.
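The difference between the two areas can be shown with a small sketch. It models the primary key index and a normal index as plain Python dictionaries, which is an assumption for illustration only; the table contents and the names primary_index, secondary_index_age, find_by_primary_key, and find_by_age are hypothetical.

```python
# Primary key index area (PI): primary key -> row data. Data is made up.
primary_index = {
    1: {"id": 1, "name": "ann", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "cat", "age": 30},
}

# Normal index area (SI) on "age": indexed value -> primary keys of matching rows.
secondary_index_age = {25: [2], 30: [1, 3]}


def find_by_primary_key(pk):
    """One lookup: the primary key index leads straight to the row."""
    row = primary_index.get(pk)
    return row, 1  # (row, number of index lookups)


def find_by_age(age):
    """Two stages: the normal index yields primary keys, each resolved through PI."""
    lookups = 1  # the probe into the normal index
    found = []
    for pk in secondary_index_age.get(age, []):
        found.append(primary_index[pk])
        lookups += 1  # one extra primary key lookup per matching row
    return found, lookups


print(find_by_primary_key(2))
print(find_by_age(30))
```

The lookup counter makes the extra hop visible: a query through the normal index always pays for the primary key lookup on top of its own probe.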
B+ tree properties:
1.) A node with n subtrees contains n keys. The keys do not store the data itself, only the index to the data.
2.) The leaf nodes together contain all the keys along with pointers to the corresponding records, and the leaf nodes themselves are linked in ascending key order.
3.) All non-leaf nodes can be viewed as the index portion; each holds only the largest (or smallest) key of each of its subtrees.
4.) In a B+ tree, data objects are inserted and deleted only at the leaf nodes.
5.) A B+ tree has two head pointers: one to the root node of the tree, and one to the leaf node with the smallest key.
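The properties above can be illustrated with a deliberately small sketch: a two-level tree that is bulk-built from sorted data, with one internal node above a chain of linked leaves. It is not a full B+ tree (there is no insertion, deletion, or node splitting), and the class and function names are made up for the example.

```python
from bisect import bisect_left


class Leaf:
    def __init__(self, keys, values):
        self.keys = keys        # sorted keys stored in this leaf
        self.values = values    # record pointers (here: plain strings)
        self.next = None        # link to the next leaf in ascending key order


class Root:
    def __init__(self, leaves):
        # Property 3: the internal node keeps only the largest key of each subtree.
        self.max_keys = [leaf.keys[-1] for leaf in leaves]
        self.children = leaves


def build(pairs, leaf_size=2):
    """Bulk-build a two-level tree from (key, value) pairs; returns both head pointers."""
    pairs = sorted(pairs)
    leaves = []
    for i in range(0, len(pairs), leaf_size):
        chunk = pairs[i:i + leaf_size]
        leaves.append(Leaf([k for k, _ in chunk], [v for _, v in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return Root(leaves), leaves[0]  # (root pointer, smallest-key leaf pointer)


def search(root, key):
    """Point lookup: descend from the root to one leaf, then search inside it."""
    i = bisect_left(root.max_keys, key)
    if i == len(root.children):
        return None
    leaf = root.children[i]
    j = bisect_left(leaf.keys, key)
    if j < len(leaf.keys) and leaf.keys[j] == key:
        return leaf.values[j]
    return None


def scan_from(first_leaf, low, high):
    """Range scan: walk the linked leaves from the smallest key, keeping low <= key <= high."""
    out, leaf = [], first_leaf
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return out
            if k >= low:
                out.append((k, v))
        leaf = leaf.next
    return out


root, head = build([(5, "e"), (1, "a"), (3, "c"), (2, "b"), (4, "d")])
print(search(root, 3), scan_from(head, 2, 4))
```

The two values returned by build correspond to the two head pointers in property 5: point lookups start at the root, while range scans can start at the first leaf and follow the links.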
2) Hash index
Briefly, a hash index resembles a simple implementation of the hash table data structure. When we use a hash index in MySQL, a hash algorithm (common ones include direct addressing, the mid-square method, the folding method, the division-remainder method, and the random number method) converts the field data into a fixed-length hash value, and the row pointer for that data is stored at the corresponding position in the hash table. If a hash collision occurs (two different keys produce the same hash value), the entries under that hash key are stored as a linked list. Of course, this is only a simplified model.
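The simplified model just described can be sketched as a toy hash index with chained buckets. This is an illustration under stated assumptions, not MySQL's implementation: the class HashIndex, the bucket count, and the row pointer strings are all hypothetical, and the slot is chosen with the division-remainder method on Python's built-in hash.

```python
class HashIndex:
    """Toy hash index: fixed number of buckets, collisions kept in a per-bucket list."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _slot(self, key):
        # Division-remainder method: hash value modulo the table size.
        return hash(key) % len(self.buckets)

    def add(self, key, row_pointer):
        self.buckets[self._slot(key)].append((key, row_pointer))

    def find(self, key):
        """Return the row pointers stored for exactly this key."""
        return [ptr for k, ptr in self.buckets[self._slot(key)] if k == key]


index = HashIndex(bucket_count=4)
index.add(1, "row@0x10")
index.add(5, "row@0x20")  # 1 % 4 == 5 % 4, so this collides with key 1
index.add(2, "row@0x30")
print(index.find(5))
```

Keys 1 and 5 land in the same bucket, so find has to compare the stored key as well as the slot. A structure like this answers equality lookups directly but keeps no order between keys, so it cannot serve a range query the way the sorted leaves of a B+ tree can.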
PS: Readers who want to go deeper into these data structures can follow me and read the "Data Structure" topic; they are not explained in detail here.
V. Principles for creating indexes (the most important part)
Indexes are useful, but they should not be used without limit. It is best to follow a few principles:
1) The leftmost prefix matching principle, a very important principle for composite indexes: MySQL keeps matching index columns from left to right until it meets a range condition (>, <, BETWEEN, LIKE), and then stops. For example, given a = 1 AND b = 2 AND c > 3 AND d = 4 and an index in the order (a, b, c, d), d cannot use the index; with an index on (a, b, d, c), all four columns can use it, and the order of a, b, and d can be adjusted freely.
2) Create indexes on the fields most frequently used as query conditions.
3) Frequently updated fields are not suitable for indexing.
4) Columns that cannot effectively distinguish rows are not suitable as index columns (for example gender: male, female, or unknown, at most three values, so the selectivity is too low).
5) Extend an existing index where possible instead of creating a new one. For example, if the table already has an index on a and you now need an index on (a, b), you only need to modify the original index.
6) Data columns that define a foreign key must be indexed.
7) Do not index columns that are rarely involved in queries, or columns with many duplicate values.
8) Do not index columns whose data type is text, image, or bit.
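The leftmost prefix rule in principle 1 can be expressed as a short function. This is a simplified model of the rule as stated above, not of the MySQL optimizer, which may apply further techniques; the function name index_columns_used and its parameters are made up for the example.

```python
def index_columns_used(index_columns, equality_columns, range_columns):
    """Return the index columns a lookup can use under the leftmost prefix rule.

    Matching walks the index columns left to right. An equality condition keeps
    the walk going; a range condition is used and then stops it; a column with
    no condition stops it immediately.
    """
    used = []
    for column in index_columns:
        if column in equality_columns:
            used.append(column)
        elif column in range_columns:
            used.append(column)
            break
        else:
            break
    return used


# Query: a = 1 AND b = 2 AND c > 3 AND d = 4
equalities, ranges = {"a", "b", "d"}, {"c"}

print(index_columns_used(["a", "b", "c", "d"], equalities, ranges))
print(index_columns_used(["a", "b", "d", "c"], equalities, ranges))
```

With the index order (a, b, c, d) the walk stops at the range condition on c, so d is never reached; moving c to the end lets all four columns take part.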
VI. How to delete data at the scale of millions of rows or more
About indexes: an index carries additional maintenance cost. Because it is a separate structure that must be kept in sync with the table, adding, modifying, or deleting data triggers extra operations on the index, which consume extra I/O and slow down inserts, updates, and deletes. So when deleting millions of rows, keep in mind what the official MySQL manual says: the time needed to delete data is proportional to the number of indexes on the table.
So when we want to delete millions of rows, we can drop the indexes first (this took about three minutes).
Then delete the unwanted data (this took less than two minutes).
Once the deletion is complete, re-create the indexes. There is less data at this point, so this is also fast, about 10 minutes.
This is definitely much faster than deleting directly as before, not to mention that if a direct deletion is interrupted partway, everything deleted so far is rolled back. That is an even bigger pitfall.
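The drop, delete, re-create sequence can be sketched as follows. The snippet again uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption so it runs without a server), with a hypothetical logs table and index idx_logs_level. In MySQL the index would be dropped with DROP INDEX index_name ON table_name. The example shows only the order of operations and makes no claim about timings.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, msg TEXT)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.executemany(
    "INSERT INTO logs (level, msg) VALUES (?, ?)",
    [("debug" if i % 10 else "error", f"message {i}") for i in range(10_000)],
)


def index_exists(name):
    """Check the SQLite catalog for an index with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# 1. Drop the index so the delete does not have to maintain it.
conn.execute("DROP INDEX idx_logs_level")
dropped = not index_exists("idx_logs_level")

# 2. Delete the unwanted rows.
deleted = conn.execute("DELETE FROM logs WHERE level = 'debug'").rowcount

# 3. Re-create the index on the (now much smaller) table.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(dropped, deleted, remaining)
```

One trade-off to keep in mind: while the index is missing, any query that relied on it falls back to scanning the table, so this sequence suits a maintenance window better than a busy period.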
"mysql Optimization Topics"90% Programmer's interview. Index optimization Manual (5)