This article mainly introduces the primarykey function in MySQL, including its impact on InnoDB usage, you can refer to the optimizer 5.1.46 to make some changes to the choice of the primary key:
Performance: While looking for the shortest index for a covering index scan, the optimizer did not consider the full row length for a clustered primary key, as in InnoDB. secondary covering indexes will now be preferred, making full table scans less likely.
The find_shortest_key function is added in this version. the function can be considered as the minimum key length.
Index to meet our query requirements.
How this function works:
The code is as follows:
What find_shortest_key shoshould do is the following. If the primary key is a covering index
And is clustered, like in MyISAM, then the behavior today shoshould remain the same. If
Primary key is clustered, like in InnoDB, then it shoshould not consider using the primary
Key because then the storage engine will have to scan through much more data.
Call Primary_key_is_clustered (). If the returned value is true, run find_shortest_key: Select Secondary covering indexes, which is the smallest key length.
First, test in 5.1.45:
$mysql -Vmysql Ver 14.14 Distrib 5.1.45, for unknown-linux-gnu (x86_64) using EditLine wrapperroot@test 03:49:45>create table test(id int,name varchar(20),name2 varchar(20),d datetime,primary key(id)) engine=innodb;Query OK, 0 rows affected (0.16 sec)root@test 03:49:47>insert into test values(1,'xc','sds',now()),(2,'xcx','dd',now()),(3,'sdds','ddd',now()),(4,'sdsdf','dsd',now()),(5,'sdsdaa','sds',now());Query OK, 5 rows affected (0.00 sec)Records: 5 Duplicates: 0 Warnings: 0root@test 03:49:51>root@test 03:49:51>insert into test values(6,'xce','sdsd',now()),(7,'xcx','sdsd',now()),(8,'sdds','sds',now()),(9,'sdsdsdf','sdsdsd',now()),(10,'sdssdfdaa','sdsdsd',now());Query OK, 5 rows affected (0.00 sec)Records: 5 Duplicates: 0 Warnings: 0
Create index ind_1:
root@test 03:49:53>alter table test add index ind_1(name,d);Query OK, 0 rows affected (0.09 sec)Records: 0 Duplicates: 0 Warnings: 0root@test 03:50:08>explain select count(*) from test;+—-+————-+——-+——-+—————+———+———+——+——+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——-+—————+———+———+——+——+————-+| 1 | SIMPLE | test | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |+—-+————-+——-+——-+—————+———+———+——+——+————-+1 row in set (0.00 sec)
Add ind_2:
root@test 08:04:35>alter table test add index ind_2(d);Query OK, 0 rows affected (0.07 sec)Records: 0 Duplicates: 0 Warnings: 0root@test 08:04:45>explain select count(*) from test;+—-+————-+——-+——-+—————+———+———+——+——+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——-+—————+———+———+——+——+————-+| 1 | SIMPLE | test | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |+—-+————-+——-+——-+—————+———+———+——+——+————-+1 row in set (0.00 sec)
In the above version [5.1.45], we can see that the optimizer selects the primary key for scanning, and does not use ind_1 or ind_2 to complete the query;
Next step: 5.1.48
$mysql -Vmysql Ver 14.14 Distrib 5.1.48, for unknown-linux-gnu (x86_64) using EditLine wrapperroot@test 03:13:15> create table test(id int,name varchar(20),name2 varchar(20),d datetime,primary key(id)) engine=innodb;Query OK, 0 rows affected (0.00 sec)root@test 03:48:04>insert into test values(1,'xc','sds',now()),(2,'xcx','dd',now()),(3,'sdds','ddd',now()),(4,'sdsdf','dsd',now()),(5,'sdsdaa','sds',now());Query OK, 5 rows affected (0.00 sec)Records: 5 Duplicates: 0 Warnings: 0root@test 03:48:05>insert into test values(6,'xce','sdsd',now()),(7,'xcx','sdsd',now()),(8,'sdds','sds',now()),(9,'sdsdsdf','sdsdsd',now()),(10,'sdssdfdaa','sdsdsd',now());Query OK, 5 rows affected (0.01 sec)Records: 5 Duplicates: 0 Warnings: 0
Create index ind_1:
Root @ test 03:13:57> alter table test add index ind_1 (name, d); Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0root @ test 03:15:55> explain select count (*) from test; + -- + ----- + --- + -- + ----- + | id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra | + -- + ----- + --- + -- + ----- + | 1 | SIMPLE | test | index | NULL | ind_1 | 52 | NULL | 10 | Using index | + -- + ----- + --- + -- + ----- + root @ test 08:01:56> alter table test add index ind_2 (d); query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 Add ind_2: root @ test 08:02:09> explain select count (*) from test; + -- + ----- + --- + -- + ----- + | id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra | + -- + ----- + --- + -- + ----- + | 1 | SIMPLE | test | index | NULL | ind_2 | 9 | NULL | 10 | Using index | + -- + ----- + --- + -- + ----- + 1 row in set (0.00 sec)
In version 5.1.48, ind_1 is selected wisely to complete the scan. the primary key (full index scan) is not used to complete the query. then, ind_2 is added, because the length of the ind_1 key is greater than the length of the ind_2 key, mysql selects a better ind_2 to complete the query. it can be seen that mysql is gradually intelligent in the selection method.
Observed performance:
5.1.48root@test 08:49:32>set profiling =1;Query OK, 0 rows affected (0.00 sec)root@test 08:49:41>select count(*) from test;+———-+| count(*) |+———-+| 5242880 |+———-+1 row in set (1.18 sec)root@test 08:56:30>show profile cpu,block io for query 1;+——————————–+———-+———-+————+————–+—————+| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |+——————————–+———-+———-+————+————–+—————+| starting | 0.000035 | 0.000000 | 0.000000 | 0 | 0 || checking query cache for query | 0.000051 | 0.000000 | 0.000000 | 0 | 0 || Opening tables | 0.000014 | 0.000000 | 0.000000 | 0 | 0 || System lock | 0.000005 | 0.000000 | 0.000000 | 0 | 0 || Table lock | 0.000010 | 0.000000 | 0.000000 | 0 | 0 || init | 0.000015 | 0.000000 | 0.000000 | 0 | 0 || optimizing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 || statistics | 0.000015 | 0.000000 | 0.000000 | 0 | 0 || preparing | 0.000012 | 0.000000 | 0.000000 | 0 | 0 || executing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 || Sending data | 1.178452 | 1.177821 | 0.000000 | 0 | 0 || end | 0.000016 | 0.000000 | 0.000000 | 0 | 0 || query end | 0.000005 | 0.000000 | 0.000000 | 0 | 0 || freeing items | 0.000040 | 0.000000 | 0.000000 | 0 | 0 || logging slow query | 0.000002 | 0.000000 | 0.000000 | 0 | 0 || logging slow query | 0.000086 | 0.000000 | 0.000000 | 0 | 0 || cleaning up | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |+——————————–+———-+———-+————+————–+—————+
Comparison performance:
5.1.45root@test 08:57:18>set profiling =1;Query OK, 0 rows affected (0.00 sec)root@test 08:57:21>select count(*) from test;+———-+| count(*) |+———-+| 5242880 |+———-+1 row in set (1.30 sec)root@test 08:57:27>show profile cpu,block io for query 1;+——————————–+———-+———-+————+————–+—————+| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |+——————————–+———-+———-+————+————–+—————+| starting | 0.000026 | 0.000000 | 0.000000 | 0 | 0 || checking query cache for query | 0.000041 | 0.000000 | 0.000000 | 0 | 0 || Opening tables | 0.000014 | 0.000000 | 0.000000 | 0 | 0 || System lock | 0.000005 | 0.000000 | 0.000000 | 0 | 0 || Table lock | 0.000008 | 0.000000 | 0.000000 | 0 | 0 || init | 0.000015 | 0.000000 | 0.000000 | 0 | 0 || optimizing | 0.000006 | 0.000000 | 0.000000 | 0 | 0 || statistics | 0.000014 | 0.000000 | 0.000000 | 0 | 0 || preparing | 0.000012 | 0.000000 | 0.000000 | 0 | 0 || executing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 || Sending data | 1.294178 | 1.293803 | 0.000000 | 0 | 0 || end | 0.000016 | 0.000000 | 0.000000 | 0 | 0 || query end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 || freeing items | 0.000040 | 0.000000 | 0.001000 | 0 | 0 || logging slow query | 0.000002 | 0.000000 | 0.000000 | 0 | 0 || logging slow query | 0.000080 | 0.000000 | 0.000000 | 0 | 0 || cleaning up | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |+——————————–+———-+———-+————+————–+—————+
From the above profile, we can see that the difference in Sending data is obvious. mysql does not need to scan the page block of the entire table, instead, it scans the index page blocks with the shortest index key in the table to complete the query, which reduces unnecessary data.
PS: innodb is a transaction engine. Therefore, in addition to storing records of this row, the leaf node also records transaction information (DB_TRX_ID, DB_ROLL_PTR, and so on ), therefore, the length of a single row is about 20 bytes. the most intuitive method is to convert myisam to innodb, and the storage space will increase significantly. If the primary table is t (id, name, pk (id), and the secondary index ind_name (name, id), it is easy to confuse, even if there are only two fields, the first index is larger than the second index (the internal structure of the table can be observed through innodb_table_monitor). when querying all IDs, the optimizer selects the second index ind_name.