When I was reading CU today, I found someone asking about optimize's table optimization problem. It was a long time because of this problem. Today I am free to ask this question, we can use the actual data to show it, so that you can see with your own eyes that the important role of optimize table is not a plausible estimate.
I. Raw Data
1. Data Volume
Mysql> select count (*) as total from ad_visit_history;
+ --------- +
| Total |
+ --------- +
| 1187096 | // more than 1.18 million data entries in total
+ --------- +
1 row in set (0.04 sec)
2. Size of the table files stored in the hard disk
[Root @ www.bkjia.com test1] # ls | grep visit | xargs-I du {}
382020 ad_visit_history.MYD // data files account for 380 M
127116 ad_visit_history.MYI // index files account for 127 MB
12 ad_visit_history.frm // The structure file occupies 12 K
3. Check the index information.
Mysql> show index from ad_visit_history from test1; // you can view the index information of the table.
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
| Ad_visit_history | 0 | PRIMARY | 1 | id | A | 1187096 | NULL | BTREE |
| Ad_visit_history | 1 | ad_code | 1 | ad_code | A | 46 | NULL | YES | BTREE |
| Ad_visit_history | 1 | unique_id | 1 | unique_id | A | 1187096 | NULL | YES | BTREE |
| Ad_visit_history | 1 | ad_code_ind | 1 | ad_code | A | 46 | NULL | YES | BTREE |
| Ad_visit_history | 1 | from_page_url_ind | 1 | from_page_url | A | 30438 | NULL | YES | BTREE |
| Ad_visit_history | 1 | ip_ind | 1 | ip | A | 593548 | NULL | YES | BTREE |
| Ad_visit_history | 1 | port_ind | 1 | port | A | 65949 | NULL | YES | BTREE |
| Ad_visit_history | 1 | session_id_ind | 1 | session_id | A | 1187096 | NULL | YES | BTREE |
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
8 rows in set (0.28 sec)
Description of the columns in the index information.
Table: the name of the Table.
Non_unique: 0 if the index cannot contain duplicate words. If yes, it is 1.
Key_name: name of the index.
Seq_in_index: the column serial number in the index, starting from 1.
Column_name: column name.
Collation: How the column is stored in the index. In MySQLSHOW INDEX syntax, values 'A' (ascending) or NULL (unclassified) are available ).
Cardinality: the number of unique values in the index. You can update analyze table or myisamchk-a by running analyze table. The base number is counted based on the statistical data stored as an integer. Therefore, this value is not required to be accurate even for small tables. The larger the base, the larger the chance for MySQL to use the index for union.
Sub_part: If a column is partially indexed, it is the number of indexed characters. If the entire column is indexed, the value is NULL.
Packed: indicates how the keywords are compressed. If it is not compressed, It is NULL.
Null: If the column contains NULL, YES is included. If no, It is null.
Index_type: Method for storing index data structures (BTREE, FULLTEXT, HASH, RTREE)
Ii. delete half of the data
Mysql> delete from ad_visit_history where id> 598000; // delete half of the data
Query OK, 589096 rows affected (4 min 28.06 sec)
[Root @ www.bkjia.com test1] # ls | grep visit | xargs-I du {}// the corresponding MYD and MYI file size remain unchanged
382020 ad_visit_history.MYD
127116 ad_visit_history.MYI
12 ad_visit_history.frm
In general, if half of the data is deleted in the database, the corresponding. MYD and. MYI files should also be half of the previous one. However, after half of the data is deleted,. MYD. MYI has not been reduced by 1 kb, which is terrible.
Let's take a look at the index information.
Mysql> show index from ad_visit_history;
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
| Ad_visit_history | 0 | PRIMARY | 1 | id | A | 598000 | NULL | BTREE |
| Ad_visit_history | 1 | ad_code | 1 | ad_code | A | 23 | NULL | YES | BTREE |
| Ad_visit_history | 1 | unique_id | 1 | unique_id | A | 598000 | NULL | YES | BTREE |
| Ad_visit_history | 1 | ad_code_ind | 1 | ad_code | A | 23 | NULL | YES | BTREE |
| Ad_visit_history | 1 | from_page_url_ind | 1 | from_page_url | A | 15333 | NULL | YES | BTREE |
| Ad_visit_history | 1 | ip_ind | 1 | ip | A | 299000 | NULL | YES | BTREE |
| Ad_visit_history | 1 | port_ind | 1 | port | A | 33222 | NULL | YES | BTREE |
| Ad_visit_history | 1 | session_id_ind | 1 | session_id | A | 598000 | NULL | YES | BTREE |
+ ------------------ + ------------ + ------------------- + -------------- + --------------- + ----------- + ------------- + ---------- + -------- + -------------- + ----------- +
8 rows in set (0.00 sec)
In comparison, the data in this index query and the last index query is basically the last one, which is still reasonable.