Ru Zhiqiang once described Hongying this way: she never wasted even half an hour of work and was never seen eating a snack; one look and you knew she was a careful person who knew how to live. People sometimes joke that someone who has run a household makes the best DBA: if you have never managed one, you do not know how expensive firewood and rice are, and if the capacity bought at the start of the year is squandered, there is nothing left for the rest of it. Besides, during fault diagnosis and performance tuning it is not rare to see the OS or application layer drag the database down. Careful, frugal habits are therefore how a DBA earns a good reputation.
In the big data era, thrift remains a traditional virtue of the DBA, and it starts with choosing data types carefully. The reason is as follows:
● Computation: smaller, simpler values take less work to compare and compute, which reduces CPU load
(I) Three data type questions
1. What does the M in INT(M) mean?
By default M is 11, and the maximum valid display width is 255. No matter how large M is, an INT always occupies 4 bytes. M only specifies the display width; it has nothing to do with the storage size or the value range of the type. Leaving aside the zerofill attribute, M is essentially meaningless: it is purely a cosmetic hint about how many digits to display. For storage and computation, INT(11) and INT(255) are identical.
mysql> create table t (id int(2));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t select 10086;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.01 sec)

mysql> alter table t change column id id int(16);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(16) zerofill;
Query OK, 1 row affected (0.19 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+------------------+
| id               |
+------------------+
| 0000000000010086 |
+------------------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(5) zerofill;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(6) zerofill;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+--------+
| id     |
+--------+
| 010086 |
+--------+
1 row in set (0.00 sec)
2. What is the maximum value of N in VARCHAR(N)?
This is a classic interview question: if a table contains only a single VARCHAR(N) column with utf8 encoding, what is the maximum value of N?
Instead of rushing to calculate, let's first look at the constraints involved:
● Maximum row length: a row can hold at most 65535 bytes (except in the NDB engine). This also limits the number of columns: with char(255) columns in utf8, for example, at most 65535 / (255 * 3) = 85 columns fit, which gives a useful reference for column counts.
● Character set
latin1: 1 byte per character
gbk: at most 2 bytes per character
utf8: at most 3 bytes per character
● Length prefix
VARCHAR stores the actual data length in a length prefix: 1 byte if the column can hold at most 255 bytes, 2 bytes if it can hold more.
● 1 byte of per-row overhead
Going by the first three constraints, a latin1 column should allow a length of 65533 (65535 minus the 2-byte length prefix). However:
mysql> create table max_len_varchar (col varchar(65533) charset latin1);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

mysql> create table max_len_varchar (col varchar(65532) charset latin1);
Query OK, 0 rows affected (0.16 sec)
So on top of the 2-byte length prefix, MySQL also needs the 1 byte of per-row overhead mentioned above, which leaves 65532 as the real maximum for latin1.
With that, the answer to the interview question is easy: (65535 - 1 - 2) / 3 = 21844. Let's end this part with another interview question:
create table t (col1 int(11), col2 char(50), col3 varchar(N)) charset=utf8; — what is the maximum N here? If you are interested, work it out for yourself.
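If you want to verify the first answer on a real server, here is a quick sketch (assuming utf8 is the 3-byte utf8mb3, as in MySQL 5.x, and a single nullable column; the exact error text can vary by version and sql_mode):

mysql> create table v (col varchar(21845)) charset utf8;
ERROR 1118 (42000): Row size too large. ...

mysql> create table v (col varchar(21844)) charset utf8;
Query OK, 0 rows affected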
3. timestamp
Let's first look at a "bug" in how MySQL's datetime behaves:
mysql> create table t (start_time datetime, stop_time datetime);
Query OK, 0 rows affected (0.12 sec)

mysql> insert into t (start_time, stop_time) values ("2014-01-25 21:46:18", "2014-01-26 00:21:31");
Query OK, 1 row affected (0.02 sec)

mysql> select start_time, stop_time, stop_time - start_time as duration from t;
+---------------------+---------------------+----------+
| start_time          | stop_time           | duration |
+---------------------+---------------------+----------+
| 2014-01-25 21:46:18 | 2014-01-26 00:21:31 |   787513 |
+---------------------+---------------------+----------+
1 row in set (0.00 sec)
The result 787513 is clearly not the number of seconds between the two times. The datetime type does not support direct arithmetic: the "-" operator implicitly converts both values to numbers (YYYYMMDDHHMMSS) and subtracts them, which is why the result looks plausible but is wrong. Beyond this pitfall, we should prefer timestamp wherever possible: it occupies only 4 bytes, compared with 8 bytes for datetime, so it is more space efficient. Some people store times as Unix timestamps in an int unsigned column, but this brings no real benefit: MySQL already provides from_unixtime() to convert a Unix timestamp to a date and unix_timestamp() to convert a date to a Unix timestamp, and timestamp itself is essentially a 4-byte integer under the hood. Note, however, that timestamp is interpreted in the session (by default, the system) time zone, so the same string literal can be stored or displayed differently under different time zones, which is where things get worse.
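To get the actual elapsed time, use timestampdiff() or subtract Unix timestamps instead of relying on the implicit numeric conversion. A minimal sketch against the row inserted above (the figures assume the 2014-01-25/26 values shown there):

mysql> select timestampdiff(second, start_time, stop_time) as diff_seconds,
    ->        unix_timestamp(stop_time) - unix_timestamp(start_time) as diff_seconds2
    -> from t;
+--------------+---------------+
| diff_seconds | diff_seconds2 |
+--------------+---------------+
|         9313 |          9313 |
+--------------+---------------+
1 row in set (0.00 sec)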
The behavior of timestamp columns is complicated and has changed across MySQL versions, so relying purely on "experience" can trip you up. Always verify that the database behaves the way you need; a good practice is to run show create table after creating or modifying a timestamp column and inspect the resulting definition. The following shows what the same DDL statement produces in different versions.
create table t (col timestamp);

5.1: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.5: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.6: `col` timestamp NULL DEFAULT NULL
As you can see, the behavior of timestamp changed dramatically in 5.6.
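If the application relies on the automatic initialization and update behavior, a safer habit (a sketch, not tied to any particular version) is to spell the attributes out explicitly and then confirm what the server actually stored:

create table t (
  col timestamp not null default current_timestamp on update current_timestamp
);
show create table t\G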
With economic globalization, it is increasingly common for data to be spread across time zones. mysqldump is the most common way to create a logical backup of data and schema. Open the header of a dump file, however, and you will find a line like "/*!40103 SET TIME_ZONE='+00:00' */", while the default time zone of our client is:
mysql> select @@time_zone;
+-------------+
| @@time_zone |
+-------------+
| SYSTEM      |
+-------------+
1 row in set (0.00 sec)
mysql> drop table if exists t;
mysql> create table t (col timestamp);
mysql> insert into t select now();
mysql> select * from t;
+---------------------+
| col                 |
+---------------------+
| 2014-01-25 10:42:44 |
+---------------------+
1 row in set (0.00 sec)
$ mysqldump -uroot -poracle testdb t --where='col="2014-01-25 10:42:44"' | grep INSERT
grep returns nothing: the row cannot be exported this way. Why not? Because mysqldump enables --tz-utc by default, so the dump session runs with TIME_ZONE='+00:00' and the timestamp column is compared (and exported) in UTC, while our WHERE literal was written in the server's local time zone. The session below reproduces the mismatch; two workarounds follow.
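The mismatch is easy to reproduce by hand: switch the session to UTC, as mysqldump does, and the stored value no longer reads 10:42:44 (a sketch, assuming the row inserted above on a UTC+8 server):

mysql> set time_zone = '+00:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t;
+---------------------+
| col                 |
+---------------------+
| 2014-01-25 02:42:44 |
+---------------------+
1 row in set (0.00 sec)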
Method 1: Disable the UTC conversion with --tz-utc=0 (or --skip-tz-utc)
$ mysqldump -uroot -p testdb t --tz-utc=0 --where='col="2014-01-25 10:42:44"' | grep INSERT
Method 2: Use conversion functions
mysql> select unix_timestamp(col) from t;
+---------------------+
| unix_timestamp(col) |
+---------------------+
|          1390617764 |
+---------------------+
$ mysqldump -uroot -p testdb t --where='col=from_unixtime(1390617764)' | grep INSERT
INSERT INTO `t` VALUES ('2014-01-25 02:42:44');
(II) Data Type Conversion
Basic Principles:
① Do not apply a function to the column
For example:
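A minimal sketch (the original example is not preserved here), assuming a hypothetical table orders with an index on its datetime column gmt_create:

-- Wrapping the indexed column in a function defeats the index:
select * from orders where date(gmt_create) = '2014-01-25';

-- Compare the bare column against a range instead, so the index can be used:
select * from orders
 where gmt_create >= '2014-01-25 00:00:00'
   and gmt_create <  '2014-01-26 00:00:00';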
② Do not bury the column inside an expression
For example, suppose an index has been created on a character-typed column, and then:
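Again the original example is not preserved; a minimal sketch, assuming a hypothetical table t with an index on its character column code:

-- The numeric literal forces MySQL to cast code to a number for every row,
-- effectively burying the column inside an expression, so the index is not used:
select * from t where code = 10086;

-- Keep the column bare and match the literal's type, and the index can be used:
select * from t where code = '10086';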
(III) Data Type Optimization
Below are some general principles for optimizing data types:
1. Smaller data types are usually better. The simpler the data type, the better.
5. Use enum and set
6. Use an int to store IP addresses, converting with inet_aton() and inet_ntoa() (see the sketch after this list)
7. Use decimal instead of float & double
8. With MyISAM, prefer char; with InnoDB, prefer varchar.
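As an illustration of principle 6, a minimal sketch with hypothetical table and column names:

-- Store the dotted-quad form as a 4-byte unsigned integer:
create table access_log (ip int unsigned);
insert into access_log values (inet_aton('192.168.1.100'));

-- Convert back for display:
select inet_ntoa(ip) from access_log;
+---------------+
| inet_ntoa(ip) |
+---------------+
| 192.168.1.100 |
+---------------+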