MySQL Schema Design (II): Budget Carefully So the Oil and Salt Never Run Out

Source: Internet
Author: User
Tags: time zones

Ru Zhijuan wrote in "Sisters-in-Law": "Look at Hongying again: she would not let even half a morning's work slip, and nobody ever saw her eat a snack. One glance told you she was someone who budgets carefully and knows how to run a household." Someone once joked that being married is the best qualification for a DBA. After all, you do not know how expensive firewood and rice are until you run a household, and the resource capacity purchased at the end of the year has to last through a thousand meals. Moreover, during fault diagnosis and performance tuning it is not unusual to find that an application has brought the database down outright. Careful budgeting therefore has to be an everyday habit, and a good reputation is hard for a DBA to earn.


In the big data era, thrift remains a traditional virtue for DBAs, and choosing data types carefully matters for two reasons:
● Storage: smaller types use less memory and disk, and therefore less I/O to retrieve the same data

● Computing: smaller types reduce the CPU load


(I) Three Data Types
1. How much does the M in INT(M) matter?


M defaults to 11 for a signed INT, and the maximum display width is 255. No matter how large M is, an INT always occupies 4 bytes. M only specifies the display width and has nothing to do with the storage size or the value range of the type. Without the ZEROFILL attribute, M is meaningless. Simply put, it only controls how many characters are displayed, as a convenience for human readers. For storage and computing, INT(11) and INT(255) are identical.

mysql> create table t (id int(2));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t select 10086;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.01 sec)

mysql> alter table t change column id id int(16);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(16) zerofill;
Query OK, 1 row affected (0.19 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+------------------+
| id               |
+------------------+
| 0000000000010086 |
+------------------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(5) zerofill;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(6) zerofill;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+--------+
| id     |
+--------+
| 010086 |
+--------+
1 row in set (0.00 sec)
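The display-width behaviour in this session can be mimicked outside MySQL. The Python sketch below is only an analogy, not MySQL code: zerofill_display is a hypothetical helper that pads the way ZEROFILL does, and struct.pack is used to show that a 32-bit integer occupies 4 bytes however it is displayed.

```python
import struct


def zerofill_display(value: int, width: int) -> str:
    """Mimic INT(width) ZEROFILL: left-pad with zeros, never truncate."""
    return str(value).zfill(width)


value = 10086
for width in (2, 5, 6, 16):
    # Widths smaller than the number of digits leave the value unchanged.
    print(f"INT({width}) ZEROFILL -> {zerofill_display(value, width)}")

# Storage is fixed: a 32-bit signed integer always packs into 4 bytes.
stored = struct.pack("<i", value)
print(len(stored))
```

Only the printed form changes with the width; the packed value is the same 4 bytes in every case.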


2. Computing the maximum N in VARCHAR(N)


A common interview question: if a table has only one column, of type VARCHAR(N) with utf8 encoding, what is the maximum value of N?

Before calculating, consider the following limits:

● The maximum length of a row is 65535 bytes, except for the NDB engine. This also limits the number of columns. For example, with CHAR(255) utf8 columns, the maximum number of columns is 65535/(255*3) = 85 (rounded down), which gives a rough reference for how many columns a table can hold.

● Character set

latin1: each character occupies 1 byte.

gbk: each character occupies a maximum of 2 bytes.

utf8: each character occupies a maximum of 3 bytes.

● Length prefix

The actual length of the value is stored in a length prefix: 1 byte if the column needs at most 255 bytes, and 2 bytes if it needs more.

● Row overhead of 1 byte

With the latin1 character set, the third limit (the 2-byte length prefix) suggests that a length of 65533 should be available. However:

mysql> create table max_len_varchar(col varchar(65533) charset latin1);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

mysql> create table max_len_varchar(col varchar(65532) charset latin1);
Query OK, 0 rows affected (0.16 sec)

Therefore, MySQL reserves one more byte of the row, and the actual data starts from the 2nd byte.

So far, we can easily get the answer to the first question: (65535-1-2)/3 = 21844. We end this section with another interview question:

create table t (col1 int(11), col2 char(50), col3 varchar(N)) charset=utf8; What is the maximum N here? If you are interested, you can calculate it yourself.
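The arithmetic in this section can be written out as a small Python sketch. It is a paper calculation under the limits listed above, not a query against MySQL: max_varchar_n and its parameters are hypothetical names, and the exercise assumes that CHAR(50) in utf8 reserves 50*3 bytes and that one overhead byte covers the whole row.

```python
ROW_LIMIT = 65535  # maximum row size in bytes, not counting BLOB/TEXT


def max_varchar_n(bytes_per_char: int, other_bytes: int = 0,
                  overhead_bytes: int = 1, length_bytes: int = 2) -> int:
    """Largest N for a single VARCHAR(N) column under the stated limits."""
    available = ROW_LIMIT - overhead_bytes - length_bytes - other_bytes
    return available // bytes_per_char


# Single latin1 column (1 byte per character).
print(max_varchar_n(1))
# Single utf8 column (up to 3 bytes per character).
print(max_varchar_n(3))
# Exercise: INT (4 bytes) + CHAR(50) utf8 (50 * 3 bytes) + VARCHAR(N) utf8.
print(max_varchar_n(3, other_bytes=4 + 50 * 3))
# Rough column-count estimate for CHAR(255) utf8 columns.
print(ROW_LIMIT // (255 * 3))
```

Under these assumptions the sketch gives 65532, 21844, 21792 and 85. Storage engines may impose further limits, so the result is worth confirming with an actual CREATE TABLE.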


3. TIMESTAMP


Let's first look at a pitfall of the MySQL DATETIME type:

mysql> create table t (start_time datetime, stop_time datetime);
Query OK, 0 rows affected (0.12 sec)

mysql> insert into t (start_time, stop_time) values ("2014-01-19 21:46:18", "2014-01-20 00:21:31");
Query OK, 1 row affected (0.02 sec)

mysql> select start_time, stop_time, stop_time - start_time from t;
+---------------------+---------------------+------------------------+
| start_time          | stop_time           | stop_time - start_time |
+---------------------+---------------------+------------------------+
| 2014-01-19 21:46:18 | 2014-01-20 00:21:31 |                 787513 |
+---------------------+---------------------+------------------------+
1 row in set (0.00 sec)

Because DATETIME does not support direct arithmetic, both values are first converted to numbers (20140120002131 and 20140119214618) and then subtracted, which yields 787513 rather than the elapsed time. Apart from this pitfall, we should also prefer TIMESTAMP where possible. From the storage point of view, TIMESTAMP occupies only 4 bytes, compared with 8 bytes for DATETIME (before MySQL 5.6.4), so it is more space-efficient; DATE takes 3 bytes but stores no time of day. Some people use INT UNSIGNED to store a time value converted to a Unix timestamp, but this brings no benefit: MySQL provides from_unixtime() to convert a Unix timestamp to a date and unix_timestamp() to convert a date to a Unix timestamp, and TIMESTAMP is itself stored as a four-byte integer. Keep in mind that TIMESTAMP uses the default time zone of the system, so the same string value can produce different timestamps under different time zone settings, which is even worse.
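The number 787513 in the query output can be reproduced by hand. The Python sketch below is illustrative only; as_number is a hypothetical helper that builds the YYYYMMDDHHMMSS numeric form described above, and the result is compared with the real elapsed time.

```python
from datetime import datetime

start = datetime(2014, 1, 19, 21, 46, 18)
stop = datetime(2014, 1, 20, 0, 21, 31)


def as_number(dt: datetime) -> int:
    """Numeric form YYYYMMDDHHMMSS of a date and time."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


# Subtracting the numeric forms reproduces the misleading query result.
numeric_diff = as_number(stop) - as_number(start)
# Subtracting the actual points in time gives the elapsed seconds.
real_seconds = int((stop - start).total_seconds())

print(numeric_diff)
print(real_seconds)
```

The numeric difference is 787513, while the real interval is 9313 seconds (2 hours, 35 minutes, 13 seconds). In MySQL itself, TIMESTAMPDIFF(SECOND, start_time, stop_time) returns the elapsed seconds.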


The behavior rules of TIMESTAMP are complex and change between MySQL versions, so relying on experience can trip people up. Always verify that the database behaves the way you need. A good practice is to run SHOW CREATE TABLE and check the output after a TIMESTAMP column is modified. The following shows how the same DDL statement is handled in different versions.

create table t (col timestamp);

5.1: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.5: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.6: `col` timestamp NULL DEFAULT NULL

The change in version 5.6 is dramatic. Note that the 5.6 result depends on the explicit_defaults_for_timestamp setting being enabled, so check it on your own server.


As economic globalization intensifies, it is common to store data across different time zones. The most common tool for creating logical backups of data and schema is mysqldump. However, when we open a dump file, its header contains a line such as "/*!40103 SET TIME_ZONE='+00:00' */;". The default time zone of our client is:

mysql> select @@time_zone;
+-------------+
| @@time_zone |
+-------------+
| SYSTEM      |
+-------------+
1 row in set (0.00 sec)

SYSTEM indicates that MySQL uses the default time zone of the operating system, which here is UTC+8. The value displayed for a TIMESTAMP depends on the time zone, and the MySQL server, the OS, and client connections each have time zone settings. Perform the following simple test:

mysql> drop table if exists t;
mysql> create table t (col timestamp);
mysql> insert into t select now();
mysql> select * from t;
+---------------------+
| col                 |
+---------------------+
| 2014-01-25 10:42:44 |
+---------------------+
1 row in set (0.00 sec)

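$ mysqldump -uroot -poracle testdb t --where='col="2014-01-25 10:42:44"' | grep INSERT

Nothing is returned, so no data can be exported? The dump session runs with TIME_ZONE='+00:00', so the literal in the WHERE clause is compared in UTC and no longer matches the row. Two solutions are provided below.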

Method 1: add the option --tz-utc=0

$ mysqldump -uroot -p testdb t --tz-utc=0 --where='col="2014-01-25 10:42:44"' | grep INSERT

INSERT INTO `t` VALUES ('2014-01-25 10:42:44');

Method 2: use conversion functions

mysql> select unix_timestamp(col) from t;
+---------------------+
| unix_timestamp(col) |
+---------------------+
|          1390617764 |
+---------------------+

$ mysqldump -uroot -p testdb t --where='col=from_unixtime(1390617764)' | grep INSERT

INSERT INTO `t` VALUES ('2014-01-25 02:42:44');
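The relationship between the Unix timestamp and the two displayed times can be checked with a short Python sketch. It assumes, as in this test, that the server zone is UTC+8; the names UTC8, ts and wall are illustrative only.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
UTC8 = timezone(timedelta(hours=8))  # assumed server time zone (UTC+8)
FMT = "%Y-%m-%d %H:%M:%S"

ts = 1390617764  # the value returned by unix_timestamp(col)

# One stored instant, rendered in two zones.
in_utc = datetime.fromtimestamp(ts, tz=UTC).strftime(FMT)
in_utc8 = datetime.fromtimestamp(ts, tz=UTC8).strftime(FMT)
print(in_utc)
print(in_utc8)

# One wall-clock string, interpreted in two zones, gives two timestamps.
wall = datetime(2014, 1, 25, 10, 42, 44)
print(int(wall.replace(tzinfo=UTC8).timestamp()))
print(int(wall.replace(tzinfo=UTC).timestamp()))
```

The stored value is a single instant: it reads 02:42:44 in UTC (what the dump writes by default) and 10:42:44 in UTC+8 (what the client displayed). Conversely, the string "2014-01-25 10:42:44" interpreted as UTC is a timestamp 28800 seconds later, which is why the first mysqldump filter matched nothing.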


(II) Data Type Conversion

Basic principles:

(1) Do not apply type-conversion functions to columns in WHERE conditions. This mainly keeps indexed columns usable. Two suggestions:

① Do not wrap the column in a function

For example:
to_char(start_time, 'yyyy.mm.dd') between '2017. 123' and '2017. 123'
and
start_time between to_date('192. 123', 'yyyy.mm.dd') and to_date('192. 123', 'yyyy.mm.dd')
Always use the second form!

② Do not embed the column in an expression

For example:
start_time + 7 < sysdate;
and
start_time < sysdate - 7
Again, prefer the second form, which leaves the column bare.
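Why the bare column matters can be illustrated with a toy model. In the Python sketch below, a sorted list stands in for an index on start_time and a binary search stands in for an index range scan; the data and the names index and now are made up for illustration, and this is not how a database engine is actually implemented.

```python
from bisect import bisect_left

# A sorted list stands in for an index on start_time (as day numbers).
index = [1, 3, 5, 8, 13, 21, 34]
now = 20

# start_time < now - 7: the bound is computed once, then the sorted
# index is searched with a binary search.
cutoff = bisect_left(index, now - 7)
via_index = index[:cutoff]

# start_time + 7 < now: the expression must be evaluated for every
# entry, so the sort order of the index cannot be exploited.
via_scan = [t for t in index if t + 7 < now]

print(via_index)
print(via_index == via_scan)
```

Both forms select the same rows, but only the one with the bare column can use the ordering of the index; the other has to visit every entry.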


(2) Implicit conversion

For example, suppose that an index exists on a character column, and then:

select * from t where index_column = 8;

This statement is actually equivalent to:

select * from t where to_number(index_column) = 8;

Avoid implicit conversion as much as possible. Compare like with like, and leave values of other types out of the comparison.
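A toy model shows why the converted comparison cannot use the index. In the Python sketch below, sorted strings stand in for an index on a character column; the sample keys are made up, and float() is only a stand-in for the database's string-to-number conversion.

```python
from bisect import bisect_left, bisect_right

# Sorted strings stand in for an index on a character column.
index = sorted(["08", "1", "8", "8.0", "80", "9"])

# Comparing as strings: a binary search finds the exact key directly.
lo, hi = bisect_left(index, "8"), bisect_right(index, "8")
string_matches = index[lo:hi]

# Comparing as numbers: every key has to be converted first, and the
# matching keys are scattered across the string ordering.
numeric_matches = [s for s in index if float(s) == 8]

print(string_matches)
print(numeric_matches)
```

Several different strings convert to the same number, and they do not sit next to each other in string order, so the only way to find them all is to convert and test every key.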


(3) In a join query, if the data types on the two sides of the join condition are inconsistent, convert the join column of the driving table to the data type of the driven table, so that the driven table's column stays unconverted.


(III) Data Type Optimization


Below are some general principles for optimizing data types:

1. Smaller data types are usually better, and simpler data types are better.
2. Numeric operations are faster than character operations, and small data types are processed faster than large ones.
3. Fixed-length and variable-length tables differ in processing efficiency.
4. Avoid NULL as much as possible. If a placeholder value is needed, use 0 instead.

5. Use ENUM and SET.

6. Use INT UNSIGNED to store IP addresses, converting with inet_aton() and inet_ntoa().

7. Use DECIMAL instead of FLOAT and DOUBLE.

8. Prefer CHAR with MyISAM and VARCHAR with InnoDB.
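Principles 6 and 7 can be illustrated outside the database. The Python sketch below uses the standard ipaddress and decimal modules as stand-ins for inet_aton()/inet_ntoa() and for the DECIMAL type; it is an analogy, not MySQL code, and the sample address is arbitrary.

```python
import ipaddress
from decimal import Decimal

# Principle 6: an IPv4 address fits in a 4-byte unsigned integer.
ip_as_int = int(ipaddress.IPv4Address("192.168.0.1"))  # like inet_aton()
ip_as_text = str(ipaddress.IPv4Address(ip_as_int))     # like inet_ntoa()
print(ip_as_int, ip_as_text)

# Principle 7: binary floating point cannot represent 0.1 exactly,
# while decimal arithmetic is exact for such values.
float_ok = (0.1 + 0.2 == 0.3)
decimal_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(float_ok, decimal_ok)
```

The integer form of the address exceeds the signed 32-bit range, which is why an unsigned column is needed, and the float comparison fails where the decimal one succeeds.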


By Chiba

Good Luck!


