Ru Zhiqiang once described Hongying this way: she never wasted even half an hour of work and was never seen eating a snack; one look and you knew she was a careful person who knew how to live. People sometimes joke that someone who has run a household makes the best DBA: if you have never managed one, you do not know how expensive firewood and rice are, and if the capacity bought at the start of the year is squandered, there is nothing left for the rest of it. Besides, during fault diagnosis and performance tuning it is not rare to see the OS or application layer drag the database down. Careful, frugal habits are therefore how a DBA earns a good reputation.
In the big data era, thrift remains a traditional virtue of the DBA, and it starts with choosing data types carefully. The reason is as follows:
● Computation: smaller, simpler values take less work to compare and compute, which reduces CPU load
(I) Three data type questions
1. What does the M in INT(M) mean?
By default M is 11, and the maximum valid display width is 255. No matter how large M is, an INT always occupies 4 bytes. M only specifies the display width; it has nothing to do with the storage size or the value range of the type. Leaving aside the zerofill attribute, M is essentially meaningless: it is purely a cosmetic hint about how many digits to display. For storage and computation, INT(11) and INT(255) are identical.
mysql> create table t (id int(2));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t select 10086;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.01 sec)

mysql> alter table t change column id id int(16);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(16) zerofill;
Query OK, 1 row affected (0.19 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from t;
+------------------+
| id               |
+------------------+
| 0000000000010086 |
+------------------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(5) zerofill;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+-------+
| id    |
+-------+
| 10086 |
+-------+
1 row in set (0.00 sec)

mysql> alter table t change column id id int(6) zerofill;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from t;
+--------+
| id     |
+--------+
| 010086 |
+--------+
1 row in set (0.00 sec)
2. What is the maximum value of N in VARCHAR(N)?
This is a classic interview question: if a table contains only a single VARCHAR(N) column with utf8 encoding, what is the maximum value of N?
Instead of rushing to calculate, let's first look at the constraints involved:
● Maximum row length: a row can hold at most 65535 bytes (except in the NDB engine). This also limits the number of columns: with char(255) columns in utf8, for example, at most 65535 / (255 * 3) = 85 columns fit, which gives a useful reference for column counts.
● Character set
latin1: 1 byte per character
gbk: at most 2 bytes per character
utf8: at most 3 bytes per character
● Length prefix
VARCHAR stores the actual data length in a length prefix: 1 byte if the column can hold at most 255 bytes, 2 bytes if it can hold more.
● 1 byte of per-row overhead
Going by the first three constraints, a latin1 column should allow a length of 65533 (65535 minus the 2-byte length prefix). However:
mysql> create table max_len_varchar (col varchar(65533) charset latin1);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

mysql> create table max_len_varchar (col varchar(65532) charset latin1);
Query OK, 0 rows affected (0.16 sec)
So on top of the 2-byte length prefix, MySQL also needs the 1 byte of per-row overhead mentioned above, which leaves 65532 as the real maximum for latin1.
With that, the answer to the interview question is easy: (65535 - 1 - 2) / 3 = 21844. Let's end this part with another interview question:
create table t (col1 int(11), col2 char(50), col3 varchar(N)) charset=utf8; — what is the maximum N here? If you are interested, work it out for yourself.
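If you want to verify the first answer on a real server, here is a quick sketch (assuming utf8 is the 3-byte utf8mb3, as in MySQL 5.x, and a single nullable column; the exact error text can vary by version and sql_mode):

mysql> create table v (col varchar(21845)) charset utf8;
ERROR 1118 (42000): Row size too large. ...

mysql> create table v (col varchar(21844)) charset utf8;
Query OK, 0 rows affected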
3. timestamp
Let's first look at a "bug" in how MySQL's datetime behaves:
mysql> create table t (start_time datetime, stop_time datetime);
Query OK, 0 rows affected (0.12 sec)

mysql> insert into t (start_time, stop_time) values ("2014-01-25 21:46:18", "2014-01-26 00:21:31");
Query OK, 1 row affected (0.02 sec)

mysql> select start_time, stop_time, stop_time - start_time as duration from t;
+---------------------+---------------------+----------+
| start_time          | stop_time           | duration |
+---------------------+---------------------+----------+
| 2014-01-25 21:46:18 | 2014-01-26 00:21:31 |   787513 |
+---------------------+---------------------+----------+
1 row in set (0.00 sec)
The result 787513 is clearly not the number of seconds between the two times. The datetime type does not support direct arithmetic: the "-" operator implicitly converts both values to numbers (YYYYMMDDHHMMSS) and subtracts them, which is why the result looks plausible but is wrong. Beyond this pitfall, we should prefer timestamp wherever possible: it occupies only 4 bytes, compared with 8 bytes for datetime, so it is more space efficient. Some people store times as Unix timestamps in an int unsigned column, but this brings no real benefit: MySQL already provides from_unixtime() to convert a Unix timestamp to a date and unix_timestamp() to convert a date to a Unix timestamp, and timestamp itself is essentially a 4-byte integer under the hood. Note, however, that timestamp is interpreted in the session (by default, the system) time zone, so the same string literal can be stored or displayed differently under different time zones, which is where things get worse.
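To get the actual elapsed time, use timestampdiff() or subtract Unix timestamps instead of relying on the implicit numeric conversion. A minimal sketch against the row inserted above (the figures assume the 2014-01-25/26 values shown there):

mysql> select timestampdiff(second, start_time, stop_time) as diff_seconds,
    ->        unix_timestamp(stop_time) - unix_timestamp(start_time) as diff_seconds2
    -> from t;
+--------------+---------------+
| diff_seconds | diff_seconds2 |
+--------------+---------------+
|         9313 |          9313 |
+--------------+---------------+
1 row in set (0.00 sec)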
The behavior of timestamp columns is complicated and has changed across MySQL versions, so relying purely on "experience" can trip you up. Always verify that the database behaves the way you need; a good practice is to run show create table after creating or modifying a timestamp column and inspect the resulting definition. The following shows what the same DDL statement produces in different versions.
create table t (col timestamp);

5.1: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.5: `col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
5.6: `col` timestamp NULL DEFAULT NULL
As you can see, the behavior of timestamp changed dramatically in 5.6.
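If the application relies on the automatic initialization and update behavior, a safer habit (a sketch, not tied to any particular version) is to spell the attributes out explicitly and then confirm what the server actually stored:

create table t (
  col timestamp not null default current_timestamp on update current_timestamp
);
show create table t\G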
With economic globalization, it is increasingly common for data to be spread across time zones. mysqldump is the most common way to create a logical backup of data and schema. Open the header of a dump file, however, and you will find a line like "/*!40103 SET TIME_ZONE='+00:00' */", while the default time zone of our client is:
mysql> select @@time_zone;
+-------------+
| @@time_zone |
+-------------+
| SYSTEM      |
+-------------+
1 row in set (0.00 sec)
mysql> drop table if exists t;
mysql> create table t (col timestamp);
mysql> insert into t select now();
mysql> select * from t;
+---------------------+
| col                 |
+---------------------+
| 2014-01-25 10:42:44 |
+---------------------+
1 row in set (0.00 sec)
$ mysqldump -uroot -poracle testdb t --where='col="2014-01-25 10:42:44"' | grep INSERT
grep returns nothing: the row cannot be exported this way. Why not? Because mysqldump enables --tz-utc by default, so the dump session runs with TIME_ZONE='+00:00' and the timestamp column is compared (and exported) in UTC, while our WHERE literal was written in the server's local time zone. The session below reproduces the mismatch; two workarounds follow.
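The mismatch is easy to reproduce by hand: switch the session to UTC, as mysqldump does, and the stored value no longer reads 10:42:44 (a sketch, assuming the row inserted above on a UTC+8 server):

mysql> set time_zone = '+00:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t;
+---------------------+
| col                 |
+---------------------+
| 2014-01-25 02:42:44 |
+---------------------+
1 row in set (0.00 sec)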
Method 1: Disable the UTC conversion with --tz-utc=0 (or --skip-tz-utc)
$ mysqldump -uroot -p testdb t --tz-utc=0 --where='col="2014-01-25 10:42:44"' | grep INSERT
Method 2: Use conversion functions
mysql> select unix_timestamp(col) from t;
+---------------------+
| unix_timestamp(col) |
+---------------------+
|          1390617764 |
+---------------------+
$ mysqldump -uroot -p testdb t --where='col=from_unixtime(1390617764)' | grep INSERT
INSERT INTO `t` VALUES ('2014-01-25 02:42:44');
(II) Data Type Conversion
Basic Principles:
① Do not apply a function to the column
For example:
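A minimal sketch (the original example is not preserved here), assuming a hypothetical table orders with an index on its datetime column gmt_create:

-- Wrapping the indexed column in a function defeats the index:
select * from orders where date(gmt_create) = '2014-01-25';

-- Compare the bare column against a range instead, so the index can be used:
select * from orders
 where gmt_create >= '2014-01-25 00:00:00'
   and gmt_create <  '2014-01-26 00:00:00';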
② Do not bury the column inside an expression
For example, suppose an index has been created on a character-typed column, and then:
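Again the original example is not preserved; a minimal sketch, assuming a hypothetical table t with an index on its character column code:

-- The numeric literal forces MySQL to cast code to a number for every row,
-- effectively burying the column inside an expression, so the index is not used:
select * from t where code = 10086;

-- Keep the column bare and match the literal's type, and the index can be used:
select * from t where code = '10086';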
(III) Data Type Optimization
Below are some general principles for optimizing data types:
1. Smaller data types are usually better. The simpler the data type, the better.
5. Use enum and set
6. Use an int to store IP addresses, converting with inet_aton() and inet_ntoa() (see the sketch after this list)
7. Use decimal instead of float & double
8. With MyISAM, prefer char; with InnoDB, prefer varchar.
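As an illustration of principle 6, a minimal sketch with hypothetical table and column names:

-- Store the dotted-quad form as a 4-byte unsigned integer:
create table access_log (ip int unsigned);
insert into access_log values (inet_aton('192.168.1.100'));

-- Convert back for display:
select inet_ntoa(ip) from access_log;
+---------------+
| inet_ntoa(ip) |
+---------------+
| 192.168.1.100 |
+---------------+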