MySQL database table design optimization


1. Choosing optimal data types

MySQL supports many different data types, and selecting the correct data type is crucial for high performance. No matter what type you choose, the following simple principles will help you make better choices:

(1) Smaller is usually better

In general, use the smallest data type that can correctly store and represent your data. Smaller data types are usually faster because they use less disk space, memory, and CPU cache, and they require fewer CPU cycles to process.

However, do not underestimate the range of values you need to store, because widening a data type in many places in a schema is painful and time-consuming. If you are unsure which type is best, choose the smallest one that you do not expect to outgrow.

(2) Simple is good

The simpler the data type, the fewer CPU cycles are needed to process it. For example, comparing integers is cheaper than comparing characters, because character sets and collations make character comparisons more complex.

(3) Avoid NULL where possible

Define columns as NOT NULL whenever you can. Many tables contain nullable columns even though the application never needs to store NULL, simply because nullable is the default. Unless you really intend to store NULL in a column, define it as NOT NULL.

It is harder for MySQL to optimize queries that refer to nullable columns, because they make indexes, index statistics, and value comparisons more complex. A nullable column uses more storage space and requires special processing inside MySQL. When a nullable column is indexed, each entry requires an extra byte, and in MyISAM this can even turn a fixed-size index (for example, an index on a single integer column) into a variable-size one.

Even when you need to record "no value" in a table, you may not need NULL. Consider using zero, a special value, or an empty string instead.

Changing existing NULL columns to NOT NULL usually brings only a small performance improvement, so do not treat it as a priority unless you know the columns are causing problems. If you plan to index a column, however, avoid making it nullable if you can.

2. Integer types

There are two kinds of numbers: integers and real numbers. To store integers, use one of the integer types: TINYINT, SMALLINT, MEDIUMINT, INT, or BIGINT. They require 8, 16, 24, 32, and 64 bits of storage, respectively.

Integer types can optionally have the UNSIGNED attribute, which disallows negative values and roughly doubles the upper limit of positive values. For example, a TINYINT UNSIGNED can store values from 0 to 255 instead of from -128 to 127.
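The ranges follow directly from the bit widths listed above. The following is a minimal Python sketch of that arithmetic; the helper names `integer_range` and `smallest_type_for` are hypothetical and exist only for illustration, they are not MySQL functions.

```python
# Storage width in bits for each MySQL integer type, as listed above.
INTEGER_BITS = {
    "TINYINT": 8,
    "SMALLINT": 16,
    "MEDIUMINT": 24,
    "INT": 32,
    "BIGINT": 64,
}


def integer_range(type_name, unsigned=False):
    """Return (min, max) for an integer type of the given width."""
    bits = INTEGER_BITS[type_name.upper()]
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_type_for(low, high):
    """Hypothetical helper: pick the narrowest type that holds [low, high]."""
    unsigned = low >= 0
    for name in INTEGER_BITS:  # dicts preserve insertion order (narrowest first)
        type_min, type_max = integer_range(name, unsigned)
        if type_min <= low and high <= type_max:
            return name + (" UNSIGNED" if unsigned else "")
    raise ValueError("range does not fit in BIGINT")


if __name__ == "__main__":
    for name in INTEGER_BITS:
        print(name, integer_range(name), integer_range(name, unsigned=True))
    print(smallest_type_for(0, 200))
    print(smallest_type_for(-1, 40000))
```

A helper like this only captures the numeric range; leaving headroom for future growth remains a design decision.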

Signed and unsigned types use the same storage space and have the same performance, so use whichever suits your data range.

Your choice determines how MySQL stores the data, both in memory and on disk. Integer computations, however, generally use 64-bit BIGINT integers.

MySQL also lets you specify a "width" for integer types, such as INT(11). This is meaningless for most applications: it does not restrict the legal range of values, but only specifies the number of characters that MySQL's interactive tools (such as the command-line client) reserve for display. For storage and computation, INT(1) is identical to INT(20).

3. Real number types

Real numbers are numbers that have a fractional part. However, they are not only for fractional values: you can also use DECIMAL to store integers that are too large for BIGINT. MySQL supports both exact and inexact types.

The FLOAT and DOUBLE types support approximate calculations with standard floating-point math. If you need to know exactly how floating-point results are computed, you have to look into your platform's floating-point implementation.

Floating-point types typically use less space than DECIMAL to store the same range of values. A FLOAT column uses 4 bytes of storage. DOUBLE uses 8 bytes and has greater precision and a larger range. As with integers, you are choosing only the storage type: MySQL uses DOUBLE for its internal floating-point calculations.

Because of the additional space and computational cost, use DECIMAL only when you need exact results for fractional numbers, for example when storing financial data.
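The difference between approximate and exact types can be illustrated outside MySQL. The sketch below uses Python's `struct` and `decimal` modules as stand-ins, on the assumption that FLOAT and DOUBLE behave like 4-byte and 8-byte IEEE 754 values; it is not MySQL's own arithmetic.

```python
import struct
from decimal import Decimal


def as_float32(x):
    """Round-trip a value through a 4-byte float, similar to a FLOAT column."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 has no exact binary representation, so repeated addition drifts.
double_total = sum([0.1] * 10)                 # 8-byte doubles (approximate)
decimal_total = sum([Decimal("0.10")] * 10)    # decimal arithmetic (exact)

if __name__ == "__main__":
    print("double total :", double_total)
    print("decimal total:", decimal_total)
    print("0.1 as float32:", as_float32(0.1))
```

The decimal total is exactly 1.00, while the double total is only very close to 1. That small drift is the reason exact types are preferred for money.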

4. String types

VARCHAR and CHAR types

VARCHAR: stores variable-length strings and is the most commonly used string type. It can require less storage space than fixed-length types because it uses only as much space as each value needs (shorter values take less space). It uses an additional 1 or 2 bytes to record the length of the value.

VARCHAR helps performance because it saves space. However, because rows are variable-length, they can grow when they are updated, which causes extra work. VARCHAR is usually a good choice when the maximum column length is much larger than the average length, when the column is rarely updated (so fragmentation is not a problem), and when you use a complex character set such as UTF-8, in which each character may occupy a different number of bytes. MySQL preserves trailing spaces when it stores and retrieves VARCHAR values.

CHAR: fixed length. MySQL removes trailing spaces when it retrieves CHAR values. CHAR is useful for very short strings or when all values are nearly the same length. For example, CHAR is a good choice for MD5 hashes of passwords, which are always the same length. CHAR is also better than VARCHAR for values that change frequently, because fixed-length rows are not prone to fragmentation. For very short columns, CHAR is more efficient than VARCHAR: a CHAR(1) column holding a single-byte character set uses only one byte, but a VARCHAR(1) uses two bytes because of the length byte.

The sibling types of CHAR and VARCHAR are BINARY and VARBINARY, which store binary strings. Binary strings are very similar to conventional strings, but they store bytes rather than characters. Padding is also different: MySQL pads BINARY values with \0 (the zero byte) instead of spaces and does not strip the padding when it retrieves the value.

Storing "hello" requires the same space in a VARCHAR(5) column as in a VARCHAR(200) column, but the shorter column still has a big advantage. The larger column can use much more memory, because MySQL often allocates fixed-size blocks of memory to hold values internally. This is especially bad for sorting and for operations that use in-memory temporary tables. The same thing happens with filesorts that use on-disk temporary tables.
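The storage rules described above for CHAR and VARCHAR can be written down as a small model. This is a simplified sketch that assumes a fixed number of bytes per character and a length prefix of 1 byte for columns up to 255 bytes and 2 bytes otherwise; the function names are hypothetical, and real storage engines add their own per-row overhead.

```python
def char_bytes(declared_len, bytes_per_char=1):
    """CHAR(n) always reserves room for n characters."""
    return declared_len * bytes_per_char


def varchar_bytes(value, declared_len, bytes_per_char=1):
    """VARCHAR(n) stores the value itself plus a 1- or 2-byte length prefix."""
    if len(value) > declared_len:
        raise ValueError("value is longer than the declared column length")
    max_bytes = declared_len * bytes_per_char
    prefix = 1 if max_bytes <= 255 else 2
    return len(value) * bytes_per_char + prefix


if __name__ == "__main__":
    print(char_bytes(1), varchar_bytes("a", 1))
    print(varchar_bytes("hello", 5), varchar_bytes("hello", 200))
```

The model covers only on-disk size. It says nothing about the fixed-size memory blocks MySQL may allocate for the declared length, which is why the shorter declaration is still preferable.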

5. BLOB and TEXT types

BLOB and TEXT are string types designed to store large amounts of data as binary or character strings, respectively.

In fact, each is a family of data types: the character types are TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT, and the binary types are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB.

Unlike all other data types, MySQL handles each BLOB and TEXT value as an object with its own identity, and storage engines often store these values specially. When they are large, InnoDB may place them in a separate "external" storage area. Each value then needs from 1 to 4 bytes in the row, plus enough space in external storage to hold the actual value.

The only difference between BLOB and TEXT is that BLOB stores binary data with no character set or collation, whereas TEXT stores character data with a character set and a collation.

MySQL sorts BLOB and TEXT columns differently from other types. Instead of sorting on the full length of the string, it sorts only on the first max_sort_length bytes. If you need to sort by only the first few characters, you can either lower the value of max_sort_length or use ORDER BY SUBSTRING(column, length). MySQL cannot index the full length of these data types and cannot use such indexes for sorting.
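The effect of sorting on a prefix only can be imitated in a few lines of Python. This is an illustration of the idea, not MySQL's sort implementation; the function name and sample rows are made up for the example.

```python
def sort_like_max_sort_length(values, max_sort_length):
    """Order byte strings by their first max_sort_length bytes only."""
    return sorted(values, key=lambda v: v[:max_sort_length])


rows = [b"mysql-zeta", b"mysql-alpha", b"apache", b"mysql-beta"]

if __name__ == "__main__":
    # With a 5-byte prefix, every "mysql-..." row compares as equal.
    print(sort_like_max_sort_length(rows, 5))
    # With a prefix long enough to cover the values, the order is complete.
    print(sort_like_max_sort_length(rows, 1024))
```

Python's sort is stable, so rows that tie on the prefix keep their input order here. A database makes no such promise for ties, so values that differ only beyond the prefix can come back in any relative order.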

6. Using ENUM instead of a string type

An ENUM column can store up to 65,535 distinct string values. MySQL stores them very compactly, packed into 1 or 2 bytes depending on the number of values in the list. Internally, it stores each value as an integer representing its position in the list, and it keeps a "lookup table" that maps these integers to their strings in the table's .frm file.
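The mapping between strings and stored integers can be pictured with a toy model. The class below is a hypothetical sketch of the idea, assuming 1-based positions and a 1-byte encoding for lists of up to 255 values; it does not reproduce MySQL's actual file format.

```python
class EnumColumn:
    """Toy model of an ENUM column: strings go in, small integers are stored."""

    def __init__(self, *values):
        if not 1 <= len(values) <= 65535:
            raise ValueError("an ENUM list holds between 1 and 65,535 values")
        # The "lookup table": position in the definition list <-> string.
        self.lookup = list(values)
        self.positions = {v: i for i, v in enumerate(values, start=1)}

    @property
    def storage_bytes(self):
        """Bytes needed per stored value, based on the size of the list."""
        return 1 if len(self.lookup) <= 255 else 2

    def encode(self, value):
        """String -> stored integer (raises KeyError for unknown strings)."""
        return self.positions[value]

    def decode(self, stored):
        """Stored integer -> string, via the lookup table."""
        return self.lookup[stored - 1]


if __name__ == "__main__":
    status = EnumColumn("new", "paid", "shipped")
    print(status.encode("paid"), status.decode(3), status.storage_bytes)
```

Every read has to go through `decode`, which is the lookup overhead that comes with the compact storage.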

The biggest downside of ENUM is that the list of strings is fixed: adding or removing a string requires ALTER TABLE. It is therefore not a good idea to use ENUM for a list of strings that is likely to change in the future. MySQL itself uses ENUM in its internal privilege tables to store Y and N values.

Because MySQL stores each value as an integer and has to do a lookup to convert it to its string representation, ENUM columns have some overhead. This is usually offset by their smaller size, but not always. In particular, joining a CHAR or VARCHAR column to an ENUM column can be slower than joining it to another CHAR or VARCHAR column.

7. Date and time types

MySQL has many types for storing date and time values, such as YEAR and DATE. In the MySQL versions this article describes, the finest granularity of time that can be stored is one second, although temporal computations can be carried out at sub-second granularity.

MySQL provides two very similar data types, DATETIME and TIMESTAMP. Either works for many applications, but in some cases one is a better choice than the other.

DATETIME: holds a large range of values, from the year 1000 to the year 9999. It stores the date and time packed into an integer in YYYYMMDDHHMMSS format, independent of the time zone. It uses 8 bytes of storage space.
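The packed YYYYMMDDHHMMSS representation is easy to reproduce. The sketch below shows the arithmetic only, with hypothetical helper names; it makes no claim about the exact bytes MySQL writes to disk.

```python
from datetime import datetime


def pack_datetime(dt):
    """Pack a datetime into one integer in YYYYMMDDHHMMSS form."""
    value = dt.year
    for part in (dt.month, dt.day, dt.hour, dt.minute, dt.second):
        value = value * 100 + part
    return value


def unpack_datetime(value):
    """Reverse pack_datetime by peeling off two decimal digits at a time."""
    value, second = divmod(value, 100)
    value, minute = divmod(value, 100)
    value, hour = divmod(value, 100)
    value, day = divmod(value, 100)
    year, month = divmod(value, 100)
    return datetime(year, month, day, hour, minute, second)


if __name__ == "__main__":
    packed = pack_datetime(datetime(2008, 1, 16, 22, 37, 8))
    print(packed, unpack_datetime(packed))
```

Even the largest value, 9999-12-31 23:59:59, fits comfortably in a 64-bit integer, which is consistent with the 8 bytes of storage mentioned above.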

TIMESTAMP: stores the number of seconds elapsed since midnight, January 1, 1970 (Greenwich Mean Time), the same as a Unix timestamp. It uses only 4 bytes of storage, so its range is much smaller than that of DATETIME: from the year 1970 to partway through the year 2038. MySQL provides the FROM_UNIXTIME() function to convert a Unix timestamp to a date, and the UNIX_TIMESTAMP() function to convert a date to a Unix timestamp.

The value that a TIMESTAMP displays depends on the time zone. The MySQL server, the operating system, and client connections all have time zone settings. Thus, a TIMESTAMP that stores the value 0 is displayed as 1969-12-31 19:00:00 in a time zone that is five hours behind GMT.
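Both the limited range and the time zone dependence fall out of the "seconds since 1970" representation. The following Python sketch mimics the conversions with functions named after their MySQL counterparts; they are illustrative stand-ins, and the assumption of a signed 32-bit counter is stated in the code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_SECONDS = 2**31 - 1  # assumption: the 4 bytes hold a signed 32-bit count


def from_unixtime(seconds, utc_offset_hours=0):
    """Convert seconds since the epoch to a datetime in the given UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return (EPOCH + timedelta(seconds=seconds)).astimezone(zone)


def unix_timestamp(dt):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int((dt - EPOCH).total_seconds())


if __name__ == "__main__":
    fmt = "%Y-%m-%d %H:%M:%S"
    print(from_unixtime(0, utc_offset_hours=-5).strftime(fmt))
    print(from_unixtime(MAX_SECONDS).strftime(fmt))
```

The stored number is the same for every client; only the rendering changes with the time zone.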

TIMESTAMP also has special properties that DATETIME does not have. By default, if you insert a row without specifying a value for the TIMESTAMP column, MySQL sets it to the current time. By default, MySQL also updates the TIMESTAMP column when you update the row, unless you assign a value to it explicitly in the UPDATE statement. You can configure the insert and update behavior of any TIMESTAMP column. Finally, TIMESTAMP columns are NOT NULL by default, which is different from every other data type.

8. Choosing identifiers

Choosing a good data type for an identifier column is very important. You are more likely to compare identifier columns with other values and to use them as foreign keys in other tables, so when you choose a data type for an identifier column, you are probably choosing the type for columns in related tables as well.

When choosing a type for an identifier column, consider not only the storage type but also how MySQL performs computations and comparisons on that type. For example, MySQL stores ENUM and SET types internally as integers but converts them to strings when comparing them in a string context.

Once you choose a type, make sure you use the same type in all related tables. The types should match exactly, including attributes such as UNSIGNED. Mixing different data types can cause performance problems, and even where it does not, implicit type conversions during comparisons can create hard-to-find errors. These may surface much later, after you have forgotten that you are comparing different data types.

Choose the smallest type that can hold your required range of values, and leave room for future growth. For example, if a province_id column stores the provinces of China, you know there will never be thousands of values, so there is no need for INT. A TINYINT is sufficient and is three bytes smaller than INT. If the column is used as a foreign key in other tables, those three bytes per row add up, and a mismatch such as a TINYINT primary key in one table and an INT foreign key in another can noticeably hurt performance.

Integers are usually the best choice for identifiers because they are fast and work with AUTO_INCREMENT.

ENUM and SET are generally poor choices for identifiers, although they can be acceptable for static "definition tables" that hold status or "type" values.

ENUM and SET columns are suitable for holding fixed information such as gender, country, or province.

Avoid string types for identifiers if possible, because they take up a lot of space and are generally slower than integer types. Be especially cautious about string identifiers in MyISAM tables. MyISAM uses packed indexes for strings by default, which can make lookups much slower.

MyISAM uses prefix compression to reduce index size. By default it compresses only strings, but integers can be compressed as well.

You can control how indexes are compressed with the PACK_KEYS option of CREATE TABLE.

The MySQL manual describes PACK_KEYS as follows:

Set this option to 1 if you want smaller indexes. This usually makes updates slower and reads faster. Setting the option to 0 disables all packing of keys. Setting it to DEFAULT tells the storage engine to pack only long CHAR or VARCHAR columns (MyISAM only).

If you do not use PACK_KEYS, the default is to pack strings but not numbers. If you use PACK_KEYS=1, numbers are packed as well.
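Prefix compression in general can be sketched in a few lines. The code below illustrates the generic technique on a sorted list of keys, where each key records how many leading characters it shares with the previous key; it is an assumption-laden illustration with made-up function names and is not MyISAM's actual on-disk format.

```python
def prefix_compress(sorted_keys):
    """Encode each key as (length of prefix shared with previous key, suffix)."""
    packed, previous = [], ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(key), len(previous))
        while shared < limit and key[shared] == previous[shared]:
            shared += 1
        packed.append((shared, key[shared:]))
        previous = key
    return packed


def prefix_decompress(packed):
    """Rebuild the keys; each one depends on the key decoded before it."""
    keys, previous = [], ""
    for shared, suffix in packed:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


if __name__ == "__main__":
    keys = ["perform", "performance", "performer", "person"]
    packed = prefix_compress(keys)
    print(packed)
    print(prefix_decompress(packed) == keys)
```

The sketch also hints at why lookups get slower: a key in the middle of a block cannot be read on its own, because it can only be reconstructed by decoding the keys before it.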

9. Special types of data

Some kinds of data do not correspond directly to a built-in data type. A timestamp with sub-second precision is one example. Another is an IPv4 address. People often use VARCHAR(15) columns to store IP addresses, but an IPv4 address is really an unsigned 32-bit integer, not a string. The dotted notation exists only to make it easier for humans to read. You should therefore store IP addresses as unsigned integers. MySQL provides the INET_ATON() and INET_NTOA() functions to convert between the two representations.
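The conversion between the dotted form and the 32-bit integer is plain bit shifting. The Python functions below borrow the names of the MySQL functions for readability but are only an illustrative sketch; in particular, this version accepts only full four-part addresses, which is stricter than MySQL's INET_ATON().

```python
def inet_aton(address):
    """Convert a dotted IPv4 address such as '192.168.1.1' to an integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four numbers between 0 and 255")
    value = 0
    for part in parts:
        value = (value << 8) | part  # shift in one byte at a time
    return value


def inet_ntoa(value):
    """Convert an unsigned 32-bit integer back to dotted IPv4 notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    number = inet_aton("192.168.1.1")
    print(number, inet_ntoa(number))
```

Because the largest address maps to 4294967295, the result needs an INT UNSIGNED column; a signed INT cannot hold the upper half of the address space.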

