Five suggestions for optimizing MySQL table structure (choose data types well)

Source: Internet
Author: User

Many people do not realize that the three normal forms of database design, regarded as the "Bible" years ago, are no longer fully applicable. Here I have collected some common optimization techniques for database table structure design, hoping they will be useful to you.

MySQL is a row-oriented database, and database IO is performed in units of pages (blocks). If each record takes less space, more rows fit on each page, and each IO operation reads more rows. Put the other way, processing the same number of rows requires fewer pages and therefore fewer IO operations, which directly improves performance. In addition, because memory is limited, fitting more rows into each page means that each block of memory caches more data, which also raises the probability that requested data is already in memory, that is, the cache hit rate.
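
The arithmetic behind this can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of any real storage engine's layout: it assumes a 16 KB page (InnoDB's default page size), ignores page headers, per-row overhead, and fill factor, and the row sizes of 200 and 100 bytes are made-up examples.

```python
# Simplified model: ignores page headers, per-row overhead, and fill factor.
PAGE_SIZE = 16 * 1024  # assumed page size in bytes (InnoDB's default)


def rows_per_page(row_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """How many whole rows of row_bytes fit in one page (row_bytes <= page_size)."""
    return page_size // row_bytes


def pages_needed(total_rows: int, row_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Pages that must be read to scan total_rows rows (ceiling division)."""
    per_page = rows_per_page(row_bytes, page_size)
    return -(-total_rows // per_page)


if __name__ == "__main__":
    for row_bytes in (200, 100):  # hypothetical row sizes
        print(row_bytes, rows_per_page(row_bytes), pages_needed(1_000_000, row_bytes))
```

Under these assumptions, halving the row size roughly halves the number of pages a full scan of one million rows has to read.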

I. Data type selection
The most time-consuming part of database work is IO. Most database operations spend the bulk of their time, often put at more than 90%, on IO reads and writes. Reducing IO reads and writes can therefore greatly improve the performance of database operations.

We cannot change what data needs to be stored in the database, but we can put some effort into how it is stored. The following suggestions on field types mainly apply to scenarios with many records and a large volume of data, because fine-grained data type choices may increase maintenance costs, and excessive optimization can cause problems of its own:

1. Numeric types: Do not use DOUBLE unless you have no other choice; the concern is not only its storage size but also its precision. Likewise, DECIMAL is not recommended for fixed-precision decimals. Instead, multiply the value by a fixed factor and store it as an integer, which saves considerable storage space with little additional maintenance cost. For integers, when the data volume is large, distinguish carefully between TINYINT, INT, and BIGINT, because the three occupy very different amounts of storage (1, 4, and 8 bytes respectively). If a field will never hold negative numbers, add the UNSIGNED attribute. Of course, if the database is small, you do not need to distinguish the three integer types so strictly.
2. Character types: Do not use the TEXT type unless you have no other choice; the way it is processed makes it slower than CHAR or VARCHAR. For fixed-length fields, use CHAR. For variable-length fields, use VARCHAR wherever possible, and set only the maximum length you actually need rather than an arbitrarily large limit, because MySQL stores and processes VARCHAR differently depending on the length range.
3. Time types: Prefer TIMESTAMP where it fits, because it needs only 4 bytes, half of the 8 bytes that DATETIME needed before MySQL 5.6.4 (since then DATETIME takes 5 bytes plus any fractional-seconds storage). For values that only need to be accurate to the day, use DATE, which needs only 3 bytes, less than TIMESTAMP. Storing a Unix timestamp in an INT field is not recommended: it is not intuitive to read, it makes maintenance needlessly troublesome, and it brings no benefit.
4. ENUM & SET: For status fields, consider ENUM, because it greatly reduces storage space, and a new value added at the end of the list does not require rebuilding the table data. For predefined attribute data, consider the SET type: it handles multiple attributes at once with ease while saving a lot of storage space.
5. LOB types: We strongly oppose storing LOB data in the database. Although the database provides this capability, it is not what the database is good at. Each tool should be given the job it does well so that it can reach its full potential. Storing LOB data in a database is like asking a marketing specialist who learned a little Java at school many years ago to write Java code.
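
The numeric-type advice in item 1 can be illustrated with a small Python sketch. The helper names (`smallest_int_type`, `to_scaled_int`, `from_scaled_int`) are hypothetical and exist only for this example; the byte sizes are MySQL's documented storage sizes for its integer types, and the scale of 2 decimal places is an assumption that suits values such as prices.

```python
from decimal import Decimal

# (type name, storage bytes) for MySQL integer types, smallest first
INT_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_int_type(min_value: int, max_value: int) -> str:
    """Return the smallest MySQL integer type that holds [min_value, max_value]."""
    unsigned = min_value >= 0
    for name, size in INT_TYPES:
        bits = size * 8
        low = 0 if unsigned else -(2 ** (bits - 1))
        high = 2 ** bits - 1 if unsigned else 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return f"{name} UNSIGNED" if unsigned else name
    raise ValueError("range does not fit in BIGINT")


def to_scaled_int(amount: str, scale: int = 2) -> int:
    """Convert a decimal string such as '19.99' to an integer count of 10**-scale units."""
    scaled = Decimal(amount).scaleb(scale)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than {scale} decimal places")
    return int(scaled)


def from_scaled_int(value: int, scale: int = 2) -> Decimal:
    """Convert a stored integer back to its decimal value."""
    return Decimal(value).scaleb(-scale)
```

For example, `to_scaled_int("19.99")` gives the integer 1999 to store, `from_scaled_int(1999)` restores `Decimal("19.99")`, and `smallest_int_type(0, 255)` suggests `TINYINT UNSIGNED`. The conversion has to happen consistently in every application that reads or writes the field, which is the maintenance cost to weigh against the space saved.
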
II. Character encoding
The character set directly determines how data is encoded and stored in MySQL. The same content can occupy very different amounts of space under different character sets, so choosing an appropriate character set helps minimize data volume and reduce the number of IO operations.

1. If the content can be expressed entirely in latin1 characters, there is no need to choose a character encoding other than latin1, because latin1 saves a lot of storage space.
2. If you are sure you do not need to store multiple languages, there is no need to use UTF8 or another Unicode character set, which would waste storage space.
3. MySQL settings can be specified down to the individual field (a character set can be declared per field), so when a large database has to store multi-byte data, you can use different character sets and data types for different fields in different tables to greatly reduce the data volume, reduce the number of IO operations, and increase the cache hit rate.
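
To get a feel for how much the encoding matters, the following Python sketch measures the same strings under three encodings. It is only an approximation of what MySQL does: Python's `latin-1`, `gbk`, and `utf-8` codecs stand in for the MySQL character sets of similar names and are not guaranteed to be byte-for-byte identical to them, and `encoded_sizes` is a hypothetical helper written for this example.

```python
def encoded_sizes(text: str, encodings=("latin-1", "gbk", "utf-8")) -> dict:
    """Bytes needed for text in each encoding; None if the encoding cannot represent it."""
    sizes = {}
    for enc in encodings:
        try:
            sizes[enc] = len(text.encode(enc))
        except UnicodeEncodeError:
            sizes[enc] = None
    return sizes


if __name__ == "__main__":
    for sample in ("order status", "数据库"):
        print(sample, encoded_sizes(sample))
```

Plain ASCII text takes one byte per character in all three encodings, while the three Chinese characters take 6 bytes in GBK and 9 bytes in UTF-8 and cannot be stored in latin1 at all. Note also that fixed-length fields may reserve space for the widest possible character in the chosen character set, so the saving in a real table can differ from these raw byte counts.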

III. Proper splitting

Sometimes we want to map a complete object to a single database table. This is convenient for application development, but in some cases it can cause serious performance problems.

When a table contains a large field such as a TEXT or a large VARCHAR, and most accesses to the table do not need that field, we should split it out into a separate table to reduce the storage space occupied by the frequently used data. One obvious benefit is that each data block can hold far more records, which reduces the number of physical IO operations and greatly improves the in-memory cache hit rate.
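
A minimal sketch of such a split is shown below. It uses Python's built-in `sqlite3` module purely as a stand-in so that it runs without a MySQL server; the table and column names (`article`, `article_body`, and so on) are invented for illustration, and the same two-table layout would be declared with ordinary `CREATE TABLE` statements in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hot table: small columns that most queries read.
    CREATE TABLE article (
        id         INTEGER PRIMARY KEY,
        title      VARCHAR(200) NOT NULL,
        author_id  INTEGER NOT NULL
    );
    -- Cold table: the large column, read only when the full article is shown.
    CREATE TABLE article_body (
        article_id INTEGER PRIMARY KEY REFERENCES article(id),
        body       TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO article VALUES (1, 'Table design tips', 42)")
conn.execute("INSERT INTO article_body VALUES (1, 'A long article body ...')")

# Listing pages touch only the narrow table.
titles = conn.execute("SELECT id, title FROM article ORDER BY id").fetchall()

# The detail page joins in the large column on demand.
detail = conn.execute(
    "SELECT a.title, b.body FROM article a "
    "JOIN article_body b ON b.article_id = a.id WHERE a.id = ?",
    (1,),
).fetchone()
```

The listing query never reads the large field, so the rows it scans stay small; the price is one extra join (or a second lookup) on the less frequent detail view.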

The optimizations above all aim to reduce the storage space of each record so that each data block holds more records, which reduces IO operations and raises the cache hit rate. The following recommendation may be hard for many developers to accept, because it is a typical denormalized design and runs counter to the optimizations above.

IV. Moderate redundancy

Why do we need redundancy? Doesn't it increase the size of each record and reduce the number of records that fit in each data block?
Indeed, it does increase the size of each record and reduce the number of records that fit in each data block. However, in some scenarios we still have to do it:

Small independent fields that are referenced frequently but can only be obtained by joining two (or more) large tables
In this scenario, every join is performed only to fetch the value of one small field. Because the join touches a large number of records, it causes a lot of unnecessary IO, which can be avoided by trading space for time. When you add redundancy, however, make sure data consistency is not compromised: the redundant fields must be updated together with the original.
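
One way to keep a redundant field consistent is to update it in the same transaction as its source, as in the sketch below. Again `sqlite3` is only a runnable stand-in for MySQL, and the schema (`users`, `posts`, `user_nickname`) and the `rename_user` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, nickname VARCHAR(50) NOT NULL);
    CREATE TABLE posts (
        id            INTEGER PRIMARY KEY,
        user_id       INTEGER NOT NULL,
        user_nickname VARCHAR(50) NOT NULL,  -- redundant copy of users.nickname
        title         VARCHAR(200) NOT NULL
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO posts VALUES (10, 1, 'alice', 'First post');
""")


def rename_user(conn, user_id, new_nickname):
    """Update the source column and its redundant copy in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE users SET nickname = ? WHERE id = ?", (new_nickname, user_id)
        )
        conn.execute(
            "UPDATE posts SET user_nickname = ? WHERE user_id = ?",
            (new_nickname, user_id),
        )


rename_user(conn, 1, "alice_w")

# The post list no longer needs a join to show the author's nickname.
post_list = conn.execute("SELECT title, user_nickname FROM posts").fetchall()
```

The frequent read becomes a single-table query, while the rare write has to touch both tables; whether that trade is worthwhile depends on how often each happens.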

V. Try to use NOT NULL

NULL is special and makes SQL statements hard to optimize. Unlike Oracle, MySQL does include NULL values in indexes, but in a composite index a nullable field greatly reduces the efficiency of the whole index. In addition, NULL is handled specially within the index and takes up additional storage space.
Many people think NULL saves some space, so they allow NULL wherever possible to save IO, but in most cases this is counterproductive. Although some space may be saved, it brings many other optimization problems and ends up increasing, not reducing, the IO performed by SQL statements. It is therefore a good table design habit to declare fields NOT NULL and give them a non-NULL DEFAULT value.
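
The index and IO costs described above cannot be measured in a few lines, so the following sketch shows only the related point that NULL behaves specially in queries, and what a `NOT NULL DEFAULT` declaration looks like. It runs on `sqlite3` as a stand-in for MySQL; the table names are hypothetical, and the comparison behavior shown is standard SQL three-valued logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_nullable (id INTEGER PRIMARY KEY, status VARCHAR(10));
    CREATE TABLE t_not_null (
        id     INTEGER PRIMARY KEY,
        status VARCHAR(10) NOT NULL DEFAULT ''
    );
    INSERT INTO t_nullable (id, status) VALUES (1, 'done'), (2, NULL), (3, 'open');
    INSERT INTO t_not_null (id, status) VALUES (1, 'done'), (3, 'open');
    INSERT INTO t_not_null (id) VALUES (2);  -- status falls back to the default ''
""")

query = "SELECT id FROM {} WHERE status <> 'done' ORDER BY id"

# Row 2 is silently skipped: NULL <> 'done' evaluates to NULL, not true.
nullable_ids = [row[0] for row in conn.execute(query.format("t_nullable"))]

# With NOT NULL DEFAULT '', row 2 is returned like any other row.
not_null_ids = [row[0] for row in conn.execute(query.format("t_not_null"))]
```

With the nullable field, every query has to remember an extra `OR status IS NULL` condition to see all rows; with a non-NULL default, ordinary comparisons are enough.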
