Slimming down your data makes a MySQL database run faster

Source: Internet
Author: User

In database optimization, one of the most common and effective techniques is to minimize the size of the data, and with it the disk space that tables occupy. With less data to move, disk reads and writes are faster, and processing a small table during a query consumes fewer system resources. Likewise, an index on a smaller column occupies fewer resources. So how can a database administrator put their data on a diet? Here are my suggestions.

Recommendation 1: NULL values are not necessarily free

Let me clear up a misconception first. Some database administrators believe that NULL values consume no system resources. This is wrong. Because of this belief, they tend not to declare columns NOT NULL during database design, leaving users free to enter data or not as they see fit. I believe this approach hurts database performance.

In my opinion, columns should be declared NOT NULL wherever possible, so that NULL values are not allowed. This speeds up subsequent processing, and in terms of storage it saves one bit per column, which helps slim the data down. In practice, if a value is sometimes not entered, give the column a default value so that it still never holds NULL. For example, in a payroll system, an employee's years of service can default to 0 rather than being left empty. Of course, if NULL is genuinely required there is no way around it, but as a database engineer you should avoid NULL values whenever you can.
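The "one bit per column" cost can be estimated with a few lines of Python. This is a rough sketch, not a measurement: it assumes the storage engine packs one NULL flag bit per nullable column into whole bytes per row (as MyISAM and InnoDB's compact row format do), and the function names are hypothetical.

```python
import math


def null_flag_bytes(nullable_columns: int) -> int:
    """Bytes per row spent on NULL flags: one bit per nullable column,
    rounded up to a whole byte (assumed packing rule)."""
    if nullable_columns < 0:
        raise ValueError("nullable_columns must not be negative")
    return math.ceil(nullable_columns / 8)


def null_flag_overhead(nullable_columns: int, row_count: int) -> int:
    """Total bytes spent on NULL flags across all rows of a table."""
    return null_flag_bytes(nullable_columns) * row_count


# A table with 9 nullable columns needs 2 flag bytes in every row;
# declaring all 9 columns NOT NULL brings that down to 0.
overhead_with_nulls = null_flag_overhead(9, 1_000_000)
overhead_without_nulls = null_flag_overhead(0, 1_000_000)
```

The space saved is small per row, so the larger benefit is usually the simpler processing of NOT NULL columns that the text describes.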

Recommendation 2: Use data types as small as possible

The size of a data type also affects the size of the table. For example, MEDIUMINT and INT can both store integers; they differ only in the range of values they can hold. In terms of storage, however, MEDIUMINT needs 3 bytes against 4 bytes for INT, that is 25% less space. So do not use INT where MEDIUMINT is enough.
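As an illustration of the choice between integer types, the sketch below picks the smallest MySQL integer type whose range covers a given minimum and maximum value. The byte sizes are MySQL's documented storage sizes; the helper names are hypothetical, and the rule "use UNSIGNED whenever the minimum is not negative" is an assumption made for the example.

```python
# MySQL integer types with their storage size in bytes, smallest first.
INTEGER_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_integer_type(min_value: int, max_value: int) -> str:
    """Return the smallest integer type that can hold the whole range."""
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    unsigned = min_value >= 0
    for name, size in INTEGER_TYPES:
        bits = size * 8
        if unsigned:
            low, high = 0, 2**bits - 1
        else:
            low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return f"{name} UNSIGNED" if unsigned else name
    raise ValueError("range does not fit in BIGINT")


def relative_saving(smaller_bytes: int, larger_bytes: int) -> float:
    """Fraction of storage saved by using the smaller type."""
    return 1 - smaller_bytes / larger_bytes
```

For instance, `smallest_integer_type(0, 100000)` selects `MEDIUMINT UNSIGNED`, and `relative_saving(3, 4)` gives the 25% figure quoted above.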

In addition, when declaring a column length, make it as short as the requirements allow. For example, a salary assessment system has an employee code field. If the company's employee codes are known to consist of exactly five characters, define the column with a length of 5. This not only reduces storage space but also gives a degree of data validation: when a user enters a code longer than 5 characters, the value is rejected (in strict SQL mode; otherwise MySQL truncates it with a warning).

Many data types can hold a given piece of data, and generous lengths can always be declared. Choosing the smallest data type that fits, however, reduces storage space and slims the data down, which in turn improves database performance.

Recommendation 3: Relationship between indexes and data table size

As I mentioned at the beginning of this article, an index on a small column occupies fewer resources. Indexes are thus closely tied to table size, and setting the right index in the right place at the right time also helps keep the amount of data down.

A table may have several indexes, but it has only one primary key. The primary key of each table should therefore be as short as possible, which helps the database identify each row quickly. For InnoDB tables, the primary key columns are also copied into every secondary index entry, so a short primary key saves space in each of them.

Index a prefix wherever possible. Suppose you need an index on a character column whose values are already unique (or nearly so) within their first few characters. In that case it is better to index only that prefix rather than the whole column. MySQL allows an index to be created on the leftmost part of a character column. Conceptually, the column is split into two parts: if the leading part is enough to tell the values apart, only that part needs to be indexed, not the entire column. This reduces the resources the index occupies and so helps slim the data down. Shorter indexes are also faster to query: they take less disk space, and more of their entries fit in the index cache, so there are more cache hits. That means fewer disk seeks and better query efficiency.
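One way to choose a prefix length is to look for the shortest prefix that distinguishes values as well as the full column does. The following is a minimal sketch of that idea run on an in-memory sample; the function names and sample codes are made up for illustration, and on a real table the same counts would come from the database itself.

```python
def prefix_selectivity(values: list[str], length: int) -> float:
    """Fraction of rows that the first `length` characters tell apart."""
    if not values:
        return 0.0
    return len({value[:length] for value in values}) / len(values)


def shortest_distinguishing_prefix(values: list[str]) -> int:
    """Smallest prefix length that yields as many distinct prefixes
    as there are distinct full values."""
    distinct_full = len(set(values))
    max_length = max((len(value) for value in values), default=0)
    for length in range(1, max_length + 1):
        if len({value[:length] for value in values}) == distinct_full:
            return length
    return max_length


# Hypothetical employee codes: the first 6 characters are already unique.
codes = ["SH0001-wang", "BJ0002-li", "GZ0003-zhao", "SH0004-chen"]
prefix_length = shortest_distinguishing_prefix(codes)
```

With the sample above, `prefix_length` is 6, so an index on the first six characters would be enough. In MySQL such an index is declared by giving the length after the column name in the index definition, for example `code(6)` for a hypothetical `code` column.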

Note that indexes must not be overused. Indexes do improve query processing, but every index carries a cost in storage and in maintenance on each write. An index improves database performance only when its benefit exceeds this overhead; otherwise it has the opposite effect. For example, if a table needs fast writes and carries too many indexes, the indexes become a burden. My suggestion here is that if you mainly access a table by searching on a combination of columns, create a single index on them, leading with the column used most often in daily work. If you have no choice but to use several indexes, prefer columns with more duplicate values so that the index compresses better. This reduces the resource cost of having multiple indexes.

Recommendation 4: Do not economize where "fullness" is needed

Be slim where slimness helps and full where fullness is needed. Saving hard disk space is good, but data should not be trimmed purely for the sake of slimming. Sometimes that is counterproductive.

Take VARCHAR as an example. For a MyISAM table, if variable-length columns (such as VARCHAR) are not strictly needed, it is best to use fixed-size data types throughout, so that the table uses the fixed-length row format. Fixed-length types do tend to waste some storage: when a user enters a value shorter than the declared length, it is still stored at the full length. Even so, a fixed length can be the better choice, because although some hard disk space is wasted, queries run faster.
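The space side of this trade-off can be put into numbers. The sketch below assumes a single-byte character set, so one character is one byte, and uses MySQL's rule that a VARCHAR value carries a 1-byte length prefix when the column can hold at most 255 bytes and a 2-byte prefix otherwise. The function names are hypothetical, and the speed benefit of fixed-length rows is not modeled here.

```python
def char_bytes(declared_length: int) -> int:
    """CHAR(n) is always stored padded to n characters (n bytes here)."""
    return declared_length


def varchar_bytes(declared_length: int, actual_length: int) -> int:
    """VARCHAR(n) stores a length prefix plus only the characters used."""
    if actual_length > declared_length:
        raise ValueError("value is longer than the declared column length")
    length_prefix = 1 if declared_length <= 255 else 2
    return length_prefix + actual_length


# A 3-character value in a 5-character column:
wasted_by_char = char_bytes(5) - varchar_bytes(5, 3)
```

Here `wasted_by_char` is 1 byte. When every value fills the column completely, VARCHAR is actually the larger of the two because of its length prefix, which is one more reason fixed-length columns suit data such as fixed-size codes.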

As you can see, slimming the data does not improve database performance in every situation. It is like saving money: economize where it counts, otherwise you not only fail to save but end up shooting yourself in the foot. Put simply, be slim where slimness helps and full where fullness is needed. Keep that in mind.

Recommendation 5: Split the table to achieve the goal of losing weight

When ants move food and a piece is too large to carry, they break it into pieces small enough to move. This is the principle of dividing the cake, and the same situation comes up often in daily work. For example, when a table holds a very large number of records, operations on it become slow. In that case the table can be divided into several tables according to some rule. Take employee attendance records: querying, sorting, and aggregating the full table takes a long time. You can split it into separate tables by department and then analyze each one. The amount of work increases, but processing is much faster.
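The department split can be pictured with a small in-memory example. This is only a sketch of the idea of dividing rows by a rule: the field names and sample rows are invented, and in a real database the split would be done with separate tables or partitions rather than Python dictionaries.

```python
from collections import defaultdict


def split_by_key(rows: list[dict], key: str) -> dict:
    """Group rows into one list per distinct value of `key`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


# Hypothetical attendance records for illustration only.
attendance = [
    {"employee": "E0001", "department": "sales", "days_present": 21},
    {"employee": "E0002", "department": "finance", "days_present": 20},
    {"employee": "E0003", "department": "sales", "days_present": 19},
]

by_department = split_by_key(attendance, "department")

# Each department can now be analyzed without touching the other rows.
sales_days = sum(row["days_present"] for row in by_department["sales"])
```

Every query that concerns a single department now scans only that department's rows, which is where the speed-up comes from; the price is the extra work of combining results when a report spans departments.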

Following the same principle, splitting a frequently scanned large table into two or more tables is very helpful in database optimization. For example, in my daily work I have a dynamic-format table that is scanned often. Splitting off a smaller static-format table that can be used to find the relevant rows makes those scans cheaper.

Splitting a table turns one big cake into several small cakes, which makes later statistics and analysis easier. Of course, the benefit depends directly on the splitting rule. How to split tables to get the desired result is another big topic, and for reasons of space I will not go into it here. I may take it up in detail in a later article.

Source: http://publish.itpub.net/a2011/0302/1161/000001161945.shtml
