MySQL Optimization Overview

Source: Internet
Author: User
Tags: table definition

1. Optimize SQL

1) Use SHOW STATUS to see how often each kind of SQL statement is executed.

SHOW STATUS LIKE 'Com_%';
Com_select and Com_insert report how many times SELECT and INSERT statements have been executed.
2) Analyze inefficient SQL statements with EXPLAIN.
3) Create appropriate indexes.
4) Use SHOW STATUS LIKE 'Handler_%' to check index usage.
Handler_read_key: the number of requests to read a row based on an index value. A large value means your queries and tables are well indexed.
Handler_read_rnd: the number of requests to read a row based on a fixed position. This value is high if you run many queries that require sorting the result, do many full table scans, or have joins that do not use indexes properly.

Handler_read_rnd_next: the number of requests to read the next row in the data file. This value is high if you do many table scans; it usually means your tables are not properly indexed or your queries do not use the indexed columns.
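The checks in steps 1) through 4) can be sketched as a short session; the orders table and its query are illustrative, not from the original:

```sql
-- How often each statement type runs (Com_select, Com_insert, ...)
SHOW STATUS LIKE 'Com_%';

-- Inspect the execution plan of a suspect query
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- See how rows are being read: by index (Handler_read_key)
-- or by scan (Handler_read_rnd_next)
SHOW STATUS LIKE 'Handler_read%';
```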

5) Regularly run ANALYZE TABLE and CHECK TABLE:
ANALYZE TABLE test_table; CHECK TABLE test_table;
Check that the Msg_text column in the output reads OK.
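The two maintenance commands from step 5), with the result columns they return (test_table is the placeholder name used above):

```sql
ANALYZE TABLE test_table;  -- refreshes key distribution statistics
CHECK TABLE test_table;    -- verifies table integrity
-- Each returns a row of (Table, Op, Msg_type, Msg_text);
-- Msg_text should read 'OK'.
```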
6) Regularly run OPTIMIZE TABLE test_table.
OPTIMIZE TABLE is useful when many changes have been made to variable-length columns such as VARCHAR, BLOB, and TEXT.

In most setups you do not need to run OPTIMIZE TABLE at all. Even after a large number of updates to variable-length rows, running it once a week or once a month on the affected tables is enough.
OPTIMIZE TABLE works only on MyISAM, BDB, and InnoDB tables.
For MyISAM tables, OPTIMIZE TABLE does the following:
If the table has deleted or split rows, it repairs the table.
If the index pages are not sorted, it sorts them.
If the table's statistics are not up to date (and the repair could not be done by sorting the index), it updates them.
7) Optimize ORDER BY, GROUP BY, and similar clauses.

For details, see the SQL optimization overview.

2. Optimize database objects

1) Select a suitable storage engine for the table:

MyISAM: the application is mostly reads and inserts, with only a few updates and deletes, and transaction integrity and concurrency requirements are not high.

InnoDB: transaction processing and data consistency under concurrency. Besides inserts and queries there are many updates and deletes (InnoDB effectively reduces the locking caused by deletes and updates). For transactional InnoDB tables, the main cause of slowness is that AUTOCOMMIT is on by default and the program does not explicitly call BEGIN to start a transaction, so every inserted row is committed individually, which seriously hurts speed. Calling BEGIN before the SQL statements groups them into one transaction (even with autocommit enabled) and greatly improves performance.
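A minimal sketch of the BEGIN/COMMIT batching described above (the log_entries table is hypothetical):

```sql
BEGIN;                          -- overrides autocommit for this batch
INSERT INTO log_entries (msg) VALUES ('a');
INSERT INTO log_entries (msg) VALUES ('b');
INSERT INTO log_entries (msg) VALUES ('c');
COMMIT;                         -- one commit instead of three
```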

Memory: data is kept in RAM for fast access. Suitable only when the table is not too large and the data does not need to survive an abnormal MySQL shutdown.

Merge: a collection of identical MyISAM tables accessed as a single table; useful for very large tables such as logs or archives.
2) Choose appropriate data types for table columns:

Principle: smaller is usually better, and simpler is better. Give every column a default value and avoid NULL where possible:

For example, when designing tables, use the smallest integer type that fits, to save disk space (MEDIUMINT is often a better fit than INT).

For example, for time columns: DATETIME occupies 8 bytes while TIMESTAMP occupies only 4, half as much. TIMESTAMP's narrower range (1970 to 2038) makes it well suited for recording update times.

MySQL can handle large data volumes, but in general the smaller the table, the faster queries against it run. So when creating a table, make column widths as small as the data allows. For example, defining a zip code column as CHAR(255) clearly wastes space, and even VARCHAR is unnecessary, because CHAR(6) does the job. Likewise, use MEDIUMINT instead of BIGINT for integer columns where the range permits.
Another way to improve efficiency is to declare columns NOT NULL whenever possible, so the database never has to compare NULL values during queries.
Some text columns, such as "province" or "gender", can be defined as ENUM. MySQL treats ENUM values as numeric data, and numeric data is processed much faster than text, so this improves performance.
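A table sketch applying these sizing rules; the column choices are illustrative, not from the original:

```sql
CREATE TABLE customerinfo (
  id       MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT, -- smallest integer type that fits
  name     VARCHAR(50)   NOT NULL DEFAULT '',
  zip      CHAR(6)       NOT NULL DEFAULT '',          -- fixed width, not CHAR(255)
  gender   ENUM('M','F') NOT NULL DEFAULT 'M',         -- stored internally as a number
  updated  TIMESTAMP     NOT NULL,                     -- 4 bytes vs DATETIME's 8
  PRIMARY KEY (id)
);
```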

3) String data types: CHAR, VARCHAR, and TEXT
(1) Length differences
CHAR holds 0 to 255 characters. VARCHAR can hold up to 64 KB, but that limit applies to the entire row: other columns count against it, and a nullable column costs an extra byte. The effective length also depends on the character set; with utf8 the maximum is 21,845 characters, minus what other columns use. VARCHAR is usually enough; for very large text use TEXT, whose largest variant (LONGTEXT) holds up to 4 GB.

(2) Efficiency: roughly CHAR > VARCHAR > TEXT, but if you use the InnoDB engine, VARCHAR is recommended instead.

(3) Default values: CHAR and VARCHAR columns can have default values; TEXT columns cannot.

4) Floating-point data types in MySQL

(1) FLOAT and DOUBLE columns can be compared with numeric values, but equality (=) comparisons should be avoided: floating-point precision limits can introduce small errors.

(2) Precision-sensitive data, such as currency, should be represented and stored in fixed-point types (DECIMAL).

Choosing appropriate data types has a measurable impact on performance. Two notes: for integer columns that never hold negative values, add UNSIGNED; and for columns that frequently appear in WHERE clauses, consider adding an index; integer data is especially well suited for indexing.
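A sketch of both points above: fixed-point storage for currency, and tolerance-based comparison instead of = on floats (the account and measurements tables are hypothetical):

```sql
-- DECIMAL stores exact fixed-point values; use it for money
CREATE TABLE account (
  id      INT UNSIGNED  NOT NULL,
  balance DECIMAL(10,2) NOT NULL DEFAULT 0.00
);

-- Avoid = on FLOAT/DOUBLE; compare within a tolerance instead
SELECT * FROM measurements WHERE ABS(reading - 1.1) < 0.0001;
```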

5) In InnoDB table design, pay attention to the following points:

1. Explicitly define an auto-increment INT column as the primary key. Use this column only as the primary key, for no other purpose.
2. If no primary key is explicitly defined, InnoDB may have to reorder new rows on insert, seriously hurting performance.
3. Do not update or modify primary key values; changes to the primary key fragment data storage and reduce I/O performance.
4. If a key column must be updatable, make it a unique index instead, and add an auto-increment column with no business meaning as the primary key.
5. Keep the primary key type as small as possible: use SMALLINT rather than INT when it fits, and INT rather than BIGINT.
6. Put the primary key column first in the table definition.
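A sketch following the six guidelines above (table and column names are illustrative):

```sql
CREATE TABLE orders (
  order_id INT UNSIGNED  NOT NULL AUTO_INCREMENT, -- surrogate key, no business meaning
  order_no CHAR(12)      NOT NULL,                -- business identifier: unique, but not the PK
  amount   DECIMAL(10,2) NOT NULL DEFAULT 0.00,
  PRIMARY KEY (order_id),                         -- first column, never updated
  UNIQUE KEY uk_order_no (order_no)
) ENGINE=InnoDB;
```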

3. Optimize Indexes

Indexing is a common way to improve database performance. An index lets the server retrieve specific rows far faster than without one, and the improvement is especially noticeable when a query involves MAX(), MIN(), or ORDER BY.

Which columns should be indexed? In general, indexes should be built on columns used in joins, in WHERE conditions, and in ORDER BY sorting. Try not to index columns that contain many duplicate values. An ENUM column such as "province" in a customerinfo table is likely to hold many duplicates; an index on such a column does not help and may even reduce performance. Appropriate indexes can be created when the table is created, or added later with ALTER TABLE or CREATE INDEX.

1). Common Index

The only task of a common index (defined with the KEY or INDEX keyword) is to speed up data access, so it should be created on columns that appear in query conditions (WHERE column = ...) or in ORDER BY column clauses. Where possible, choose the most compact data columns (such as integer columns) for indexing.

2). Unique Index

Normal indexes allow the indexed column to contain duplicate values. For example, because people can share a name, the same name may appear two or more times in an "employee profile" table.
If you can be sure a column will only ever contain distinct values, define its index with the UNIQUE keyword. This has two benefits: it simplifies MySQL's management of the index, making it more efficient; and when a new record is inserted, MySQL automatically checks whether that column's value already exists in another record and rejects the insert if it does. In other words, a unique index guarantees the uniqueness of the data. In many cases the point of a unique index is not to speed up access but to prevent duplicate data.

3). Primary Index

As stressed repeatedly, an index must be created on the primary key column; this index is called the "primary index". The only difference between a primary index and a unique index is the keyword used to define it: PRIMARY rather than UNIQUE.

4). Foreign Key Index

If a foreign key constraint is defined on a column, MySQL defines an internal index on it to manage and enforce the constraint efficiently.

5). Composite Index

An index can cover multiple columns, such as INDEX(columnA, columnB). A useful property of such an index is that MySQL can use it selectively: a query that only needs an index on columnA can still use the composite INDEX(columnA, columnB). However, this works only for leftmost prefixes of the column list: INDEX(A, B, C) can serve as an index on A or on (A, B), but not as an index on B, on C, or on (B, C).
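The leftmost-prefix rule, sketched (table t and columns a, b, c are hypothetical):

```sql
CREATE INDEX idx_abc ON t (a, b, c);

-- Can use idx_abc (leftmost prefix present):
SELECT * FROM t WHERE a = 1;
SELECT * FROM t WHERE a = 1 AND b = 2;

-- Cannot use idx_abc (leftmost column a missing):
SELECT * FROM t WHERE b = 2;
SELECT * FROM t WHERE b = 2 AND c = 3;
```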

4. Table lock problems

The most important performance-related difference between MyISAM and InnoDB is their locking: MyISAM uses table locks, while InnoDB implements row locks.

1) MyISAM uses table-level locks.

With MyISAM, a write request acquires the lock first, and read lock requests are pushed back in the queue. Moreover, even if a read request reaches the lock wait queue first, a write request arriving later is inserted ahead of it, because MySQL considers write requests generally more important than reads.
Under heavy update load it therefore becomes hard to acquire a read lock, which causes blocking.
For this reason MyISAM is not well suited to update-heavy workloads.

2) InnoDB row locks are implemented through indexes. If data is not accessed through an index, InnoDB falls back to locking the whole table.

 

Table-level locks are more suitable for query-oriented applications where only a small amount of data is updated based on index conditions.

Row-level locks are more suitable for a large number of concurrent updates to a small amount of different data based on index conditions and concurrent queries. Because only the row to be operated is locked, multiple threads can operate on different rows at the same time (as long as other threads do not operate on the rows already locked ).
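A sketch of how InnoDB row locks depend on indexes (table t and its columns are hypothetical; assume an index on id and none on status):

```sql
BEGIN;
-- Uses the index on id: only the matching row is locked
SELECT * FROM t WHERE id = 5 FOR UPDATE;

-- No usable index on status: InnoDB must scan, and in effect
-- locks the whole table, as described above
SELECT * FROM t WHERE status = 'new' FOR UPDATE;
COMMIT;
```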

5. MySQL server Configuration Optimization

1) Use SHOW VARIABLES to inspect server parameters.
2) Use SHOW STATUS to check the server's running state, such as lock waits and the current number of connections.
3) Important parameters affecting MySQL performance:

key_buffer_size sets the size of the index block cache; it is the single parameter with the greatest impact on MyISAM table performance.

For example:

mysql> SHOW GLOBAL STATUS LIKE 'Key_read%';
+ ------------------- + ------------ +
| Variable_name | Value |
+ ------------------- + ------------ +
| Key_read_requests | 3465117712 |
| Key_reads | 624 |
+ ------------------- + ------------ +

Key_read_requests: the number of requests to read an index block from the cache.
Key_reads: the number of index block reads that had to go to disk.

The ratio Key_read_requests/Key_reads should generally be as large as possible; equivalently, the miss rate Key_reads/Key_read_requests should be very small. If too many index reads hit the disk, increase key_buffer_size.
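Using the sample numbers above, the key-cache miss rate can be computed directly; the 1% threshold is a common rule of thumb, not from the original:

```python
# Sample values from the SHOW GLOBAL STATUS output above
key_read_requests = 3_465_117_712  # index reads served from the key cache
key_reads = 624                    # index reads that had to hit the disk

miss_rate = key_reads / key_read_requests
print(f"miss rate: {miss_rate:.2e}")  # ~1.80e-07, far below the ~1% rule of thumb

# If miss_rate creeps above roughly 0.01, consider raising key_buffer_size.
```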


table_cache sets the number of open table handles cached by the server. Every connection opens at least one table cache entry, so table_cache is related to max_connections. For example, with 200 concurrent connections the table cache should be at least 200 * N, where N is the maximum number of tables joined in a single query.
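The sizing rule above, as arithmetic (the join width N = 4 is an assumed example, not from the original):

```python
max_connections = 200    # concurrent connections, from the example above
n_tables_per_join = 4    # assumed maximum number of tables joined in one query

table_cache_min = max_connections * n_tables_per_join
print(table_cache_min)   # 800
```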

4) innodb_buffer_pool_size and other innodb parameter settings

6. Disk IO Optimization

Disk I/O optimization is a major focus of database optimization. By the bucket principle, overall capacity is determined by the shortest stave, and in a database system that shortest stave is usually the disk, the weakest hardware component. Much of the time we find the system's I/O saturated while the CPU sits idle waiting, because I/O response times are long and read/write speeds lag far behind CPU processing speed. We therefore try to move operations into memory, shifting the bottleneck from disk-versus-CPU to memory-versus-CPU; but the weakness of disk I/O cannot be avoided entirely, and optimizing it is necessary.

Disk seeks are a huge performance bottleneck, and the problem becomes more obvious once the data grows so large that caching is no longer effective. For large databases, where data access is more or less random, you can assume a read needs at least one disk seek and a write needs several. To minimize the problem, use disks with low seek times.

1) Use a disk array: RAID (Redundant Array of Inexpensive Disks)

RAID distributes data across several physical disks according to a chosen policy, which improves both storage reliability and read/write performance, depending on the RAID level:
1) Frequent reads and writes with high reliability requirements: RAID 10 is best.
2) Frequent reads, few writes, with reliability requirements: choose RAID 5.
3) Frequent reads and writes but low reliability requirements: RAID 0 can be used.


2) use Symbolic Links to distribute I/O

By default, MySQL stores databases and tables in the directory defined by datadir. Without RAID or logical volumes, all data sits on one disk device and cannot benefit from parallel reads and writes across multiple disks.
Databases and tables can be moved out of the data directory and replaced with symbolic links pointing to their new locations. The recommended approach is to symlink whole databases to different disks; symlinking individual tables should be a last resort.
To relocate a database via a symbolic link, first create a directory on a disk with free space, then create a symbolic link to it from the MySQL data directory.

For example:

$ mkdir /dr1/databases/test
$ ln -s /dr1/databases/test /path/to/datadir

Note: only MyISAM tables fully support symbolic links. For other table types, you may run into strange problems if you try to symlink their files at the operating-system level.
Symbolic links for MyISAM tables are handled as follows:
1. In the data directory there is always a table definition file, a data file, and an index file. The data file and index file can be moved elsewhere and replaced in the data directory by symbolic links; the table definition file cannot.
2. The data file and index file can be symlinked to different directories independently.
3. If mysqld is not running, symbolic links can be created manually with ln -s from the command line. Alternatively, a running MySQL server can be instructed to create them by using the DATA DIRECTORY and INDEX DIRECTORY table options when creating the table.
4. myisamchk does not replace a symbolic link with the data or index file; it works directly on the file the link points to. Any temporary files are created in the directory where the actual data or index file resides.
5. Note: when you drop a table that uses symbolic links, both the link and the file it points to are deleted. This is a reason not to run mysqld as the system root user, and not to give other system users write access to the MySQL database directories.
6. If you rename a table with ALTER TABLE ... RENAME without moving it to another database, the symlinks in the database directory are renamed to the new name, along with the data file and index file.
7. If you use ALTER TABLE ... RENAME to move a table to another database, the table is moved to the other database directory; the old symlinks and the files they pointed to are deleted. In other words, the new table is no longer symlinked.
8. If you do not use symbolic links, start mysqld with the --skip-symbolic-links option so that nobody can use mysqld to delete or rename files outside the data directory.

Operations not supported for symlinked tables:
1. ALTER TABLE ignores the DATA DIRECTORY and INDEX DIRECTORY table options.
2. BACKUP TABLE and RESTORE TABLE do not take symbolic links into account.
3. The .frm file must never be a symbolic link (as noted above, only data and index files may be). Attempting this (for example, creating such a symlink by hand) produces incorrect results.

3) Prevent the operating system from updating file atime attributes (for example, by mounting the filesystem with the noatime option).

7. Application Optimization

1) Use a connection pool
Establishing a database connection is expensive, so building a "connection pool" is worthwhile to improve access performance. Think of connections as pooled resources: code that previously connected to the database directly instead borrows an established connection from the pool and returns it once the result has been processed.
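The idea can be sketched with a tiny generic pool; a real application would use a library pool, and the connection factory here is a stand-in for an expensive connect() call:

```python
import queue

class ConnectionPool:
    """Keep a fixed set of pre-built connections and hand them out on demand."""

    def __init__(self, factory, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # build all connections once, up front

    def acquire(self, timeout=None):
        # Borrow a connection; blocks if all are currently in use
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for reuse instead of closing it
        self._pool.put(conn)

# Stand-in factory; a real one would open a database connection
pool = ConnectionPool(factory=lambda: object(), size=3)
conn = pool.acquire()
# ... run queries on conn ...
pool.release(conn)
```

The key property is that connections are created once and recycled, so the per-request cost drops to a queue operation.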

2) Reduce access to MySQL by adding a cache layer such as memcached.

3) Load balancing: replicate and distribute query load
Use MySQL master-slave replication to separate update and query traffic:
1) Create a replication account: GRANT REPLICATION SLAVE ON *.* TO 'rel'@'10.0.1.2' IDENTIFIED BY '123'
2) On the master, edit my.cnf to enable the binlog and set a server-id.
3) Take a consistent snapshot of the master's data onto the slave, so that replication starts from identical data; otherwise problems will occur.
4) On the slave, edit my.cnf:
server-id = 2
master-host = 10.0.1.3
master-user = 'rel'
master-password = '000000'
master-port = '000000'

5) Start the replication thread on the slave with START SLAVE, and verify with SHOW PROCESSLIST.
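The master-side and slave-side commands from the steps above, as one sketch; the account, host, and old-style GRANT ... IDENTIFIED BY syntax follow the original's example:

```sql
-- On the master (step 1):
GRANT REPLICATION SLAVE ON *.* TO 'rel'@'10.0.1.2' IDENTIFIED BY '123';

-- On the slave, once my.cnf is configured (step 5):
START SLAVE;
SHOW PROCESSLIST;   -- the replication threads should be visible
```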

4) A distributed, clustered database architecture

8. Split the table

1) Horizontal splitting
If a table holds too much data, or is expected to grow past millions of rows, split it into multiple tables.
This requires a table-splitting strategy, for example:
A log table can be split by week or by month.
A user-information table can be split by a hash of the user ID.
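A minimal routing function for the user-ID hash split described above (the shard count and table naming convention are assumptions):

```python
N_SHARDS = 4  # assumed number of user tables: user_0 .. user_3

def user_table(user_id: int) -> str:
    """Route a user ID to its shard table by hashing (modulo) the ID."""
    return f"user_{user_id % N_SHARDS}"

print(user_table(7))   # user_3
print(user_table(12))  # user_0
```

The same function must be used by every writer and reader so that a given user always maps to the same table.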

2) Vertical splitting
Sometimes a table has few rows, perhaps only twenty or thirty thousand, but very long columns, so it occupies a lot of space and retrieval requires heavy I/O, seriously reducing performance. In this case, split the large columns out into a separate table that has a one-to-one relationship with the original.
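A sketch of such a vertical split (table and column names are illustrative): the long field moves to a side table that shares the primary key:

```sql
-- Hot table: small rows, scanned often
CREATE TABLE article (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  title  VARCHAR(100) NOT NULL,
  author VARCHAR(50)  NOT NULL,
  PRIMARY KEY (id)
);

-- Cold table: the large column, fetched only when needed (1:1 with article)
CREATE TABLE article_body (
  id   INT UNSIGNED NOT NULL,  -- same value as article.id
  body MEDIUMTEXT   NOT NULL,
  PRIMARY KEY (id)
);
```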
