21 Best Practices for MySQL Performance Optimization and MySQL Index Usage

Last Update:2016-04-07 Source: Internet

Author: User

1. Optimizing queries for query caching

When the same query is executed many times, its result is kept in the query cache, so that subsequent identical queries are answered from the cache without touching the table.

2. EXPLAIN your SELECT queries

Use the EXPLAIN keyword to see how MySQL handles an SQL statement. This helps you find performance bottlenecks in the statement or in the table structure.

The EXPLAIN output also shows how indexes and primary keys are used, how the tables are searched and sorted, and so on.

3. Use LIMIT 1 when only one row of data is used

Sometimes you already know that a query will return a single row: you may be fetching one record through a cursor, or only checking whether any matching record exists.

In this case, adding LIMIT 1 can improve performance. The MySQL engine stops searching after it finds the first matching row, instead of continuing to look for further matches.
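The existence check described above can be sketched as follows. This is a minimal illustration, not the original author's code: it uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in so that it runs without a server, and the `users` table, its `country` column and the `has_user_from` helper are hypothetical. The `LIMIT 1` query text itself is the same in MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO users (country) VALUES (?)",
    [("China",), ("France",), ("China",), ("Brazil",)],
)


def has_user_from(conn, country):
    """Return True if at least one user from `country` exists."""
    # LIMIT 1 lets the engine stop at the first matching row.
    row = conn.execute(
        "SELECT 1 FROM users WHERE country = ? LIMIT 1", (country,)
    ).fetchone()
    return row is not None


print(has_user_from(conn, "China"))  # a matching row exists
print(has_user_from(conn, "Japan"))  # no matching row
```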

4. Index the fields you search on

Indexes are not only for primary keys or unique fields. If your table has a field that you always search on, index it.

5. Use columns of the same type for joins, and index them

If your application has many join queries, make sure that the join columns in both tables are indexed. This lets MySQL apply its internal join optimizations.

Also, the columns used for the join should be of the same type.

6. Never ORDER BY RAND()

Want to shuffle the returned rows, or pick a row at random? I don't know who invented this usage, but many novices like it, without realizing how serious the performance problem is.

If you really need to shuffle the rows you return, there are many other ways to do it. With ORDER BY RAND(), MySQL has to call RAND() for every row and then sort the whole result, so the cost grows with the size of the table and database performance degrades badly.
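One of those other ways is to count the rows, pick a random offset in the application, and fetch a single row at that offset. The sketch below is only an illustration under stated assumptions: in-memory SQLite (via Python's sqlite3 module) stands in for MySQL, and the `users` table and the `pick_random_user` helper are hypothetical.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL; the table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("user%d" % i,) for i in range(100)],
)


def pick_random_user(conn, rng=random):
    """Pick one row at a random offset instead of sorting the whole table."""
    (total,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    if total == 0:
        return None
    offset = rng.randrange(total)  # 0 <= offset < total
    return conn.execute(
        "SELECT id, username FROM users LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()


print(pick_random_user(conn))
```

A large offset still makes the engine skip rows, so this is not free, but it avoids generating a random number for every row and sorting the entire table.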

7. Avoid SELECT *

The more data you read from the database, the slower the query becomes. And, if your database server and Web server are two separate servers, this also increases the load on the network transport.

So, make it a habit to select only the columns you need.

8. Always set an ID for each table

Every table in the database should have an ID column as its primary key, preferably an INT (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.

Even if your users table has a unique field such as "email", do not make it the primary key. Using a VARCHAR column as the primary key degrades performance. In addition, your program should use the table's ID to build its data structures.

9. Use ENUM instead of VARCHAR

The ENUM type is very fast and compact. Internally it is stored like a TINYINT, but it is displayed as a string. This makes it a very good fit for a fixed list of options.

If you have a field such as "gender", "country", "nationality", "state" or "department", whose values are limited and fixed, use ENUM instead of VARCHAR.

10. Get recommendations from PROCEDURE ANALYSE()

PROCEDURE ANALYSE() lets MySQL analyze your fields and their actual data and give you some useful advice. These suggestions only become useful once the table holds real data, because the recommendations are based on that data. Note that PROCEDURE ANALYSE() was removed in MySQL 8.0.

11. Use NOT NULL where possible

Unless you have a specific reason to use NULL values, you should always declare your fields NOT NULL.

12. Prepared statements

Prepared statements are much like stored procedures: a collection of SQL statements that runs in the background. Using them brings many benefits, for both performance and security.
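The security benefit comes from sending values separately from the SQL text. The sketch below illustrates this with parameter binding in Python's built-in sqlite3 module, used here only as a runnable stand-in for a MySQL driver; the `users` table and the `users_in_state` helper are hypothetical, and the placeholder syntax (`?` here) differs between drivers.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO users (state) VALUES (?)", [("CA",), ("NY",), ("CA",)]
)


def users_in_state(conn, state):
    """Return the ids of users in `state`, binding the value as a parameter."""
    # The value travels separately from the SQL text, so it is never parsed as SQL.
    rows = conn.execute(
        "SELECT id FROM users WHERE state = ? ORDER BY id", (state,)
    )
    return [r[0] for r in rows]


print(users_in_state(conn, "CA"))
# An injection attempt is compared as a plain string and matches nothing.
print(users_in_state(conn, "CA' OR '1'='1"))
```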

13. Unbuffered queries

Normally, when you execute an SQL statement in your script, the program waits there until the statement has returned its result, and only then continues. You can use unbuffered queries to change this behavior.

14. Save the IP address as UNSIGNED INT

Many programmers create a VARCHAR(15) field to hold an IPv4 address as a string rather than as an integer. If you store it as an integer, you only need 4 bytes, and the field is fixed-length. This also makes range queries easier, especially when you need a WHERE condition such as: ip BETWEEN ip1 AND ip2.
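A short sketch of the conversion, done on the application side with Python's standard ipaddress module. The helper names `ip_to_int` and `int_to_ip` are hypothetical; inside MySQL, the built-in functions INET_ATON() and INET_NTOA() perform the same conversion.

```python
import ipaddress


def ip_to_int(ip):
    """Convert a dotted IPv4 string to its unsigned 32-bit integer form."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value):
    """Convert an unsigned 32-bit integer back to a dotted IPv4 string."""
    return str(ipaddress.IPv4Address(value))


low = ip_to_int("192.168.1.0")
high = ip_to_int("192.168.1.255")
candidate = ip_to_int("192.168.1.10")

print(candidate)                 # 3232235786
print(low <= candidate <= high)  # True: the same check as ip BETWEEN ip1 AND ip2
print(int_to_ip(candidate))      # 192.168.1.10
```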

15. Fixed-length tables are faster

If all the fields in a table are fixed-length, the entire table is considered "static" or "fixed-length". Such a table contains no fields of the following types: VARCHAR, TEXT, BLOB. As soon as you include one of these fields, the table is no longer a fixed-length static table, and the MySQL engine handles it in a different way.

16. Vertical partitioning

"Vertical partitioning" splits a table by columns into several tables, which reduces the complexity of the table and the number of fields, for optimization purposes.

17. Splitting a large DELETE or INSERT statement

If you need to run a large DELETE or INSERT query on a live website, be very careful that it does not bring your entire site to a halt. Both operations lock the table, and while the table is locked no other operation can get in.
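A common way to split such a statement is to delete in small batches and commit after each one. The sketch below is an illustration under stated assumptions: it runs against in-memory SQLite through Python's sqlite3 module, the `logs` table and the `delete_in_batches` helper are hypothetical, and the subquery form is the SQLite-friendly spelling. In MySQL the usual form is a plain `DELETE ... WHERE ... LIMIT n` repeated in a loop.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, log_date TEXT)")
conn.executemany(
    "INSERT INTO logs (log_date) VALUES (?)",
    [("2015-01-01",)] * 2500 + [("2016-04-01",)] * 10,
)
conn.commit()


def delete_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff` in small batches; return the batch count."""
    batches = 0
    while True:
        # SQLite spelling; MySQL would use DELETE ... WHERE ... LIMIT n directly.
        cur = conn.execute(
            "DELETE FROM logs WHERE id IN "
            "(SELECT id FROM logs WHERE log_date < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # end the transaction so locks are released between batches
        if cur.rowcount == 0:
            break
        batches += 1
        # A real script might sleep briefly here to let other queries run.
    return batches


batches = delete_in_batches(conn, "2016-01-01")
remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(batches, remaining)
```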

18. The smaller the column, the faster

For most database engines, hard disk operations can be the most significant bottleneck. So it's very helpful to have your data compact, because it reduces access to the hard drive.

19. Choose the right storage engine

MySQL has two main storage engines, MyISAM and InnoDB, each with its own pros and cons.

MyISAM is suitable for applications that are dominated by reads, but it does not cope well with heavy write loads. Even if you only update a single field, the entire table is locked, and other processes cannot even read from it until the update is complete. In addition, MyISAM computes SELECT COUNT(*) extremely fast.

InnoDB is a more complex storage engine, and for some small applications it can be slower than MyISAM. However, it supports row-level locking, so it performs better when there are many write operations. It also supports more advanced features, such as transactions.

20. Use an object-relational mapper (ORM)

With an ORM (object-relational mapper), you can obtain reliable performance gains. Everything an ORM does can also be written by hand, but that requires a senior expert.

21. Be careful with "persistent connections"

The purpose of a persistent connection is to reduce the number of times a MySQL connection has to be re-created. Once a persistent connection is created, it stays open even after the database operation has finished. And since Apache reuses its child processes, the next HTTP request handled by the same child process will reuse the same MySQL connection.

Ref

http://www.cnblogs.com/daxian2012/articles/2767989.html

http://www.jianshu.com/p/5dd73a35d70f

Basic Configuration
You should always check the following 3 settings. If you do not, you are likely to run into problems quickly.

innodb_buffer_pool_size: This is the first option you should set after installation if you use InnoDB. The buffer pool is where data and indexes are cached: the larger it is, the better, because most read operations are then served from memory instead of the hard disk. Typical values are 5-6GB (8GB memory), 20-25GB (32GB memory), 100-120GB (128GB memory).

innodb_log_file_size: This is the size of the redo log. The redo log is used to make sure that writes are fast and durable, and it is also used for recovery after a crash. Up to MySQL 5.1 it was hard to tune, because on the one hand you want it large for good write performance, and on the other hand you want it small so that crash recovery is faster. Fortunately, crash recovery performance has improved greatly since MySQL 5.5, so you can have good write performance and fast crash recovery at the same time. Up to MySQL 5.5, the total size of the redo log is limited to 4GB (by default there are 2 log files). This limit was raised in MySQL 5.6.

Setting innodb_log_file_size to 512M to begin with (which gives 1GB of redo log) will give you ample room for writes. If you know that your application is write-intensive and you are using MySQL 5.6, you can start with 4G.

max_connections: If you often see the 'Too many connections' error, the value of max_connections is too low. This is very common when the application does not close its database connections properly, and you then need a value larger than the default of 151 connections. The main drawback of a high max_connections value (for example, 1000 or more) is that the server can become unresponsive if it has to run 1000 or more active transactions. Using a connection pool in the application or a thread pool in MySQL can help solve this problem.
InnoDB Configuration
Starting with MySQL version 5.5, InnoDB is the default storage engine and is used far more than any other storage engine. That's why it needs to be configured carefully.

innodb_file_per_table: This setting tells InnoDB whether the data and indexes of all tables are stored in the shared tablespace (innodb_file_per_table = OFF) or whether the data for each table is placed in its own .ibd file (innodb_file_per_table = ON). One file per table allows you to reclaim disk space when you drop, truncate, or rebuild tables. It is also required for some advanced features, such as data compression. It does not bring any performance gain, however. The main scenario where you do not want one file per table is when there are very many tables (such as 10k+).

In MySQL 5.6, the default value of this setting is ON, so in most cases you don't need to do anything. For previous versions you must set it to ON before loading the data, because it only affects newly created tables.

innodb_flush_log_at_trx_commit: The default value is 1, which means that InnoDB is fully ACID compliant. This value is the most appropriate when your primary concern is data safety, such as on a primary node. However, on a system with slow disks it can be costly, because an additional fsync is required each time a change is flushed to the redo log. Setting it to 2 is slightly less reliable, because committed transactions are flushed to the redo log only once per second, but this is acceptable in some scenarios, for example on a replica of the primary node. A value of 0 is even faster, but you are more likely to lose some data if the system crashes: it is only suitable for replicas.

innodb_flush_method: This setting determines how data and logs are written to the hard disk. In general, if you have a hardware RAID controller with a battery-backed write-back cache, you should set it to O_DIRECT; otherwise, in most cases you should leave it at fdatasync (the default). sysbench is a great tool to help you decide on this option.

innodb_log_buffer_size: This setting determines the size of the buffer for transactions that have not yet been committed. The default value (1MB) is generally sufficient, but if your transactions contain binary large objects or large text fields, this buffer fills up quickly and triggers additional I/O operations. Look at the Innodb_log_waits status variable: if it is not 0, increase innodb_log_buffer_size.
Other settings
query_cache_size: The query cache is a well-known bottleneck, even when concurrency is moderate. The best option is to disable it from the start by setting query_cache_size = 0 (now the default in MySQL 5.6) and to use other methods to speed up queries: optimize the indexes, add replicas to spread the load, or use an external cache (such as memcached or Redis). If your application already runs with the query cache enabled and you have never noticed a problem, the query cache may be useful to you. In that case, be careful if you decide to stop using it.

log_bin: If you want the server to act as a replication master, the binary log must be enabled. If you do this, don't forget to set server_id to a unique value. Even if you have only one server, the binary log is useful for point-in-time recovery: restore your most recent full backup and then apply the modifications recorded in the binary log (the incremental part). Once created, binary log files are kept forever. So if you don't want to run out of disk space, use PURGE BINARY LOGS to remove old files, or set expire_logs_days to specify after how many days the logs are purged automatically.

Binary logging is not free, however, so if you do not need it, for instance on a replica that is not itself a master, it is recommended to keep it disabled.

skip_name_resolve: When a client connects, the server performs host name resolution, and when DNS is slow, establishing the connection is slow too. Therefore, we recommend starting the server with the skip_name_resolve option enabled, which disables DNS lookups. The only limitation is that GRANT statements can then only use IP addresses, so you must be extra careful when adding this setting to an existing system.

REF:

http://www.jb51.net/article/47419.htm

Basic principles

1. Use as few joins as possible

MySQL's strength is simplicity, but in some respects that is also its weakness. The MySQL optimizer is efficient, but because it has limited statistics to work with, it is more likely to choose a poor execution plan. For complex multi-table joins, both the optimizer's limitations and the limited effort invested in join processing mean that performance still lags somewhat behind Oracle and other established relational databases. For a simple single-table query, however, the gap is very small, and in some scenarios MySQL even outperforms them.

2. Sort as little as possible

Sorting consumes considerable CPU resources, so when the cache hit ratio is high and IO capacity is sufficient, reducing sorting can significantly improve SQL response time.

For MySQL, there are several ways to reduce sorting, such as:

Use an index to produce the sort order

Reduce the number of rows that take part in the sort

Do not sort data unless it is necessary

...

3. Try to avoid SELECT *

Many people find this point hard to accept: is it not commonly said that the number of columns in the SELECT clause does not affect the amount of data read?

Yes, most of the time it does not affect the IO volume. But when the query has an ORDER BY, the number of columns in the SELECT clause strongly affects sorting efficiency; the author's earlier article analyzing the implementation of MySQL ORDER BY covers this in more detail.

In addition, "most of the time" is not "always": when the query can be answered from an index alone, selecting only the columns you need greatly reduces the IO volume.

4. Prefer joins to subqueries

Although join performance is not great, joins still have a significant performance advantage over MySQL subqueries. The execution plans MySQL produces for subqueries have long been a problem: it has existed for many years and is present in every stable release so far, without much improvement. Although the vendor acknowledged the issue early on and promised to resolve it as soon as possible, we have yet to see a release that solves it well.

5. Use OR as little as possible

When multiple conditions in the WHERE clause are combined with OR, the MySQL optimizer does not optimize the execution plan well. Combined with MySQL's layered architecture (SQL layer on top of the storage engine layer), this results in poor performance. Replacing OR with UNION ALL, or with UNION when necessary, often works better.
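The rewrite can be sketched as follows. This is only an illustration: in-memory SQLite (through Python's sqlite3 module) stands in for MySQL, and the `orders` table and its indexes are hypothetical. UNION is used rather than UNION ALL because one row matches both conditions and must not be returned twice.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table and indexes are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)
conn.execute("CREATE INDEX idx_customer ON orders (customer_id)")
conn.execute("CREATE INDEX idx_status ON orders (status)")
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(1, "paid"), (2, "pending"), (1, "pending"), (3, "paid")],
)

with_or = conn.execute(
    "SELECT id FROM orders WHERE customer_id = 1 OR status = 'pending' ORDER BY id"
).fetchall()

# Each branch has a single condition that matches one of the indexes;
# UNION removes the row that satisfies both branches.
with_union = conn.execute(
    "SELECT id FROM orders WHERE customer_id = 1 "
    "UNION "
    "SELECT id FROM orders WHERE status = 'pending' "
    "ORDER BY id"
).fetchall()

print(with_or)
print(with_union)
```

Both queries return the same rows; whether the rewritten form is actually faster depends on the data and the server version, so it is worth checking with EXPLAIN.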

6. Prefer UNION ALL to UNION

The difference between UNION and UNION ALL is that the former must merge the two (or more) result sets and then remove duplicates, which involves sorting, adds a lot of CPU work, and increases resource consumption and latency. So when we know that the result sets cannot contain duplicates, or we do not care about duplicates, use UNION ALL instead of UNION.
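The difference in results is easy to see on a small example. The sketch below uses in-memory SQLite via Python's sqlite3 module as a stand-in for MySQL; the two `customers_*` tables are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers_2015 (email TEXT)")
conn.execute("CREATE TABLE customers_2016 (email TEXT)")
conn.executemany(
    "INSERT INTO customers_2015 VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
conn.executemany(
    "INSERT INTO customers_2016 VALUES (?)",
    [("b@example.com",), ("c@example.com",)],
)

union_rows = conn.execute(
    "SELECT email FROM customers_2015 UNION SELECT email FROM customers_2016"
).fetchall()
union_all_rows = conn.execute(
    "SELECT email FROM customers_2015 UNION ALL SELECT email FROM customers_2016"
).fetchall()

print(len(union_rows))      # 3: the duplicate b@example.com is removed
print(len(union_all_rows))  # 4: the result sets are simply concatenated
```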

7. Filter as early as possible

This optimization strategy is most commonly seen in index design (the more selective columns are placed first).

The principle can also be applied when writing SQL, to optimize some joins. For example, when paging through data from a multi-table query, it is best to filter and page the data on a single table first, and then join the resulting page with the other tables. This avoids as much unnecessary IO as possible and greatly reduces the time spent on IO operations.
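The paging example above can be sketched like this. It is a minimal illustration, assuming in-memory SQLite (Python's sqlite3 module) as a stand-in for MySQL; the `posts` and `authors` tables and the `fetch_page` helper are hypothetical. The derived table cuts the data down to one page before the join happens.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bo")]
)
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(1 + i % 2, "post %d" % i) for i in range(1, 51)],
)


def fetch_page(conn, page, page_size=10):
    """Page on `posts` alone first, then join only that page with `authors`."""
    return conn.execute(
        "SELECT p.id, p.title, a.name "
        "FROM (SELECT id, author_id, title FROM posts "
        "      ORDER BY id LIMIT ? OFFSET ?) AS p "
        "JOIN authors AS a ON a.id = p.author_id "
        "ORDER BY p.id",
        (page_size, page * page_size),
    ).fetchall()


rows = fetch_page(conn, page=2)  # pages are counted from 0
print(len(rows), rows[0])
```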

8. Avoid type conversions

The "type conversion" here refers to the type conversion that occurs when the type of the column field in the WHERE clause is inconsistent with the passed parameter type:

Manual conversion of column_name with a conversion function

This directly prevents MySQL (and in fact other databases have the same problem) from using the index. If a conversion is unavoidable, apply it to the parameter being passed in instead.

Conversion performed by the database itself

If the data type we pass in differs from the column type and we do no conversion ourselves, MySQL may either convert our data itself or leave it to the storage engine. Either way the index may become unusable, which leads to execution plan problems.
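The effect on index use can be observed directly. The sketch below is an illustration under stated assumptions: it runs on in-memory SQLite through Python's sqlite3 module and reads SQLite's `EXPLAIN QUERY PLAN` output, which differs in format from MySQL's EXPLAIN; the `items` table, the `idx_code` index and the `plan` helper are hypothetical. The principle it shows, converting the parameter rather than the column, is the one described above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table and index are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("CREATE INDEX idx_code ON items (code)")
conn.executemany(
    "INSERT INTO items (code) VALUES (?)", [(str(n),) for n in range(1000)]
)


def plan(conn, sql, params):
    """Return the access-path descriptions from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Conversion applied to the column: every row must be visited and converted.
on_column = plan(
    conn, "SELECT id FROM items WHERE CAST(code AS INTEGER) = ?", (42,)
)
# Conversion applied to the parameter instead: the index lookup is possible.
on_param = plan(conn, "SELECT id FROM items WHERE code = ?", (str(42),))

print(on_column)
print(on_param)
```

In SQLite the first plan is reported as a SCAN and the second as a SEARCH that uses `idx_code`.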

9. Prioritize optimizing high-concurrency SQL over infrequently executed "big" SQL

In terms of destructive potential, high-concurrency SQL is always more dangerous than low-frequency SQL: once a high-concurrency statement has a problem, the system can be crushed without giving us any breathing room. Some statements consume a lot of IO and respond slowly, but because they run infrequently, at worst they slow the whole system down for a while, which at least gives us time to react.

10. Optimize from a global perspective rather than making one-sided adjustments

SQL optimization cannot be done for one statement in isolation; it should take all the SQL in the system into account. This is especially true when optimizing execution plans by adjusting indexes: do not be penny wise and pound foolish.

11. Run EXPLAIN on the SQL running in the database whenever possible

To optimize SQL, you need to know its execution plan, so that you can judge whether there is room for optimization and whether the plan has a problem. After the SQL running in a database has been optimized for some time, statements with obvious problems become rare, and most of the remaining ones have to be dug out. At that point a large number of EXPLAIN operations are needed to collect execution plans and decide whether optimization is needed.

REF:

http://www.cnblogs.com/ggjucheng/archive/2012/11/11/2765465.html

http://blog.chinaunix.net/uid-20639775-id-3154234.html

http://blog.chinaunix.net/uid-11640640-id-3426908.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
