20+ MySQL Performance Optimization Best Practices
Today, database operations are increasingly the performance bottleneck of an entire application, especially for Web applications. Database performance is not just something DBAs need to worry about; programmers need to pay attention to it too. When designing table structures and writing the SQL that operates on them (especially queries), we need to keep performance in mind. This article does not cover SQL statement optimization in general; it focuses on MySQL, the database most widely used by Web applications. I hope the following optimization techniques will be useful to you.
1. Optimize your queries for the query cache
The query cache is enabled on most MySQL servers (it was removed in MySQL 8.0, so this tip applies to earlier versions). It is one of the most effective ways to improve performance, and it is handled entirely by the MySQL database engine. When the same query is executed many times, its result is stored in a cache, and later identical queries are served from the cache without touching the table.
The main problem is that programmers easily overlook it, because certain ways of writing a query prevent MySQL from using the cache. See the following example:
// The query cache does NOT work for this query
$r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()");
// The query cache works for this query
$today = date("Y-m-d");
$r = mysql_query("SELECT username FROM user WHERE signup_date >= '$today'");
The difference between the two SQL statements above is CURDATE(). MySQL's query cache does not work for queries that use this function. The same is true of NOW(), RAND(), and similar functions: because their return values change from call to call, queries containing them are not cached. All you need to do is replace the MySQL function with a value computed in your program, and the cache can be used.
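As a rough sketch of the idea above, the date can be computed in PHP and embedded as a literal, so that the SQL text is identical for every request made on the same day. The helper name buildSignupQuery is hypothetical; the table and column names are taken from the example above.

```php
<?php
// Hypothetical helper: builds a query whose text contains a date literal
// instead of a call to CURDATE().
function buildSignupQuery(DateTimeInterface $now): string
{
    // format('Y-m-d') yields only digits and dashes, so it is safe to embed.
    $today = $now->format('Y-m-d');
    return "SELECT username FROM user WHERE signup_date >= '$today'";
}

$sql = buildSignupQuery(new DateTimeImmutable('2024-05-17 08:30:00'));
echo $sql, "\n";
// SELECT username FROM user WHERE signup_date >= '2024-05-17'
```

Because the literal changes only once per day, every request on a given day sends the same query text, which is what a text-keyed cache needs.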
2. EXPLAIN your SELECT queries
The EXPLAIN keyword shows you how MySQL processes your SQL statement. This helps you find performance bottlenecks in your query or table structure.
The EXPLAIN output also shows how your indexes and primary keys are used, how your tables are scanned and sorted, and so on.
Pick one of your SELECT statements (preferably the most complex one, with multi-table joins) and put the keyword EXPLAIN in front of it. You can use phpMyAdmin to do this. You will then see the result as a table. In the following example, we forgot to add an index on group_id, and the query joins two tables:
[Figure: EXPLAIN output without the group_id index]
After adding an index to the group_id field:
[Figure: EXPLAIN output with the group_id index]
The first result shows that 7883 rows are scanned, while the second scans only 9 and 16 rows in the two tables. Looking at the rows column lets us spot potential performance problems.
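The check on the rows column described above can be sketched in PHP. This is a minimal illustration, not a definitive tool: the function name findSuspectSteps, the threshold, and the sample rows are all made up, while table, type, key, and rows are columns that EXPLAIN really returns.

```php
<?php
// Returns the tables whose EXPLAIN step uses no index (key is NULL)
// or whose estimated row count exceeds $maxRows.
function findSuspectSteps(array $explainRows, int $maxRows = 1000): array
{
    $suspects = [];
    foreach ($explainRows as $row) {
        $noIndex = !isset($row['key']);            // NULL key: no index used
        $tooMany = (int) $row['rows'] > $maxRows;  // large estimated scan
        if ($noIndex || $tooMany) {
            $suspects[] = $row['table'];
        }
    }
    return $suspects;
}

// With a live connection, the rows could come from something like:
//   $explainRows = $pdo->query('EXPLAIN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
// The sample below is invented for illustration.
$explainRows = [
    ['table' => 'users',       'type' => 'ALL',    'key' => null,      'rows' => 7883],
    ['table' => 'user_groups', 'type' => 'eq_ref', 'key' => 'PRIMARY', 'rows' => 1],
];
print_r(findSuspectSteps($explainRows));
```

A flagged table is only a hint; whether an index actually helps still has to be confirmed by running EXPLAIN again after adding it.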
3. Use LIMIT 1 when only one row is required
Sometimes you already know that a query needs only one row: you may be fetching a single record, or just checking whether any matching record exists.
In such cases, adding LIMIT 1 can improve performance. The MySQL engine stops searching as soon as it finds one matching row, instead of going on to look for further matches.
The example below only checks whether there are any users from "China". Obviously, the second version is more efficient than the first. (Note that the first uses SELECT * and the second uses SELECT 1.)
// Inefficient:
$r = mysql_query("SELECT * FROM user WHERE country = 'China'");
if (mysql_num_rows($r) > 0) {
    // ...
}
// Efficient:
$r = mysql_query("SELECT 1 FROM user WHERE country = 'China' LIMIT 1");
if (mysql_num_rows($r) > 0) {
    // ...
}
4. Create an index for search fields
Indexes are not only for primary keys or unique fields. If a field in your table is regularly used in searches, create an index for it.
In a comparison using the search condition last_name LIKE 'a%', one run used an index and the other did not; the run without the index was about 4 times slower.
You should also know which kinds of searches cannot use a normal index. For example, if you need to search for a word inside a large article, as in WHERE post_content LIKE '%apple%', the index is likely to be useless. You may need to use a MySQL full-text index or build your own index (for example, of keywords or tags).
5. Index the join columns and give them the same type
If your application has many JOIN queries, make sure the fields used for the join are indexed in both tables. This lets MySQL optimize the join internally.
The join fields should also be of the same type. For example, if you join a DECIMAL field with an INT field, MySQL cannot use their indexes. For string types, the character set must also be the same. (The two tables may use different character sets.)
// Find companies in the user's state
$r = mysql_query("SELECT company_name FROM users
    LEFT JOIN companies ON (users.state = companies.state)
    WHERE users.id = $user_id");
// Both state fields should be indexed and have the same type and character set.
6. Never use ORDER BY RAND()
Want to shuffle the returned rows, or pick a random record? I really don't know who invented this method, but many beginners like it without realizing how terrible its performance is.
If you really need to shuffle the returned rows, there are many better ways to do it. ORDER BY RAND() creates a bottleneck that gets rapidly worse as your data grows. The problem is that MySQL has to execute the RAND() function (which consumes CPU time) for every row in the table, record the value for each row, and then sort the rows. Even LIMIT 1 does not help, because the sort still has to happen first.
The following example selects one random record.
// Do NOT do this:
$r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1");
// This is better:
$r = mysql_query("SELECT count(*) FROM user");
$d = mysql_fetch_row($r);
$rand = mt_rand(0, $d[0] - 1);
$r = mysql_query("SELECT username FROM user LIMIT $rand, 1");
7. Avoid SELECT *
The more data you read from the database, the slower the query. In addition, if your database server and Web server are separate machines, it also increases the network load.
So make it a habit to select only the columns you need.
// Not recommended
$r = mysql_query("SELECT * FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";
// Recommended
$r = mysql_query("SELECT username FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";
8. Always give each table an ID
Every table in the database should have an ID column as its primary key, preferably an INT (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.
Even if your users table has a unique field such as "email", do not make it the primary key. Using a VARCHAR column as a primary key degrades performance. In addition, your program should use the table's ID to build its data structures.
Some operations in the MySQL engine also rely on the primary key. In those cases, such as clustering and partitioning, the performance and configuration of the primary key become very important.
There is only one exception: the "association table", whose primary key is composed of the primary keys of several other tables, which we call "foreign keys". For example, given a "student table" with a student ID and a "course table" with a course ID, a third table that links students to courses is an association table. In that table, the student ID and the course ID are foreign keys, and together they form the primary key.
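The association-table exception above might look like the following sketch. The table and column names (student_course, student, course, id) are hypothetical, and the referenced tables are assumed to already exist with INT UNSIGNED primary keys.

```php
<?php
// Hypothetical schema: the primary key of the association table is the
// combination of two foreign keys, so no separate AUTO_INCREMENT ID is used.
const ENROLLMENT_DDL = <<<SQL
CREATE TABLE student_course (
    student_id INT UNSIGNED NOT NULL,
    course_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES student (id),
    FOREIGN KEY (course_id)  REFERENCES course (id)
) ENGINE=InnoDB
SQL;

// Runs the DDL on an open connection (a PDO connection is assumed here).
function createEnrollmentTable(PDO $pdo): void
{
    $pdo->exec(ENROLLMENT_DDL);
}
```

The composite key also prevents the same student from being linked to the same course twice.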
9. Use ENUM instead of VARCHAR
The ENUM type is very fast and compact. Internally it is stored like a TINYINT, but it is displayed as a string. This makes it a very good fit for fields that hold a list of options.
If you have a field such as "gender", "country", "nationality", "status", or "department", and you know its values are limited and fixed, you should use ENUM instead of VARCHAR.
MySQL can also give you a "suggestion" (see item 10) on how to restructure your table. When you have a VARCHAR field, it may suggest changing it to ENUM. You can get these suggestions with PROCEDURE ANALYSE().
10. Get advice from PROCEDURE ANALYSE()
PROCEDURE ANALYSE() lets MySQL analyze your fields and their actual data and give you some useful suggestions. The suggestions are useful only when the table contains real data, because big decisions need data to be based on.
For example, if you created an INT field as your primary key but the table holds little data, PROCEDURE ANALYSE() may suggest changing the field type to MEDIUMINT. Or, if you use a VARCHAR field that holds little data, it may suggest changing it to ENUM. Such suggestions may simply result from there being too little data, so they are not always accurate.
In phpMyAdmin, you can click "Propose table structure" while viewing a table to see these suggestions.
Keep in mind that these are only suggestions, and they become accurate only as the table accumulates more data. Remember that you are the one who makes the final decision.
11. Use NOT NULL wherever possible
Unless you have a specific reason to use NULL values, you should always declare your fields NOT NULL. This may seem a bit controversial, so read on.
First, ask yourself what the difference is between "empty" and NULL (for an INT, between 0 and NULL). If you think there is no difference, you should not use NULL. (Did you know that in Oracle, NULL and the empty string are the same?)
Do not assume that NULL takes no space; it requires additional space. It also makes your program more complex when you do comparisons. Of course, this does not mean you can never use NULL. Reality is complicated, and in some cases you do need NULL values.
The following is an excerpt from MySQL's own documentation:
"NULL columns require additional space in the row to record whether their values are NULL. For MyISAM tables, each NULL column takes one bit extra, rounded up to the nearest byte."
12. Prepared statements
A prepared statement is somewhat like a stored procedure: it is an SQL statement prepared on the server ahead of execution. Using prepared statements brings many benefits, in both performance and security.
Prepared statements check the variables you bind, which protects your program against "SQL injection" attacks. Of course, you can also check your variables by hand, but manual checks are error-prone and programmers often forget them. Frameworks and ORMs make this problem less severe.
In terms of performance, prepared statements give you a considerable advantage when the same query is run many times. You define parameters for the statement, and MySQL parses it only once.
In addition, recent versions of MySQL transmit prepared statements in a binary format, which makes network transfer very efficient.
Of course, there are cases where we need to avoid prepared statements, because they do not support the query cache. However, this is reportedly supported as of version 5.1.
To use prepared statements in PHP, see the manual for the mysqli extension, or use a database abstraction layer such as PDO.
// Create a prepared statement
if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state = ?")) {
    // Bind parameters
    $stmt->bind_param("s", $state);
    // Execute
    $stmt->execute();
    // Bind the result variable
    $stmt->bind_result($username);
    // Fetch the row
    $stmt->fetch();
    printf("%s is from %s\n", $username, $state);
    $stmt->close();
}
13. Unbuffered queries
Normally, when you execute an SQL statement in your script, your program waits there until the statement returns, and only then continues. Unbuffered queries let you change this behavior.
The PHP documentation describes the mysql_unbuffered_query() function very well:
"mysql_unbuffered_query() sends the SQL query to MySQL without automatically fetching and buffering the result rows as mysql_query() does. This saves a considerable amount of memory with SQL queries that produce large result sets, and you can start working on the result set immediately after the first row has been retrieved as you don't have to wait until the complete SQL query has been performed."
In other words, mysql_unbuffered_query() sends an SQL statement to MySQL without automatically fetching and caching the results the way mysql_query() does. This saves a lot of memory, especially for queries that produce large result sets, and you can start working on the results as soon as the first row arrives instead of waiting for all of them.
However, there are some restrictions: you must either read all rows or call mysql_free_result() to discard the result before running the next query. In addition, mysql_num_rows() and mysql_data_seek() cannot be used. So whether to use an unbuffered query requires careful consideration.
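The mysql_* functions were removed in PHP 7, so the following sketch shows the same idea with mysqli, whose query() method accepts MYSQLI_USE_RESULT to request an unbuffered result. The function name streamRows and the connection details in the usage comment are hypothetical.

```php
<?php
// Yields rows one at a time from an unbuffered mysqli result.
function streamRows(mysqli $db, string $sql): Generator
{
    // MYSQLI_USE_RESULT asks mysqli not to buffer the whole result set.
    $result = $db->query($sql, MYSQLI_USE_RESULT);
    if ($result === false) {
        throw new RuntimeException($db->error);
    }
    try {
        while ($row = $result->fetch_assoc()) {
            yield $row;
        }
    } finally {
        // The result must be freed before the next query on this connection.
        $result->free();
    }
}

// Usage (connection details are placeholders):
// $db = new mysqli('localhost', 'app_user', 'secret', 'app_db');
// foreach (streamRows($db, 'SELECT username FROM user') as $row) {
//     echo $row['username'], "\n";
// }
```

Wrapping the loop in a generator keeps the "free the result before the next query" restriction in one place, since the finally block runs when iteration ends or the generator is destroyed.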
14. Store IP addresses as UNSIGNED INT
Many programmers create a VARCHAR(15) field to store an IP address as a string instead of storing it as an integer. Stored as an integer, the address needs only 4 bytes and the field has a fixed length. It also gives you an advantage in queries, especially with range conditions such as WHERE ip BETWEEN ip1 AND ip2.
The column must be an UNSIGNED INT, because an IP address uses the full range of a 32-bit unsigned integer.
In your queries, you can use INET_ATON() to convert a string IP address into an integer, and INET_NTOA() to convert an integer back into a string IP address. PHP offers the similar functions ip2long() and long2ip().
$r = "UPDATE users SET ip = INET_ATON('{$_SERVER['REMOTE_ADDR']}') WHERE user_id = $user_id";
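The PHP-side conversion mentioned above can be sketched as follows. The helper names ipToUnsigned and unsignedToIp are hypothetical, and the reverse conversion assumes a 64-bit PHP build.

```php
<?php
// Converts a dotted IPv4 string to an unsigned integer string,
// or returns null if the input is not a valid IPv4 address.
function ipToUnsigned(string $ip): ?string
{
    $long = ip2long($ip);
    if ($long === false) {
        return null;
    }
    // On 32-bit builds ip2long() can return a negative number;
    // %u formats it as an unsigned value suitable for an UNSIGNED INT column.
    return sprintf('%u', $long);
}

// Converts the stored integer back to dotted notation.
function unsignedToIp(string $value): string
{
    return long2ip((int) $value);
}

echo ipToUnsigned('192.168.1.1'), "\n";   // 3232235777
echo unsignedToIp('3232235777'), "\n";    // 192.168.1.1
```

Converting in PHP rather than with INET_ATON() lets the value be validated before it reaches the query.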
15. Fixed-length tables are faster
If every field in a table is "fixed length", the whole table is considered "static" or "fixed-length". That means the table contains no fields of the following types: VARCHAR, TEXT, BLOB. As soon as you include one of these fields, the table is no longer a fixed-length static table, and the MySQL engine handles it in a different way.
Fixed-length tables improve performance because MySQL can search them faster: the offset of the next row is easy to calculate, so reads are naturally quick. If the rows are not fixed length, finding the next row requires consulting the primary key.
In addition, fixed-length tables are easier to cache and rebuild. The only side effect is that fixed-length fields waste some space, because the full space is allocated whether you use it or not.
Using vertical partitioning (splitting a table by column; see the next item), you can split your table into two: one with fixed-length rows and one with variable-length rows.
16. Vertical partitioning
Vertical partitioning means splitting a table into several tables by column, which reduces the table's complexity and number of fields and so improves performance. (I once worked on a project at a bank where I saw a table with more than 100 fields. It was terrible.)
Example 1: The Users table has a home address field. It is optional, and apart from the personal information page you rarely need to read or update it. So why not move it to another table? Your main table will perform better. Think about it: most of the time, only the user ID, user name, password, and user role are used from the users table. Smaller tables always perform better.
Example 2: You have a field named "last_login" that is updated every time a user logs in. But every update clears the query cache for that table. So you can move this field to another table; your frequent reads of the user ID, user name, and user role are then unaffected, and the query cache can give you a large performance gain.
Note, however, that you should not need to join the split tables frequently. Otherwise performance will be worse than before the split, and the drop can be severe.
17. Split large DELETE or INSERT statements
If you need to run a large DELETE or INSERT query on a live website, be very careful that the operation does not bring the whole site to a halt. These two operations lock the table, and once the table is locked, no other operation can access it.
Apache runs many child processes or threads, which is why it works so efficiently. However, the server cannot afford too many child processes, threads, and database connections, because they consume a great deal of server resources, especially memory.
If you lock a table for a period of time, say 30 seconds, then on a high-traffic site the processes/threads, database connections, and open files that pile up during those 30 seconds may not only crash your Web service, but also make your entire server immediately