20+ best practices for MySQL performance optimization

Source: Internet
Author: User
Tags: compact, documentation, rand, stmt, phpmyadmin

Database operations are increasingly the performance bottleneck of an entire application, and this is especially noticeable in web applications. Database performance is not only something DBAs need to worry about; it is something we programmers need to pay attention to as well. When we design table structures and when we write code that works with the database, especially the SQL statements that query tables, we need to keep performance in mind. This article does not go deep into optimizing SQL statements in general; it focuses on MySQL, the database most widely used by web applications. Hopefully the following tips are useful to you.

Optimize your queries for the query cache

Most MySQL servers have the query cache turned on. It is one of the most effective ways to improve performance, and it is handled by the MySQL database engine itself. When the same query is executed many times, its result is kept in a cache, so later identical queries are answered from the cache without touching the table. (The query cache was deprecated in MySQL 5.7 and removed in MySQL 8.0, so this tip applies to older servers.)

The main problem is that programmers easily overlook this, because some of the queries we write prevent MySQL from using the cache. Take a look at the following example:

// The query cache does NOT work for this query
$r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()");

// The query cache works for this query
$today = date("Y-m-d");
$r = mysql_query("SELECT username FROM user WHERE signup_date >= '$today'");

The difference between the two SQL statements is CURDATE(): MySQL's query cache does not work for queries that use this function. The same holds for NOW(), RAND() and other such SQL functions, because their return values are not deterministic. So all you need to do is replace the MySQL function with a variable, and the query can be cached.
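As a rough illustration of the rule above, the following sketch flags query text that calls a non-deterministic function and shows the date being computed in PHP instead. The helper names usesVolatileFunction() and signupsSinceTodaySql() are hypothetical, the function list is illustrative rather than complete, and the check is a naive text match (it would also flag the same text inside a string literal).

```php
<?php
// Hypothetical helper: returns true if the SQL text calls a function whose
// result changes between executions, which keeps MySQL from caching the query.
// The list is illustrative, not exhaustive.
function usesVolatileFunction(string $sql): bool
{
    $volatile = ['NOW', 'CURDATE', 'CURTIME', 'RAND', 'UUID', 'SYSDATE'];
    $pattern = '/\b(' . implode('|', $volatile) . ')\s*\(/i';
    return preg_match($pattern, $sql) === 1;
}

// Hypothetical helper: compute the date in PHP so the SQL text contains
// only a constant, as in the second query of the example above.
function signupsSinceTodaySql(): string
{
    $today = date('Y-m-d');
    return "SELECT username FROM user WHERE signup_date >= '$today'";
}

var_dump(usesVolatileFunction("SELECT username FROM user WHERE signup_date >= CURDATE()"));
var_dump(usesVolatileFunction(signupsSinceTodaySql()));
```

A check like this could run in a test suite or code review script to catch queries that silently bypass the cache.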

EXPLAIN your SELECT query

The EXPLAIN keyword shows you how MySQL handles your SQL statement. This helps you analyze performance bottlenecks in your query or table structure.

The EXPLAIN output also tells you how your indexes and primary keys are used, how your tables are searched and sorted, and so on.

Pick one of your SELECT statements (preferably one of the most complex, with multi-table joins) and put the keyword EXPLAIN in front of it. You can do this in phpMyAdmin. You will then see a table of results. In the following example, we forgot to add an index on group_id, and the query has a table join:

After we add an index on the group_id field:

As we can see, the first result shows 7883 rows being searched, while the second searches only 9 and 16 rows in the two tables. Looking at the rows column lets us spot potential performance problems.
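To make the idea of reading the rows column concrete, here is a minimal sketch that scans EXPLAIN output (fetched as associative arrays) for signs of a missing index. The function name findSuspectTables(), the threshold of 1000 rows, the table names, and the sample rows are all assumptions made for illustration; only the row counts 7883, 9 and 16 come from the text, and the sample arrays are hand-written, not real server output.

```php
<?php
// Flags EXPLAIN rows that suggest a missing index: a full table scan
// (type "ALL"), no key chosen, or more estimated rows than $maxRows.
// $maxRows is an arbitrary threshold chosen for this illustration.
function findSuspectTables(array $explainRows, int $maxRows = 1000): array
{
    $suspects = [];
    foreach ($explainRows as $row) {
        $fullScan = strtoupper((string) ($row['type'] ?? '')) === 'ALL';
        $noKey    = ($row['key'] ?? null) === null;
        $tooMany  = (int) ($row['rows'] ?? 0) > $maxRows;
        if ($fullScan || $noKey || $tooMany) {
            $suspects[] = $row['table'] ?? '(unknown)';
        }
    }
    return $suspects;
}

// Hand-written sample rows with hypothetical table names.
$before = [
    ['table' => 'users',  'type' => 'ALL', 'key' => null,      'rows' => 7883],
    ['table' => 'groups', 'type' => 'ref', 'key' => 'PRIMARY', 'rows' => 1],
];
$after = [
    ['table' => 'users',  'type' => 'ref', 'key' => 'group_id', 'rows' => 9],
    ['table' => 'groups', 'type' => 'ref', 'key' => 'PRIMARY',  'rows' => 16],
];

print_r(findSuspectTables($before)); // lists "users"
print_r(findSuspectTables($after));  // empty list
```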

Use LIMIT 1 when you only need one row

Sometimes when you query a table you already know that only one row will match, but you still need to fetch a cursor or check the number of rows returned.

In such cases, adding LIMIT 1 can improve performance: the MySQL engine stops searching as soon as it finds one matching row, instead of continuing to look for further matches.

The following example only checks whether there are any users from "China". The second version is clearly more efficient than the first. (Note that the first uses SELECT * and the second uses SELECT 1.)

// Inefficient:
$r = mysql_query("SELECT * FROM user WHERE country = 'China'");
if (mysql_num_rows($r) > 0) {
    // ...
}

// Efficient:
$r = mysql_query("SELECT 1 FROM user WHERE country = 'China' LIMIT 1");
if (mysql_num_rows($r) > 0) {
    // ...
}

Index the fields you search on

Indexes are not just for primary keys or unique fields. If your table has a field that you regularly search on, index it.

For a search condition such as last_name LIKE 'a%', the query on an indexed column is about four times faster than on an unindexed one.

You should also know which kinds of searches cannot use an ordinary index. For example, when you search for a word inside a large article body, as in WHERE post_content LIKE '%apple%', an index will not help. You may need MySQL full-text indexing or an index you build yourself (for example, on keywords or tags).
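The difference between the two LIKE patterns above can be sketched as a simple check on the first character of the pattern. The helper name likeCanUseIndex() and the table and index names in the SQL strings (users, posts, idx_last_name, idx_post_content) are assumptions for illustration; the statements are only held in variables here, not executed.

```php
<?php
// Returns true when a LIKE pattern starts with a literal prefix, so an
// ordinary index on the column can narrow the search. A pattern that
// starts with the wildcard % or _ forces every row to be examined.
function likeCanUseIndex(string $pattern): bool
{
    $first = substr($pattern, 0, 1);
    return $first !== '%' && $first !== '_';
}

// Illustrative statements; table and index names are hypothetical.
$createPrefixIndex = 'CREATE INDEX idx_last_name ON users (last_name)';
$createFulltext    = 'CREATE FULLTEXT INDEX idx_post_content ON posts (post_content)';
$fulltextSearch    = "SELECT id FROM posts WHERE MATCH (post_content) AGAINST ('apple')";

var_dump(likeCanUseIndex('a%'));      // prefix search, index is usable
var_dump(likeCanUseIndex('%apple%')); // leading wildcard, index is not usable
```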

When joining tables, use columns of the same type and index them

If your application has many JOIN queries, make sure the columns used in the join are indexed in both tables. MySQL can then use its internal mechanisms to optimize the join.

The join columns should also be of the same type. For example, if you join a DECIMAL column to an INT column, MySQL may be unable to use the index. For string types, the character sets must match as well (two tables may use different character sets).

// Look up the company in the user's state
$r = mysql_query("SELECT company_name FROM users
    LEFT JOIN companies ON (users.state = companies.state)
    WHERE users.id = $user_id");
// Both state columns should be indexed, and both should have the same type and character set.

Never ORDER BY RAND()

Want to shuffle the rows you get back, or pick a random row? I don't know who invented this idiom, but many beginners like it without realizing how serious the performance problem is.

If you really need to shuffle the returned rows, there are many other ways to do it. ORDER BY RAND() only makes database performance degrade sharply as the table grows. The problem is that MySQL has to execute RAND() (which costs CPU time) for every row in the table, just to have a value to sort by, and then sort all the rows. Adding LIMIT 1 does not help, because the sort still has to happen first.

The following example picks one random record:

// Never do this:
$r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1");

// This is better:
$r = mysql_query("SELECT COUNT(*) FROM user");
$d = mysql_fetch_row($r);
$rand = mt_rand(0, $d[0] - 1);
$r = mysql_query("SELECT username FROM user LIMIT $rand, 1");

Avoid SELECT *

The more data you read from the database, the slower the query becomes. And if your database server and web server are separate machines, it also increases the network transfer load.

So get into the habit of selecting only the columns you need.

// Not recommended
$r = mysql_query("SELECT * FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";

// Recommended
$r = mysql_query("SELECT username FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";

Always set an ID for each table

Every table in the database should have an id column as its primary key, preferably of an INT type (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.

Even if your users table has a unique field such as "email", do not make it the primary key. Using a VARCHAR column as the primary key degrades performance. In your program, too, you should use the table's ID to build your data structures.

In addition, some operations in the MySQL engine rely on the primary key, such as clustering and partitioning, and in those cases the performance and configuration of the primary key become very important.

There is only one exception: the "foreign keys" of an "association table", whose primary key is composed of the primary keys of several other tables. For example, a "student" table has a student ID and a "course" table has a course ID; the "score" table is then an association table linking students and courses. In the score table, the student ID and the course ID are foreign keys, and together they form the primary key.
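The association table described above might be declared roughly as follows. This is only a sketch: the table names (student, course, score), the column names and the column types are all assumptions, and the statement is simply stored in a PHP string and printed rather than sent to a server.

```php
<?php
// Illustrative DDL for an association table. The point is the composite
// primary key built from the two foreign keys; all names are hypothetical.
$createScoreTable = '
CREATE TABLE score (
    student_id INT UNSIGNED NOT NULL,
    course_id  INT UNSIGNED NOT NULL,
    score      TINYINT UNSIGNED NOT NULL,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES student (id),
    FOREIGN KEY (course_id)  REFERENCES course (id)
) ENGINE=InnoDB';

echo $createScoreTable, PHP_EOL;
```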

Use ENUM instead of VARCHAR

The ENUM type is very fast and compact. Internally it is stored like a TINYINT, but it is displayed as a string. This makes it an excellent fit for fields that hold one of a list of options.

If you have a field such as "gender", "country", "ethnicity", "status" or "department", and you know its set of values is limited and fixed, use ENUM instead of VARCHAR.

MySQL can also give you a "suggestion" (see the next section) on how to restructure your table. When you have a VARCHAR field, it may advise you to change it to an ENUM type. Use PROCEDURE ANALYSE() to get this advice.

Get advice from PROCEDURE ANALYSE()

PROCEDURE ANALYSE() has MySQL analyze your columns and the data actually stored in them, and gives you some useful suggestions. These suggestions only become useful once the table holds real data, because decisions like these need data to be based on. (PROCEDURE ANALYSE() was deprecated in MySQL 5.7 and removed in MySQL 8.0, so it is only available on older servers.)

For example, if you created an INT column as your primary key but the table holds little data, PROCEDURE ANALYSE() will suggest changing the column type to MEDIUMINT. Or if you use a VARCHAR column that holds few distinct values, you may be advised to change it to ENUM. Such suggestions may simply reflect that there is not yet enough data, so they are not necessarily accurate.

In phpMyAdmin, you can see these suggestions by clicking "Propose table structure" while viewing a table.

Keep in mind that these suggestions only become accurate as the amount of data in your table grows. And remember that you are the one who makes the final decision.

Use NOT NULL where possible

Unless you have a very specific reason to use NULL values, you should always define your columns as NOT NULL. This may seem a bit controversial, so read on.

First, ask yourself how much difference there is between "empty" and NULL (for an INT column, between 0 and NULL). If you think there is no difference, do not use NULL. (Did you know that in Oracle, NULL and the empty string are the same?)

Do not assume that NULL takes no space: it requires extra space, and your program becomes more complex when it has to compare against it. This is not to say you can never use NULL. Reality is complicated, and there will still be cases where you need NULL values.

Here is an excerpt from MySQL's own documentation:

"NULL columns require additional space in the row to record whether their values are NULL. For MyISAM tables, each NULL column takes one bit extra, rounded up to the nearest byte."

Prepared statements

A prepared statement is much like a stored procedure: a set of SQL statements that runs on the server side. Using prepared statements brings many benefits, both for performance and for security.

Because bound variables are passed separately from the SQL text, prepared statements protect your program from "SQL injection" attacks. You can of course check these variables by hand, but manual checks are error-prone and easily forgotten. The problem is less severe when you use a framework or an ORM.

In terms of performance, prepared statements give you a considerable advantage when the same query is run many times. You define parameters for the statement, and MySQL parses it only once.

Recent versions of MySQL also transmit prepared statements in a binary protocol, which makes network transfer very efficient.

There are cases where we need to avoid prepared statements, because they do not support the query cache. Reportedly, however, this has been supported since version 5.1.

To use prepared statements in PHP, see the manual for the mysqli extension, or use a database abstraction layer such as PDO.

// Create the prepared statement
if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state=?")) {
    // Bind parameters
    $stmt->bind_param("s", $state);
    // Execute
    $stmt->execute();
    // Bind result variables
    $stmt->bind_result($username);
    // Move the cursor to fetch a row
    $stmt->fetch();
    printf("%s is from %s\n", $username, $state);
    $stmt->close();
}

Unbuffered queries

Normally, when your script executes an SQL statement, the program waits there until the statement has returned its result and only then continues. You can change this behavior with unbuffered queries.

The PHP documentation for the mysql_unbuffered_query() function describes this very well:

"mysql_unbuffered_query() sends the SQL query query to MySQL without automatically fetching and buffering the result rows as mysql_query() does. This saves a considerable amount of memory with SQL queries that produce large result sets, and you can start working on the result set immediately after the first row has been retrieved as you don't have to wait until the complete SQL query has been performed."

In other words, mysql_unbuffered_query() sends the SQL statement to MySQL without automatically fetching and buffering the results the way mysql_query() does. This saves a considerable amount of memory, especially for queries that produce large result sets, and you can start working on the results as soon as the first row arrives instead of waiting for all of them.

There are some limitations, however. You must either read all the rows or call mysql_free_result() to discard the results before you run the next query. Also, mysql_num_rows() and mysql_data_seek() cannot be used. So think carefully about whether unbuffered queries are right for you.
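For readers on the mysqli extension rather than the older mysql_* functions, the same idea can be sketched with the MYSQLI_USE_RESULT mode of mysqli::query(). This swaps in mysqli for mysql_unbuffered_query(); the function name streamRows() is hypothetical, and the sketch assumes an already connected mysqli handle.

```php
<?php
// Streams rows one at a time instead of buffering the whole result set.
// Assumes the mysqli extension and an already connected $db handle.
function streamRows(mysqli $db, string $sql): Generator
{
    // MYSQLI_USE_RESULT asks mysqli not to buffer the result set on the client.
    $result = $db->query($sql, MYSQLI_USE_RESULT);
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $db->error);
    }
    try {
        while ($row = $result->fetch_assoc()) {
            yield $row;
        }
    } finally {
        // The result must be freed before the connection runs another query.
        $result->free();
    }
}

// Usage (table and column names are hypothetical):
// foreach (streamRows($db, 'SELECT id, message FROM logs') as $row) {
//     echo $row['message'], PHP_EOL;
// }
```

Wrapping the loop in a generator keeps the "read everything or free the result" rule in one place, since the finally block runs whether the caller consumes all rows or stops early.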

Store IP addresses as UNSIGNED INT

Many programmers create a VARCHAR(15) field to store an IPv4 address as a string rather than as an integer. Stored as an integer, it needs only 4 bytes and the field is fixed-length. It also gives you an advantage in queries, especially with WHERE conditions such as: ip BETWEEN ip1 AND ip2.

We must use UNSIGNED INT because an IP address uses the full range of a 32-bit unsigned integer.

In your queries, you can use INET_ATON() to convert a string IP into an integer and INET_NTOA() to convert an integer back into a string IP. PHP has the equivalent functions ip2long() and long2ip().
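The PHP-side conversion mentioned above can be sketched as follows. The helper names ipToUnsigned() and ipRangeCondition() and the column name ip are assumptions for illustration; ip2long(), long2ip() and sprintf() are standard PHP functions.

```php
<?php
// Converts a dotted IPv4 string to its unsigned 32-bit value (as a string).
// ip2long() returns false for invalid input; on 32-bit builds it may return
// a negative number, so sprintf('%u') is used to get the unsigned form.
function ipToUnsigned(string $ip): string
{
    $long = ip2long($ip);
    if ($long === false) {
        throw new InvalidArgumentException("Not a valid IPv4 address: $ip");
    }
    return sprintf('%u', $long);
}

// Builds a range condition for a WHERE clause; the column name is hypothetical.
function ipRangeCondition(string $from, string $to): string
{
    return sprintf('ip BETWEEN %s AND %s', ipToUnsigned($from), ipToUnsigned($to));
}

echo ipToUnsigned('192.168.1.1'), PHP_EOL;           // 3232235777
echo long2ip(ip2long('192.168.1.1')), PHP_EOL;       // 192.168.1.1
echo ipRangeCondition('10.0.0.0', '10.0.0.255'), PHP_EOL;
```

Converting in PHP means the query text contains only integers, which also avoids interpolating a raw request value into the SQL string.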

$r = "UPDATE users SET ip = INET_ATON('{$_SERVER['REMOTE_ADDR']}') WHERE user_id = $user_id";

Fixed-length tables are faster

If all the columns in a table are fixed-length, the whole table is considered "static" or "fixed-length". This means the table has no columns of types such as VARCHAR, TEXT or BLOB. As soon as it includes even one such column, the table is no longer fixed-length, and the MySQL engine handles it in a different way.

Fixed-length tables improve performance because MySQL can search them faster: with fixed row lengths, the offset of the next row is easy to calculate, so reads are naturally fast. If rows are not fixed-length, finding the next row requires looking it up through the primary key.

Fixed-length tables are also easier to cache and to rebuild. The only drawback is that fixed-length columns waste some space, because the full column width is allocated whether you use it or not.

Using the "vertical split" technique (see the next section), you can split your table into two: one fixed-length and one variable-length.

Vertical split

"Vertical splitting" means splitting one table into several tables by column, which reduces the complexity of the table and the number of columns in it in order to improve performance. (I once saw a table with more than 100 columns in a banking project. It was frightening.)

Example one: the users table has a home address field. It is optional, and apart from the user's profile page you rarely need to read or update it. So why not move it to another table? Your table will perform better. Think about it: most of the time only the user ID, user name, password, user role and the like are used frequently. A smaller table always performs better.

Example two: you have a field called "last_login" that is updated every time the user logs in. But every update clears the query cache for that table. So you can move this field to another table; the frequent reads of user ID, user name and user role are then unaffected, and the query cache can give you a large performance gain.

Also note that you should not have to join the tables formed from the split-off fields on a regular basis. Otherwise performance will be worse than without the split, and the drop can be by orders of magnitude.

Split large DELETE or INSERT statements

If you need to run a large DELETE or INSERT query on a live website, be very careful that it does not bring your whole site to a halt. Both operations lock the table, and while the table is locked no other operation can get in.

Apache runs many child processes or threads. That is why it works efficiently, but our server does not want too many child processes, threads and database connections, because they consume a great deal of server resources, especially memory.

If you lock a table for a while, say 30 seconds, then on a high-traffic site the processes, threads, database connections and open files that pile up during those 30 seconds may not only crash your web service but may also hang the entire server.

So if you have a large operation to run, be sure to split it up. Using a LIMIT clause is a good way to do that. Here is an example:

while (1) {
    // Only 1000 rows at a time
    mysql_query("DELETE FROM logs WHERE log_date <= '2009-11-01' LIMIT 1000");
    if (mysql_affected_rows() == 0) {
        // Nothing left to delete, exit
        break;
    }
    // Pause briefly each time
    usleep(50000);
}

The smaller the column, the faster

For most database engines, disk operations are probably the most significant bottleneck. Keeping your data compact helps a great deal here, because it reduces disk access.

See the Storage Requirements section of the MySQL documentation for the size of every data type.

If a table will only ever hold a few rows (for example, a dictionary table or a configuration table), there is no reason to use INT for the primary key; MEDIUMINT, SMALLINT or even TINYINT is more economical. If you do not need the time of day, DATE is much better than DATETIME.

Of course, you also need to leave enough room to grow; otherwise, changing the column later can be very painful. See the Slashdot example (November 6, 2009): a simple ALTER TABLE statement took 3 hours because the table held 16 million rows.
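As a small illustration of choosing a key type by expected size, the sketch below returns the smallest unsigned MySQL integer type that can hold a given maximum value. The function name smallestUnsignedIntType() is hypothetical; the limits are the unsigned maxima of each type. In practice you would pass in a value with headroom for growth, not the current maximum.

```php
<?php
// Picks the smallest unsigned MySQL integer type that can hold $maxValue.
function smallestUnsignedIntType(int $maxValue): string
{
    if ($maxValue < 0) {
        throw new InvalidArgumentException('Unsigned types cannot hold negative values');
    }
    $limits = [
        'TINYINT UNSIGNED'   => 255,        // 1 byte
        'SMALLINT UNSIGNED'  => 65535,      // 2 bytes
        'MEDIUMINT UNSIGNED' => 16777215,   // 3 bytes
        'INT UNSIGNED'       => 4294967295, // 4 bytes
    ];
    foreach ($limits as $type => $limit) {
        if ($maxValue <= $limit) {
            return $type;
        }
    }
    return 'BIGINT UNSIGNED'; // 8 bytes
}

echo smallestUnsignedIntType(200), PHP_EOL;      // TINYINT UNSIGNED
echo smallestUnsignedIntType(50000), PHP_EOL;    // SMALLINT UNSIGNED
echo smallestUnsignedIntType(16000000), PHP_EOL; // MEDIUMINT UNSIGNED
```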

Choose the right storage engine

MySQL has two main storage engines, MyISAM and InnoDB, each with its own pros and cons.

MyISAM suits applications that are read-heavy, but it does not handle large numbers of writes well. Even if you only update a single field, the entire table is locked, and no other process can even read from it until the update completes. On the other hand, MyISAM is extremely fast at calculations such as SELECT COUNT(*).

InnoDB tends to be a more complex storage engine, and for some small applications it can be slower than MyISAM. But it supports row-level locking, so it performs better when there are many writes. It also supports more advanced features such as transactions.
