Java Interview Preparation 16: Database--MySQL performance optimization

Source: Internet
Author: User

These are just my own notes. My level is not very high, so mistakes are inevitable; if you spot any, please point them out. Thank you.

20+ best practices for MySQL performance optimization

    1. Optimize your queries for the query cache
    2. EXPLAIN your SELECT queries
    3. Use LIMIT 1 when you need only one row
    4. Index the search fields
    5. Use columns of the same type for joins, and index them
    6. Never ORDER BY RAND()
    7. Avoid SELECT *
    8. Always set an ID for each table
    9. Use ENUM instead of VARCHAR
    10. Get advice from PROCEDURE ANALYSE()
    11. Use NOT NULL where possible
    12. PreparedStatement
    13. Unbuffered queries
    14. Store IP addresses as UNSIGNED INT
    15. Fixed-length tables are faster
    16. Vertical split
    17. Split large DELETE or INSERT statements
    18. The smaller the column, the faster
    19. Choose the right storage engine
    20. Use an object-relational mapper (ORM)
    21. Be careful with persistent connections

1. Optimize your query for query caching
Most MySQL servers have the query cache turned on. It is one of the most effective ways to improve performance, and it is handled by the MySQL database engine itself. When the same query is executed many times, its result is stored in a cache, so that subsequent identical queries are answered from the cache without touching the table. (Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this tip applies to older servers.)

The main problem is that programmers easily overlook it, because some query statements prevent MySQL from using the cache. Take a look at the following example:

    // Query cache is NOT used
    $r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()");

    // Query cache is used
    $today = date("Y-m-d");
    $r = mysql_query("SELECT username FROM user WHERE signup_date >= '$today'");

The difference between the two SQL statements above is CURDATE(): the MySQL query cache does not work for queries that call this function. The same goes for other SQL functions such as NOW() and RAND(), because their return values change from call to call. So all you need to do is replace the MySQL function with a variable, and the cache can be used.
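Since these notes are for Java interviews, here is a minimal sketch of the same idea in Java. The class and method names (CacheFriendlyQuery, signupsSince) are made up for illustration. The only point is that the date is computed in the application and embedded as a literal, so the SQL text contains no CURDATE() call. The value comes from a LocalDate, not from user input, so building the string this way is acceptable here.

```java
import java.time.LocalDate;

// Hypothetical helper: builds the "recent signups" query with a literal date
// instead of CURDATE(), so identical query text is sent for the whole day.
class CacheFriendlyQuery {

    static String signupsSince(LocalDate day) {
        // LocalDate.toString() produces ISO format (yyyy-MM-dd), which MySQL accepts as a date literal.
        return "SELECT username FROM user WHERE signup_date >= '" + day + "'";
    }

    public static void main(String[] args) {
        // The date is resolved here, in application code, not by the database.
        System.out.println(signupsSince(LocalDate.now()));
    }
}
```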

2. EXPLAIN your SELECT query
The EXPLAIN keyword shows you how MySQL executes your SQL statement. This helps you analyze performance bottlenecks in your query or table structure.

The EXPLAIN output also tells you how your indexes and primary keys are used, how your tables are searched and sorted, and so on.

3. Use LIMIT 1 when you need only one row
Sometimes when you query a table, you already know that there will be only one result, but you may still need to fetch a cursor, or you may want to check the number of records returned.

In this case, adding LIMIT 1 can increase performance. The MySQL database engine stops searching after it finds the first matching record, instead of continuing to look for further matches.

The following example only checks whether there are any users from "China". The second query is obviously more efficient than the first. (Note that the first uses SELECT * and the second uses SELECT 1.)

    // Inefficient:
    $r = mysql_query("SELECT * FROM user WHERE country = 'China'");
    if (mysql_num_rows($r) > 0) {
        ...
    }

    // Efficient:
    $r = mysql_query("SELECT 1 FROM user WHERE country = 'China' LIMIT 1");
    if (mysql_num_rows($r) > 0) {
        ...
    }

4. Index the search fields
See Java Interview Preparation 15: Database--Index.

5. Use columns of the same type for joins, and index them
If your application has many JOIN queries, make sure that the join fields in both tables are indexed. MySQL can then use its internal mechanisms to optimize the JOIN.

The fields used for the join should also be of the same type. For example, if you join a DECIMAL field with an INT field, MySQL cannot use their indexes. For string types, the character sets must match as well. (Two tables may have different character sets.)

6. Never ORDER BY RAND()
If you really want to shuffle the rows you return, there are many ways to do it. ORDER BY RAND() is the one to avoid, because its cost grows sharply as the table grows. The problem is that MySQL has to execute the RAND() function (which consumes CPU time) for every row, and then sort the rows. Even LIMIT 1 does not help, because the sort still has to happen.

The following example picks one record at random:

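    -- Never do this
    SELECT * FROM actor ORDER BY RAND() LIMIT 1;  -- 0.252s

    -- This is better
    SELECT COUNT(*) FROM actor;
    -- Then, in application code, pick a random number rand between 0 and COUNT(*) - 1,
    -- and use it as the offset in the following query
    SELECT * FROM actor LIMIT rand, 1;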
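The application-side step can be sketched in Java roughly as follows. This is only an illustration: RandomRowQuery and randomRowSql are hypothetical names, and rowCount is assumed to be the value already read from SELECT COUNT(*) FROM actor.

```java
import java.util.Random;

// Hypothetical helper: turns a known row count into a "random offset" query,
// so the database never has to call RAND() for every row and sort the result.
class RandomRowQuery {

    static String randomRowSql(long rowCount, Random rng) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("table is empty");
        }
        // nextDouble() is in [0, 1), so the offset falls in [0, rowCount - 1].
        long offset = (long) (rng.nextDouble() * rowCount);
        return "SELECT * FROM actor LIMIT " + offset + ", 1";
    }

    public static void main(String[] args) {
        // 200 stands in for the result of SELECT COUNT(*) FROM actor.
        System.out.println(randomRowSql(200, new Random()));
    }
}
```

Passing the Random in as a parameter keeps the method easy to test with a fixed seed.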

7. Avoid SELECT *

    • The more data you read from the database, the slower the query.
    • If the database server and the web server are two separate machines, it also increases the network transfer load.
      So you should get into the habit of selecting only the data you need.

8. Always set a primary key for each table
We should give every table in the database an ID as its primary key, preferably of type INT (UNSIGNED is recommended), with the AUTO_INCREMENT flag set so that it increases automatically.

Even if your users table has a unique field called "email", you should not make it the primary key. Using a VARCHAR column as the primary key degrades performance. In addition, your program should use the table's primary key to build its data structures.

Also, some operations in the MySQL engine rely on the primary key, such as clustering and partitioning. In those cases the performance and definition of the primary key become very important.

There is only one exception: the "association table", whose primary key is made up of the primary keys of several other tables. We call these "foreign keys". For example, a "student" table has a student ID and a "course" table has a course ID. The "score" table is then an association table that links students and courses. In the score table, student ID and course ID are foreign keys, and together they form the primary key.

9. Use ENUM instead of VARCHAR
The ENUM type is very fast and compact. Internally it is stored like a TINYINT, but it appears as a string. This makes it a very good fit for fields that hold a list of options.

If you have a field such as "gender", "country", "nation", "state" or "department", and you know that its values are limited and fixed, you should use ENUM instead of VARCHAR.

MySQL can also give you suggestions (see item 10) on how to reorganize your table structure. When you have a VARCHAR field, it may suggest changing it to an ENUM type. You get these suggestions with PROCEDURE ANALYSE().

10. Get advice from PROCEDURE ANALYSE()
PROCEDURE ANALYSE() makes MySQL analyze your fields and their actual data, and gives you some useful advice. The suggestions are only useful if the table contains real data, because decisions like these need data as a basis.

For example, if you create an INT field as your primary key but the table holds little data, PROCEDURE ANALYSE() may suggest changing the field type to MEDIUMINT. Or, if you use a VARCHAR field with little data in it, you may get a suggestion to change it to an ENUM. Such suggestions may simply result from having too little data, so they are not always accurate.

Keep in mind that the recommendations only become accurate as the amount of data in your table grows, and that you are the one who makes the final decision.

11. Use NOT NULL where possible
Unless you have a very specific reason to use NULL values, you should always declare your fields NOT NULL. This may seem a bit controversial, so read on.

First, ask yourself how much difference there is between "empty" and NULL (for an INT, that is 0 versus NULL). If you feel there is no difference between them, then you should not use NULL. (Did you know that in Oracle, NULL and the empty string are the same?)

Do not assume that NULL takes no space: it requires extra space, and comparisons against nullable fields make your program more complex. Of course, this does not mean that you can never use NULL. Reality is complicated, and there will still be cases where you need a NULL value.

12. PreparedStatement
A PreparedStatement is much like a stored procedure: a set of SQL statements that runs on the server side. Using PreparedStatement brings many benefits, for both performance and security.

A PreparedStatement checks the variables you bind to it, which protects your program from "SQL injection" attacks. Of course, you can also check these variables by hand, but manual checks are error-prone and programmers often forget them. Using a framework or an ORM makes this problem less likely.

In terms of performance, a PreparedStatement gives a considerable advantage when the same query is executed many times. You define parameters for the statement, and MySQL parses it only once.

13. Unbuffered queries
Normally, when you execute an SQL statement in your script, your program stops and waits until the statement returns, and only then continues. You can use unbuffered queries to change this behavior. This is especially useful for queries that produce many results: you do not have to wait until all the results have been returned, and can start processing as soon as the first row arrives.

14. Store the IP address as UNSIGNED INT
Many programmers create a VARCHAR(15) field to store an IP address as a string rather than as an integer. If you store it as an integer, you need only 4 bytes, and you get a fixed-length field. It also makes queries easier, especially when you need a WHERE condition such as: ip BETWEEN ip1 AND ip2.

We must use UNSIGNED INT because IP addresses use the full range of a 32-bit unsigned integer.

In your queries, you can use INET_ATON() to convert a string IP into an integer, and INET_NTOA() to convert an integer back into a string IP, as follows:

    -- Convert a string IP to an integer
    select INET_ATON('127.0.0.1') from dual;

    -- Convert an integer to a string IP
    select INET_NTOA(2130706433) from dual;
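If the conversion has to happen in application code instead, it can be sketched in Java as below. IpCodec and its method names are made up for illustration. Java has no unsigned 32-bit type, so this sketch holds the value in a long.

```java
// Hypothetical helper mirroring what INET_ATON() and INET_NTOA() compute for IPv4.
class IpCodec {

    // "127.0.0.1" -> 2130706433 (each octet is one byte of the 32-bit value)
    static long toUnsignedInt(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // 2130706433 -> "127.0.0.1"
    static String toDotted(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
                + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedInt("127.0.0.1")); // prints 2130706433
        System.out.println(toDotted(2130706433L));      // prints 127.0.0.1
    }
}
```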

15. Fixed-length tables are faster
If all the fields in a table are fixed-length, the entire table is considered "static" or "fixed-length". That is, the table has no fields of the following types: VARCHAR, TEXT, BLOB. As soon as the table includes one of these fields, it is no longer a fixed-length static table, and the MySQL engine handles it in a different way.

Fixed-length tables improve performance because MySQL can search them faster: with fixed lengths, the offset of the next row is easy to calculate, so reads are naturally fast. If the rows are not fixed-length, finding the next row requires looking up the primary key each time.

Fixed-length tables are also easier to cache and rebuild. The only side effect is that fixed-length fields waste some space, because the full space is allocated whether you use it or not.

Using the "vertical split" technique (see the next item), you can split your table into two: one that is fixed-length and one that is not.

16. Vertical split
A "vertical split" turns one table in the database into several tables by column. This reduces the complexity of the table and the number of fields, for optimization purposes.

Example one:
The user table has a home address field. It is optional, and apart from editing personal information, you rarely need to read or write it. So why not put this field in another table? Your main table will perform better.

Example two:
You have a field called "last_login" that is updated every time a user logs in. However, each update clears the query cache for that table. So you can move this field to another table, where it no longer affects the constant reads of user ID, user name, and user role, and the query cache can give you a large performance gain.

Note also that you should not routinely join the tables produced by the split. Otherwise performance will be worse than without the split, and considerably so.

17. Splitting a large DELETE or INSERT statement
If you need to run a large DELETE or INSERT query on a live website, be very careful that it does not stop your entire site from responding. Both operations lock the table, and while the table is locked, no other operation can get in.

Apache runs many child processes or threads, which is why it works quite efficiently. But our servers should not have too many child processes, threads, and database connections, since these consume a great deal of server resources, especially memory.

If you lock your table for a period of time, say 30 seconds, then on a high-traffic site the processes/threads, database connections, and open files that accumulate during those 30 seconds may not only crash your web service, but may also bring down your entire server.

So if you have a large operation, you should split it up. Using a LIMIT condition is a good way to do that. Here is an example:

    while (true) {
        // Delete only 1000 rows at a time
        mysql_query("DELETE FROM logs WHERE log_date <= '2009-11-01' LIMIT 1000");
        if (mysql_affected_rows() == 0) {
            // Nothing left to delete, exit
            break;
        }
        usleep(5000);
    }
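The same loop can be sketched in Java. To keep the sketch self-contained, the database call is passed in as a callback: deleteBatch is a hypothetical stand-in for something like a JDBC Statement.executeUpdate("DELETE ... LIMIT 1000") call, which returns the number of affected rows. BatchedDelete and deleteInBatches are made-up names.

```java
import java.util.function.IntSupplier;

// Hypothetical helper: repeats a small DELETE until nothing is left,
// pausing between batches so other queries can get at the table.
class BatchedDelete {

    // deleteBatch runs one limited DELETE and returns the number of rows it removed.
    static long deleteInBatches(IntSupplier deleteBatch, long pauseMillis) {
        long total = 0;
        while (true) {
            int affected = deleteBatch.getAsInt();
            if (affected == 0) {
                break; // nothing left to delete
            }
            total += affected;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Simulated table with 2500 matching rows, deleted 1000 at a time.
        int[] remaining = {2500};
        long deleted = deleteInBatches(() -> {
            int n = Math.min(1000, remaining[0]);
            remaining[0] -= n;
            return n;
        }, 5);
        System.out.println("deleted " + deleted + " rows"); // prints "deleted 2500 rows"
    }
}
```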

18. The smaller the column the faster
For most database engines, hard disk operations can be the biggest bottleneck. So keeping your data compact is very helpful, because it reduces disk access.

Refer to the Storage Requirements section of the MySQL documentation to see all the data types.

If a table will only ever have a few rows (for example, a dictionary table or a configuration table), there is no reason to use INT for the primary key. MEDIUMINT, SMALLINT, or the even smaller TINYINT will be more economical. If you do not need to record the time of day, DATE is much better than DATETIME.

Of course, you also need to leave enough room for growth, otherwise changing it later will be very painful. See the Slashdot example, where a simple ALTER TABLE statement took 3 hours because the table held 16 million rows.

19. Choose the right storage engine
MySQL has two main storage engines, MyISAM and InnoDB, and each has pros and cons.

MyISAM suits applications that run a large number of queries, but it does not handle heavy write loads well. Even if you only update a single field, the entire table is locked, and other processes cannot even read from it until the update is complete. On the other hand, MyISAM is extremely fast for SELECT COUNT(*) operations.

InnoDB tends to be a much more complex storage engine, and for some small applications it is slower than MyISAM. It supports row-level locking, so it performs better when there are many writes. It also supports more advanced features, such as transactions.

20. Use an object-relational mapper (ORM)
With an ORM, you can obtain reliable performance gains. Everything an ORM does can also be written by hand, but that requires a senior expert.

The most important thing about an ORM is "lazy loading": values are only fetched when they are actually needed. But you also need to be careful about the side effects of this mechanism, because it can degrade performance by creating many, many small queries. (I don't understand that last sentence.)

An ORM can also bundle your SQL statements into a single transaction, which is much faster than executing them individually. (What does this mean?)

21. Be careful with persistent connections
The purpose of a persistent connection is to reduce the number of times a MySQL connection has to be re-created. Once a connection is created, it stays open even after the database operations have finished. And since Apache reuses its child processes, the next HTTP request handled by the same child process reuses the same MySQL connection.

In theory, this sounds very good. But from personal experience (and that of most people), this feature creates more trouble than it saves: you only have a limited number of connections, and you run into memory problems, file handle limits, and so on.

Moreover, Apache runs in a highly parallel environment and creates a lot of processes, which is why the persistent connection mechanism does not work well there. Before you decide to use persistent connections, think about the architecture of your entire system.
