MySQL Optimization considerations

Source: Internet
Author: User
In the operation of the database, we need to optimize MySQL processing, this article is about the optimization considerations.

1th, the hardware is too old
Hardware We here mainly from the CPU, memory, disk three aspects, there are a number of factors such as network card, computer room network, such as the article length of the relationship, not one after another, there are opportunities to chat.
First, let's look at MySQL's use of CPU:
5.1 can take advantage of 4 cores, 5.5 can be used to 24 cores, 5.6 can take advantage of 64 cores
For example, MySQL5.6 can use more than 48 cores, run well, 64 cores can be used (between 48core-64core, officially released 48 core, I actually test can run to 64 core).
MySQL 5.6 can be used to core+
*mysql up to 4 cores in front of 5.1 * *
Now the general production environment server, are more than 32CORE.
So I recommend that you try to use MySQL5.5 or MySQL5.6, unless your company's server has been using a very old server, only 4 cores, or 1 cores.
Because 5.1 before (5.0) are written in the internal code is dead, is based on the Innobase storage engine, the database of hardware utilization is poor. Later evolved to innodb the engine, a lot better.
Each connection is a thread (not a thread pool), and each query can use only one kernel
In addition, you can use only one CPU per query in MySQL.
Oracle uses parallel SQL, parallel query, such functions in MySQL does not exist.
No execution plan cache (no SQL execution plan precompilation)
Second, there is no SQL precompilation inside MySQL. Therefore, there is no such structure as the library cache in Oracle memory architecture. So, MySQL only hard parsing, there is no soft parsing, there is nothing soft and soft parsing.
MySQL performance drops as the number of connections goes up
This is also a problem with MySQL, but as the MySQL version evolves, there are a lot of workarounds.
For example: The official thread pool (thread pool), referred to as TP. is to solve the problem of excessive number of concurrent connections, but this is the additional MySQL components, the official TP is required to purchase additional money.
In addition, there is a domestic called Lou Fangxin, developed a onesql middleware, but also to solve similar problems.
There is a result cache, but more chicken
MySQL also has a similar result cache in Oracle, called query cache, but it is a relatively chicken function, rarely used.
Because most of the actual production environment is OLTP system, there are frequent updates and modifications, this query cache is used in the environment of frequent updating of data, the performance of MySQL will be severely degraded, so it is seldom used.
Now with MySQL, the basic use of InnoDB storage engine, the previous MyISAM these engines have been used very little. (What is a storage engine?) If you don't know, you can GG.
InnoDB engine is absolutely no need to open this query cache, because itself is a transaction-type storage engine, with InnoDB is to use its transaction processing power, will certainly occur frequent data updates and modifications.
Look again at MySQL's memory utilization features
64-bit operating system server available memory ((2^64-1)/1024/1024/1024) G
In a high-speed concurrency environment, the basic memory cache is to reduce the IO impact on the disk
Usually the memory is planned according to the 15%-20% of the actual data, and if the data is particularly hot, a larger scale is needed to cache the data
This 15%-20% data is often called thermal data. (This is also a common experience value)
For example, if you estimate that the total amount of MySQL data is about 500G, then the memory MySQL will give is probably 75G (5000.15), then you may need a 128G of memory server.
In addition, some businesses will have a particularly hot, large number of hot data (far beyond the 15%-20% this interval is also possible), such as: QQ Farm.
I believe everyone has played the previous kind of stealing food games, QQ Farm, Happy Farm and so on. (There are also 12306 sites for booking).
This kind of business in our industry is a high degree of concern, the characteristics of such business, data hot, the basic 100% are hot data, such as: QQ farm everyone play, every day to play, every minute to steal the dish, a lot of people get up in the middle of the toilet to steal a dish.
So the MySQL database of this kind of business, the memory equip also has to heighten. 15–20% is not enough.
Summary: * * * general business 15%-20% to plan hot data, such as: User Center, orders and other common business. Other special points of the business, the specific situation of specific analysis.
Can be assigned according to query response time.
We are doing this large-scale online architecture-large database planning and design,
The response time of SQL queries is also a very important indicator.
In this large-scale system, to carry millions of or even tens of thousands of users at the same time online business, SQL query response time must be strict control, you have to control the system query response time within how much time.
For example, in our core library, I asked query response time (average response) to be below 30ms. More than 30ms, we think this database can reach the load limit, need to expand the database.
In addition, the query response time for the long-term monitoring of the indicator.
This is the core library, if other less important auxiliary libraries, such as the library to put the log, or some performance requirements itself is not too high library, we can relax the query response time, relaxed to 1 seconds or 2 seconds.
The threshold for the query response time is determined based on the level of importance of the business.
This is an important guideline to plan your performance capacity based on query response time.
There are two types of capacity: performance capacity and space capacity. Space capacity is simple, that is, how much size data, a few t.
Performance capacity is more important, deciding whether to catch your business pressure and load.
We must remember: if you want to resist the business is millions other active users, not hundreds of users, performance is king, performance to meet the needs of the business is the most important.
You function again cow B, the product is good, performance resistance, the other is the tear, hundreds of w people may be in a few seconds to put your entire system and project all hang off, and then your company flying blind.
The painstaking user will also lose a lot of money, the loss will be heavy.
Performance is the basis. Performance is anti-live, and the whole architecture makes sense. Performance resistance, back to consider what high availability, these are useless.
MySQL's use of disk features
Binlog,redo log, undo log sequential Io
MySQL has a variety of IO types.
Binlog,redolog,undolog, these are sequential IO writes.
This kind of thing does not have much need to put on the SSD, in order to write on the traditional mechanical disk is also very fast, put on the SSD a bit throwaway, and SSD has write loss and write life problems, there is no need to put on the SSD. It's enough to put it on a traditional SAS disk. There's no need to put SSDs on.
SSDs are used to put datafile. Because most of the IO that occurs on datafile is random IO,SSD running random io is very advantageous. SSD SSD + traditional disk SAS disk with mixed storage. Also, do not use SSDs for backup disks.
datafile random io combined with sequential IO
Sequential Io is always faster. In the database design, deciding whether you are a DBA or an architect of cow B is to see if you can design a business as sequential IO and reduce random io. For example: A friend relationship business, design when you want a query to order IO to take out friends relationship, then how to design it?
In MySQL's InnoDB, we can take advantage of a feature of InnoDB: a clustered Index table. (like Oracle's IoT).
This feature allows the user's friend data to be aggregated as much as possible in a page or multiple adjacent page. When reading a sequential read IO can be done, performance greatly improved.
The Buddy relationship table is structured as follows (the InnoDB engine is the prerequisite table):
owner_id friend_id (friend ID)
The two fields above make a primary key, and the primary key of InnoDB is the clustered index, which can be done by reading the two fields in a certain order IO.
Before what database design book, always said, each table must add a self-increment of the primary key specification, in fact, the standard dead, the response is live, I above example of the friend relationship does not use the self-increasing primary key, but with business properties read and frequent two business fields as the primary key, but better performance.
Therefore, we learn, do not go to rote these books on what norms and regulations, but should really learn a thing of the principle, such as learning InnoDB internal principles, and then in the actual work, there is the principle of support, with the principle to extrapolate.
The principle of InnoDB is a great piece of knowledge, which requires a lot of learning. You can pay more attention to my public number, there will be innodb some articles launched.
The OLTP business needs more random IO
Memory can be used to cache, thus reducing random IO
OLAP Services need sequential IO more
Memory Cache does little work
MySQL5.6 is not supported to modify the page before, the default is 16K.
MySQL5.6 later can be changed, this parameter is innodb_page_size, but MySQL5.6 can only be modified to 8 K or 4 K, can not be adjusted to large, until MySQL5.7 above can be changed greatly 32K or 64K.
For OLAP systems, a larger page can help with performance improvements because OLAP systems are large queries that scan a lot of data.
2nd: Database design is not good
For example, with a lot of database features, such as trigger, partitioning, a lot of stored procedures, functions and so on.
What we often say, small and beautiful, means that simplicity is the best. You use all the functions of the database, the performance of the database will naturally be slowed down, may encounter the bug, the probability of the underlying failure is increased.
So we have to understand that a good database project design, is small and beautiful, refined and simple. In addition, the database is only part of the overall project, such as trigger, stored procedures can be implemented, in the overall project can also be used in the application code to complete.
So, we use MySQL, that is, to use its powerful places, such as tables, indexes, transactions, and not to have all its functions to be used.
In addition, sub-queries are not allowed in the main library of the production environment before MySQL5.6.
The performance of the subquery before MySQL5.6 is particularly poor. (syntax is supported, but SQL performance is very poor).
For example, if you are using Oracle to migrate Oracle to MySQL, we recommend that you use the MySQL5.6 version, MySQL5.6 Support and performance of the sub-query has been greatly improved.
MySQL5.6 Run query performance will be greatly improved.
3rd: The program is too bad to write

This estimated that when the DBA students should be experienced, small and medium-sized companies, the level of programmers are uneven.
In particular, many of the newly-introduced programmers (just graduated), more likely, these just-in-the-way programmers are still in the hands of some very demanding progress. That kind of environment under the development of procedures, it is difficult to think of rotten.
Of course, this does not blame our programmers, can not blame them.
Causes me the above phenomenon reason, mainly is the domestic development environment, also did not have the method, the development demand urgent (product daily urges live), the programmer is busy in the rush (long-term overtime), only busy and realizes the business procedure, does not have the time to optimize the procedure.
Of course, in this environment, it's a chance for our DBAs. Programmers write out of the rotten SQL, complex SQL, causing the system to slow down or even crash, and then we DBA, to these rotten SQL, slow SQL optimized transformation, the system returned to normal, and increasingly stable. It is also a matter of great accomplishment and respect by colleagues and leaders.
At the same time, DBAs can also strengthen the training of programmers, and enhance their ability to write good SQL quickly. Let them spend less time, also can write better performance, more smooth SQL statement. This can also relieve the DBA of the burden.
I personally prefer to talk to the programmer training, one exchange technology, all have the harvest, and secondly good relations, work on what will need to negotiate after the talk. It's better than having them eat.
We write about the program is too bad, mainly has the following several solutions direction:
For applications to use database connection pooling, especially in large, high-concurrency applications such as Java-based development, connection pooling must be used.
The benefit of using a connection pool is that you can limit the number of connections to the app, and you don't have to create each connection any more, and MySQL creates a connection with a higher cost because creating a new connection is equivalent to MySQL creating a thread.
As I mentioned just now, MySQL will decrease in performance as the number of connections increases.
Have written code of the classmate, should also know, in our General PC notebook (General 4CORE), you create 400 thread, each thread on the dry 1+1+1+1+. Simple live, and then sleep, you see your PC Card or not card. You'll find that your PC's CPU is running full. If you dare to create 600 thread, then your machine will be waiting for you to focus on the start. This is because the thread overhead, the CPU is already full.
Complex SQL statements
This is also said that the programmer writes SQL, generally have a lot of problems, they are too busy, not to consider the performance and operation of SQL. In some cases, the programmer splicing The SQL, directly can put the entire system to dry across.
Let me give you a simple example: we have an application that creates 10 connections to the database (maximum number of connections = 10), each of the 10 connections runs the same complex SQL at the same time, and executes this complex SQL at least 10 minutes, and the 10 connections can only execute this complex SQL within 10 minutes. All the rest of the SQL is blocked.
The majority of applications are unavailable for 10 minutes, right. And it could cause an avalanche, causing the system to crash.
The optimization of complex SQL is also a very important work for DBAs, which need to be monitored to find these complex SQL, slow SQL, rotten SQL, and then give the optimization recommendations to the programmer (DBA to perform performance comparison test), so that the programmer to change the code, in order to let the system really carefree run in parallel. Like a freeway without traffic jams.
That someone will ask, our company's programmer is the cow B, kill not to change the SQL code, kill also do not optimize, unable to communicate. So what do we do?
We still have a way, we can also build a dedicated from the library (slave library) to deal with, you change the library query, always can.
For example, our company's examples, our background out of the report system, is even the query from the library, not to connect the main library.
Invalid logic
Full table Scan
For example: Update t set a = a + 1; Forget the Where condition.
If you want your system to support millions other users online, then you have to join the SQL Audit system (SQL Review), to eliminate the invalid logic of SQL, and such a full-table scan of SQL.
SQL is approved by a DBA before it can be released online.
In addition, this large update SQL should be updated in batches, the large SQL task is split into small tasks to run. In MySQL, this is a special note.
Why do you want to update it in batches?
* * Cause 1. * * above, a MySQL query can only use a core. SQL transactions are too large and too complex to run for long enough to cause congestion.
Cause 2. Online environment, MySQL is generally the master/slave architecture, if master occurs 100W row of large update transactions, it is likely to cause slave card there, because Slave is a single-threaded structure, causing synchronization delay.
MySQL writes SQL, dries into small transaction SQL, executes quickly, and submits quickly. Let each query complete faster and let the connection be released faster.

This article explains the considerations of MySQL optimization, and more about it, please pay attention to the PHP Chinese web.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.