Benchmarking and Profiling to Identify MySQL Bottlenecks


Sometimes we need to optimize MySQL. But what should we change? A particular query? The database schema? The server hardware? The only way to know is to measure what your system is doing and how it performs under various conditions. That is what we will learn below.

The best strategy is to find and strengthen the weakest link in your application's chain of components. This is especially useful when you don't know what limits performance now, or what will limit it in the future.

Benchmarking and profiling are the two basic ways to identify bottlenecks. They are related, but they are not the same. A benchmark measures your system's performance. It can help determine a system's capacity, show you which changes matter and which don't, or show how your application performs with different data.

Profiling, in contrast, helps you find where your application spends the most time or consumes the most resources. In other words, a benchmark answers "How does this perform?", while profiling answers "Why does it perform the way it does?"

This chapter covers both topics: benchmarking and profiling. We begin with the reasons for benchmarking and the strategies behind it, and then move on to specific benchmarks. We show how to plan and design benchmarks, obtain accurate results, run the benchmarks, and analyze the results. Finally, we look at benchmarking tools and examples of how to use them.

The remaining sections describe how to profile applications and MySQL. We show detailed examples of profiling we have used in production to help analyze application performance, along with real optimizations. We also show how to log MySQL queries, analyze those logs, and use MySQL's status counters and other tools to see what MySQL and your queries are doing.

1. Why benchmark?

Many medium and large MySQL deployments have staff dedicated to benchmarking. However, every developer and DBA should be familiar with basic benchmarking principles and practices, because they are very useful. Here are some things benchmarking can help you do:

  • Measure how your application currently performs. If you don't know how fast it runs now, you can't be sure which changes help. You can also use historical benchmark results to diagnose problems you didn't foresee.
  • Validate your system's scalability. You can use a benchmark to simulate a much higher load than your production environment handles, such as a hundredfold increase in users.
  • Plan for growth. Benchmarks help you estimate how much hardware, network capacity, and other resources you will need for your projected future load. This reduces risk during upgrades or major application changes.
  • Test your application's ability to tolerate a changing environment. For example, you can find out how the application performs during a sporadic peak in concurrency, with a different server configuration, or with a different data distribution.
  • Test different hardware, software, and operating system configurations. Is RAID 5 or RAID 10 better for your system? How does random write performance change when you switch from ATA disks to SAN storage? Does the 2.6 Linux kernel scale better than 2.4? Does upgrading MySQL improve performance? How does a different storage engine handle your data? You can design benchmarks to answer all of these questions.

You can also use benchmarks for other purposes, such as creating a unit test suite for your application, but here we focus only on the performance-related uses.

2. Benchmarking strategies

There are two basic benchmarking strategies: benchmark the application as a whole, or isolate MySQL and benchmark it alone. These are known as full-stack and single-component benchmarking, respectively. There are several good reasons to test the entire application rather than just MySQL:

  • You are testing the entire application, including the web server, application code, and database. This is useful because you care about the whole application's performance, not just MySQL's.
  • MySQL is not always the application's bottleneck, and a full-stack benchmark can prove that.
  • Only by testing the full application can you see how each part's cache behaves.
  • Full-stack benchmarks are good in the sense that they reveal the application's real behavior, which is hard to see when you benchmark components in isolation.

On the other hand, full-stack benchmarks are hard to create and even harder to set up correctly. If the benchmark is badly designed, you may draw the wrong conclusions, because the results will not reflect reality.

Sometimes, however, you don't need to understand the entire application; you may just want to focus on MySQL, at least initially. A MySQL-only benchmark is useful if:

  • You want to compare different schemas or queries.
  • You want to benchmark a specific problem you see in the application.
  • You want a short benchmark rather than a long one, to get a fast "cycle time" for making and measuring changes.

Benchmarking MySQL is most useful when you can replay your application's queries against a realistic dataset. The data itself and the size of the dataset both need to be realistic. If possible, use a snapshot of your production data.

Unfortunately, building a realistic benchmark is complex and time-consuming. If you can get a copy of the production dataset, you're lucky. Of course, that may be impossible. For example, you may be developing a new application with only a few users and little data, and if you want to know how it behaves when it grows much larger, you have no choice but to simulate the larger application's data and workload.

What should you measure?

You need to identify your goals before you start benchmarking, indeed, even before you design the benchmark. Your goals will determine the tools and techniques you use to get accurate, meaningful results. Frame your goals as questions, such as "Does this server have enough CPU capacity?" or "Are the new indexes faster than the current ones?"

It may not be obvious, but you need different approaches to measure different things. For example, latency and throughput call for different benchmarks.

Consider the following metrics and how they map to your performance goals:

Transactions per unit of time

This is the classic metric for benchmarking database applications. Standardized benchmarks such as TPC-C (see http://www.tpc.org) are widely cited, and many database vendors work very hard to do well on them. These benchmarks measure online transaction processing (OLTP) performance and are most suitable for multi-user interactive transactional applications. The usual unit of measurement is transactions per second (or sometimes transactions per minute).

The term throughput usually refers to transactions (or another unit of work) per unit of time.
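
As a rough illustration (a sketch, not a formal benchmark), you can estimate a running MySQL server's query throughput from its own cumulative status counters by sampling them twice; the counter values below are made up:

    -- Questions counts every statement sent by clients since startup.
    SHOW GLOBAL STATUS LIKE 'Questions';   -- e.g. returns 1500000
    -- ... wait 60 seconds, then sample again ...
    SHOW GLOBAL STATUS LIKE 'Questions';   -- e.g. returns 1512000
    -- Throughput is the difference divided by the elapsed time:
    -- (1512000 - 1500000) / 60 = 200 queries per second.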

Response time or latency

This measures the total time a task requires. Depending on your application, you might measure in milliseconds, seconds, or minutes. From the raw response times you can derive aggregates such as the average, minimum, and maximum response times.

The maximum response time is rarely useful on its own, because the longer a benchmark runs, the higher the maximum is likely to be. It also isn't repeatable, and can vary widely between runs. For this reason, many people use percentile response times instead. For example, if the 95th-percentile response time is 5 ms, you know the task finishes in 5 ms or less 95% of the time.
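
As a minimal sketch of computing a percentile, assuming you have loaded each task's measured latency into a hypothetical bench_results table with a latency_ms column, MySQL can pick out an approximate 95th-percentile value by sorting and offsetting:

    -- Approximate 95th-percentile latency from a hypothetical results table.
    SET @n   = (SELECT COUNT(*) FROM bench_results);
    SET @off = GREATEST(FLOOR(@n * 0.95) - 1, 0);
    PREPARE pct FROM
        'SELECT latency_ms FROM bench_results ORDER BY latency_ms LIMIT 1 OFFSET ?';
    EXECUTE pct USING @off;
    DEALLOCATE PREPARE pct;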

It is very helpful to graph the benchmark results, either as a line chart (for example, of the average and the 95th percentile) or as a scatter plot, because graphs reveal how the results are distributed. They also show how the benchmark behaves over a long run.

Suppose your system runs a checkpoint for one minute every hour, and during the checkpoint the system stalls and completes no transactions. The 95th-percentile response time will not show the spikes, so the results will hide the problem. A graph, however, will show the periodic spikes in response time. Figure 2-1 illustrates this.

Figure 2-1 plots transactions per minute. The line shows pronounced deviations from the average. The first anomaly occurs because the server's caches are cold; the others show where the server spends time flushing dirty pages to disk. Without a graph, these variations are hard to see.

Scalability

Scalability measurements are useful for systems that need to maintain performance under a changing workload.

"Maintaining performance under a changing workload" is a very abstract concept. Performance can be measured, such as throughput and response time. The workload may vary with the database size, current connections, or hardware.

Scalability measurements are good for capacity planning, because they can reveal weaknesses in your application that other benchmark strategies will not show.

Figure 2-1: Results of a 30-minute benchmark run

For example, if you design a response-time benchmark that uses only a single connection (a poor strategy), the system may look fine even though your application performs badly at every realistic level of concurrency. A benchmark that watches response time as the number of connections steadily increases will expose such design flaws.

Some activities, such as periodic batch jobs that roll up granular data into summary tables, just need fast response times, and it is fine to benchmark their response time alone. But you should also pay attention to how they interact with other activities: batch jobs can degrade the performance of interactive queries, and vice versa.

Concurrency

Concurrency is an important metric, but it is often misused and mismeasured. For example, it is popular to quote how many users are browsing a website at the same time. However, HTTP is stateless, and most of those users are simply reading what is shown in their browsers, which does not translate into concurrency on the web server. Likewise, web server concurrency does not necessarily translate into database server concurrency; the only thing it directly relates to is how much data your session storage mechanism must handle. A more accurate way to measure web server concurrency is the number of requests per second at peak time.

You can also measure concurrency at different places in the application. Higher concurrency on the web server generally causes higher concurrency at the database level, but the language and toolset influence this. For example, Java with a connection pool will typically open fewer concurrent connections to the MySQL server than PHP with persistent connections.

What matters even more is the number of queries executing concurrently at a given moment. A well-designed application might hold hundreds of connections open to the MySQL server, but only a handful of them should be running queries at the same time. Thus, a website with "50,000 users online at the same time" might need only 10 to 15 queries running simultaneously on the MySQL server.
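
You can see this distinction directly on a live server: MySQL's Threads_connected status counter shows open connections, while Threads_running shows how many of them are actually executing a statement at that instant (the example values below are made up):

    -- Open connections vs. statements executing right now.
    SHOW GLOBAL STATUS LIKE 'Threads_connected';  -- e.g. 480
    SHOW GLOBAL STATUS LIKE 'Threads_running';    -- e.g. 12
    -- To inspect what the running threads are doing:
    SHOW FULL PROCESSLIST;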

In other words, the benchmark you really care about is working concurrency: the number of threads or connections doing work at the same time. Measure how performance changes as this working concurrency increases; if it degrades sharply, your application probably won't be able to handle peak loads.

You also need to make sure performance doesn't degrade too quickly under high concurrency, or else design the application so it never generates more concurrency than the MySQL server can handle, for example by queueing requests in the application.

Concurrency is fundamentally different from response time and scalability: it is not so much a result as a property of how you set up the benchmark. Rather than measuring the concurrency your application achieves, you measure the application's performance at various levels of concurrency.

In the final analysis, you should benchmark whatever matters to your users. Benchmarks measure performance, but "performance" means different things to different people. Gather requirements, formally or informally, about how the system should scale, what response times are acceptable, what kinds of concurrency you expect, and so on. Then try to design your benchmarks to account for all of those requirements, without the tunnel vision of a "frog at the bottom of a well" that focuses on some things and excludes everything else.

3. Benchmarking tactics

With this general understanding in place, let's turn to how to design and execute benchmarks. Before discussing how to do a benchmark well, it helps to look at some common mistakes that lead to unusable or inaccurate results:

  • Using a subset of the real data size, such as 1 GB of data when the application will have to handle hundreds of gigabytes, or using the current dataset when you plan for the application to grow much larger.
  • Using incorrectly distributed data, such as uniformly distributed data when the real system's data has "hot spots". (Randomly generated data is often unrealistically distributed.)
  • Using unrealistically distributed parameters, such as pretending that all user profiles are equally likely to be viewed.
  • Using a single-user scenario for a multi-user application.
  • Benchmarking a distributed application on a single server.
  • Failing to match real user behavior, such as "think time" on a web page. Real users request a page and then read it; they don't click links one after another without pausing.
  • Running identical queries in a loop. Real queries aren't identical, so they cause cache misses; identical queries will be fully or partially cached at some level.
  • Failing to check for errors. If a benchmark's results don't make sense (for example, if a slow operation suddenly completes very quickly), check for errors. You might just be benchmarking how quickly MySQL can detect a syntax error in a SQL query! As a matter of principle, always check the error logs after each benchmark run.
  • Ignoring how the system performs when it's not warmed up, such as right after a restart. Sometimes you need to know how long the server takes to reach full capacity after a restart, in which case you will specifically want to observe the warm-up period. Conversely, if you intend to study normal performance and the benchmark starts right after a restart, the caches will be cold and the results won't reflect how the system performs under load once the caches are warm (see the buffer pool sketch below).
  • Using the server's default configuration settings.

Simply avoiding these mistakes will take you a long way toward improving the quality of your results.
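
One rough way to judge whether an InnoDB server's caches have warmed up (a sketch, not a substitute for watching throughput stabilize) is to compare how many buffer pool read requests are being satisfied from memory versus from disk:

    -- On a warm server, Innodb_buffer_pool_reads (reads that went to disk)
    -- grows far more slowly than Innodb_buffer_pool_read_requests
    -- (logical reads). Sample both counters over time and watch the ratio.
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';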

All other things being equal, you should make the benchmark as realistic as you can, although it is sometimes wise to use a slightly unrealistic benchmark. For example, suppose the application runs on a different host from the database server. Reproducing that configuration would be more realistic, but it adds more variables, such as how fast the network is and how loaded it is. Benchmarking on a single node is usually simpler, and in some cases the result is accurate enough. You will have to use your own judgment about when this is appropriate.

Designing and planning benchmarks

The first step in planning a benchmark is to identify the problem and the goal. Then decide whether to use a standard benchmark or to design your own.

If you use a standard benchmark, be sure the one you choose matches your needs. For example, don't use TPC-H to benchmark an e-commerce system: TPC itself describes TPC-H as a decision-support benchmark, so it is not an appropriate benchmark for an OLTP system.

Designing your own benchmark is a complicated, iterative process. To get started, take a snapshot of your production dataset, and make sure you can restore that snapshot for subsequent runs.

Next, you need queries to run against the data. You can turn a unit test suite into a crude benchmark by running it many times, but that is unlikely to match how you really use the database. A better approach is to log all queries on your production system during a representative time frame, such as one hour during peak load or an entire day. If you log queries during a small time frame, you may need to choose several time frames, so that you cover all of the system's activity, such as weekly reporting queries or batch jobs that run during off-peak hours.

You can log queries at several levels. For example, for a full-stack benchmark you can log the HTTP requests on the web server, or you can enable MySQL's query log. If you replay the query log, however, be sure to recreate separate threads rather than replaying every query linearly in a single thread. It is also important to create a separate thread for each connection that appears in the log, to avoid introducing blocking between queries that did not block each other in production; the query log records which connection ran each query.
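
As a minimal sketch of the capture step, you can turn on MySQL's general query log at runtime (bearing in mind the logging overhead on a busy server); alternatively, setting long_query_time to 0 makes the slow query log record every statement. The file path shown is illustrative:

    -- Option 1: the general query log records every statement as it arrives.
    SET GLOBAL general_log_file = '/var/log/mysql/capture.log';
    SET GLOBAL general_log = 'ON';

    -- Option 2: route everything through the slow query log instead.
    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL long_query_time = 0;  -- zero seconds: log all queries
                                     -- (applies to sessions opened after the change)

    -- After capturing a representative window, turn logging back off.
    SET GLOBAL general_log = 'OFF';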

Even if you don't build your own benchmark, you should write down a benchmarking plan. You will run the benchmark many times, and you need to be able to reproduce it exactly. Plan for the future, too: you may not be the person who runs the next round of tests, and even if you are, you may not remember exactly how you ran it the first time. The plan should include the test data, the steps taken to set up the system, and the warm-up plan.

Design some way to standardize parameters and results, and document every run in detail. The documentation can be as simple as a spreadsheet or a notebook, or as complex as a custom-designed database. Keep in mind that you will often need to write scripts to help analyze the results, so the easier it is to process the results without opening spreadsheets and text files by hand, the better.

You may find it useful to create a benchmark directory with subdirectories for each run's results. In each subdirectory, put the results, configuration files, and notes for that run. If the benchmark yields more data than you think you're interested in, record the extra data anyway: it is far better to have unneeded data than to miss important data, and you may find the extra data useful in the future. Try to record as much extra information as you can during the runs, such as CPU usage, disk I/O, network traffic statistics, and counters from SHOW GLOBAL STATUS.
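
For example (a sketch of the kind of extra data worth keeping), you can snapshot MySQL's global counters at the start and end of each run and diff them later; in MySQL 5.7 and newer the same counters can also be queried from performance_schema:

    -- Snapshot every server counter at the start and end of a run.
    SHOW GLOBAL STATUS;

    -- Or pull just the counters you care about (MySQL 5.7+):
    SELECT VARIABLE_NAME, VARIABLE_VALUE
    FROM performance_schema.global_status
    WHERE VARIABLE_NAME IN ('Questions', 'Threads_running',
                            'Innodb_buffer_pool_reads', 'Innodb_data_written');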

Getting accurate results

The best way to get accurate results is to design the benchmark to answer the question you actually want answered. Have you chosen the right benchmark? Are you capturing the data you need to answer the question? Are you benchmarking by the wrong criteria? For example, are you running a CPU-bound benchmark to predict the performance of an I/O-bound application?

Next, make sure the benchmark results are repeatable. Try to ensure the system is in the same state at the start of every run. If the benchmark matters, reboot the system between runs. If you need to benchmark against a warmed-up server, you should also ensure that the warm-up is long enough and that it is repeatable. If the warm-up consists of random queries, for example, the results will not be repeatable.

If the benchmark changes the data or schema, reset them from a fresh snapshot between runs. Inserting into a table with a thousand rows will not give the same results as inserting into a table with a million rows. The fragmentation and physical layout of data on disk can also make results non-repeatable; one way to keep the physical layout similar is to quick-format and file-copy a partition.
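
As a minimal sketch of the reset step, assuming the snapshot was saved as a SQL dump named snapshot.sql (a hypothetical file name), each run can begin by rebuilding the benchmark database from it:

    -- Rebuild the benchmark database from a clean snapshot before each run.
    DROP DATABASE IF EXISTS bench;
    CREATE DATABASE bench;
    USE bench;
    SOURCE /path/to/snapshot.sql;  -- SOURCE is a mysql client command; path is illustrative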

Watch out for external load, profiling and monitoring systems, verbose logging, periodic jobs, and other factors that can skew your results.

 
