[Study Notes] High Performance MySQL (Third Edition), Chapter 2: MySQL Benchmarking

Benchmarking is a basic skill that both MySQL beginners and experts need to master. In short, a benchmark is a kind of stress test designed for a system. The usual goal is to understand how the system behaves, but there are other reasons as well, such as reproducing a particular system state or testing new hardware for reliability.

2.1 Why benchmark
Benchmarking is the only convenient and effective way to learn what happens to a system under a given workload. It lets you observe the system's behavior under different levels of pressure, evaluate the system's capacity, understand which changes matter, and see how the system handles different data. Benchmarks can be run in fictitious scenarios, outside the system's actual workload, and they can serve other purposes too, such as building a unit test suite for an application.
A major limitation of benchmarking is that it is not a real-world stress test: a benchmark can only give a rough estimate of the system's capacity margin. You can also run true stress tests (which are different from benchmarks), but then you must be especially careful about how you construct the data set and the load, and at that point it is no longer a benchmark. A benchmark should be as simple as possible, with results that are easy to compare, and it should be cheap and easy to run. Despite these limitations, benchmarking is very useful, as long as you understand the principles of the test and what the results actually represent.

2.2 Benchmarking strategies
There are two main strategies: test the application as a whole, or test MySQL in isolation. These two strategies are also known as full-stack and single-component benchmarks.

2.2.1 What to measure
Before implementing, or even designing, a benchmark, determine which metrics to test. The goal of the test determines which tools and techniques you should choose in order to obtain accurate and meaningful results.

Consider which of the following metrics meet your testing needs.

Throughput
Throughput is the number of transactions processed per unit of time. This has long been a classic metric for database applications. This kind of benchmark mainly targets the throughput of online transaction processing (OLTP) workloads and is well suited to multi-user interactive applications. The common unit of measurement is transactions per second (TPS); transactions per minute (TPM) is also used.

Response time or latency
This metric measures the total time a task requires. Depending on the application, the unit of measurement may be microseconds, milliseconds, seconds, or minutes. From the raw measurements you can derive the average, minimum, and maximum response times, as well as percentiles. The maximum response time is usually not very meaningful, because the longer the test runs, the larger the maximum is likely to become; in addition, it is not repeatable, and each run may produce a different maximum. Therefore the percentile response time is usually used instead of the maximum.
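As an illustration (not from the book), here is a minimal sketch of computing the average and 95th-percentile response time from a list of measured latencies; the sample values are invented.

```python
# Minimal sketch: average and 95th-percentile response time from raw samples.
# The latencies below are invented sample values, in milliseconds.
import math

latencies_ms = [12.4, 9.8, 15.1, 11.0, 230.5, 10.2, 13.7, 9.9, 14.3, 12.0]

def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

avg_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
print(f"average: {avg_ms:.1f} ms   95th percentile: {p95_ms:.1f} ms")
```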

Concurrency
Concurrency is an important metric that is often misunderstood and misused. For example, it is often reported as the number of users browsing a Web site at the same time, with the number of sessions as the measure. But HTTP is stateless, and most of those users are simply reading the page shown in their browser, which is not the same as the Web server's concurrency. Likewise, the Web server's concurrency is not the same as the database's; the session count only indicates how much data the session-storage mechanism must handle. A more accurate measure of Web server concurrency is the number of requests being processed at any given instant.
Concurrency can be measured at different layers of the application. Higher concurrency on the Web server generally produces higher concurrency at the database, although the language and toolset used by the application affect this. Be careful not to confuse open database connections with concurrency: a well-designed application may hold hundreds of open connections to the MySQL server while only a few of them are executing queries at any moment. So a Web site with tens of thousands of simultaneous users might generate only 10 to 15 concurrent requests to the MySQL database.
In other words, what a concurrency benchmark should measure is the number of operations in progress at the same time, or the number of threads or connections working simultaneously. When concurrency rises, check whether throughput drops and response time grows; if so, the application probably cannot handle peak load.
Concurrency is measured quite differently from response time and throughput: it is less a result than a property you set for the benchmark. A concurrency test usually does not measure the concurrency an application can reach, but rather the application's performance at various levels of concurrency. Database concurrency still needs to be measured, of course; for example, you can run sysbench with 32, 64, or 128 threads and record the value of MySQL's Threads_running status variable during the test.
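Not from the book: a minimal sketch of how Threads_running could be sampled during such a test, assuming the mysql-connector-python package and made-up local credentials.

```python
# Minimal sketch: sample Threads_running while a benchmark (e.g. sysbench) runs.
# Assumes the mysql-connector-python package; host and credentials are made up.
import time
import mysql.connector

conn = mysql.connector.connect(host="127.0.0.1", user="bench", password="secret")
cur = conn.cursor()

samples = []
for _ in range(60):                      # sample once per second for a minute
    cur.execute("SHOW GLOBAL STATUS LIKE 'Threads_running'")
    _, value = cur.fetchone()
    samples.append(int(value))
    time.sleep(1)

print("max Threads_running:", max(samples))
print("avg Threads_running:", sum(samples) / len(samples))
cur.close()
conn.close()
```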

Scalability
When the load on a system is expected to change, scalability needs to be tested. Put simply, an ideally scalable system gets twice the result (twice the throughput) when given twice the work, or achieves twice the throughput when its resources are doubled (for example, twice as many CPUs), while keeping performance (response time) within an acceptable range. Most systems cannot scale linearly like this; as the load changes, throughput and performance often degrade.
Scalability metrics are very useful for capacity planning: they provide information that other tests cannot, and help reveal application bottlenecks.
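As a made-up illustration (not from the book), the sketch below compares measured throughput against ideal linear scaling as the thread count doubles; all numbers are invented.

```python
# Minimal sketch: compare measured throughput with ideal linear scaling.
# The throughput numbers below are invented for illustration.
measured_tps = {1: 900, 2: 1700, 4: 3100, 8: 5200}   # threads -> transactions/sec

base_threads = 1
base_tps = measured_tps[base_threads]
for threads, tps in sorted(measured_tps.items()):
    ideal = base_tps * threads / base_threads
    efficiency = tps / ideal                 # 1.0 means perfectly linear scaling
    print(f"{threads:2d} threads: {tps:5d} TPS, scaling efficiency {efficiency:.2f}")
```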
In the end, test the metrics that matter most to your users. Gather requirements first as much as possible, for example what response time is acceptable and how much concurrency is expected, and then design the benchmark around them, so that you do not focus on some metrics while ignoring others.

2.3 Benchmarking methods

The following common mistakes can lead to useless or inaccurate results:
1. Using only a subset of the real data rather than the full data set.
2. Using data with an incorrect distribution.
3. Using unrealistically distributed parameters.
4. Running a single-user test for what is actually a multi-user application.
5. Benchmarking a distributed application on a single server.
6. Failing to match real user behavior.
7. Running the same queries over and over.
8. Failing to check for errors. Checking the error log after the benchmark completes should be a basic requirement (see the sketch after this list).
9. Ignoring the system's warm-up period.
10. Using the server's default configuration.
11. Running the test for too short a time.
All else being equal, strive to keep the test as close as possible to the way the real application will be used.
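For mistake 8, here is a minimal sketch (not from the book) of an after-the-run error-log check; the log path is a made-up example, so substitute your server's actual error-log location.

```python
# Minimal sketch: scan the MySQL error log for problems after a benchmark run.
# The log path below is a made-up example.
ERROR_LOG = "/var/log/mysql/error.log"

with open(ERROR_LOG, encoding="utf-8", errors="replace") as f:
    problems = [line.rstrip() for line in f if "[ERROR]" in line or "[Warning]" in line]

if problems:
    print(f"{len(problems)} suspicious lines found:")
    for line in problems[-20:]:          # show the most recent ones
        print(line)
else:
    print("no errors or warnings found in the error log")
```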

2.3.1 Designing and planning a benchmark
The first step is to ask questions and clarify the goal. Then decide whether to use a standard benchmark or to design a custom test. If you use a standard benchmark, make sure you choose an appropriate one. Designing a custom benchmark is complicated and usually iterative. Start by taking a snapshot of the production data set, one that can easily be restored for later runs. Next you need queries to run against that data. You can turn a unit-test suite into a rough benchmark by running it many times, but that will not match how the database is really used. A better approach is to choose a representative period of time and record all queries on the production system; if a single short period is not representative enough, record several different periods so that the full range of system activity is covered. Queries can be recorded at different levels: for a full-stack benchmark you can record HTTP requests on the Web server, or you can enable MySQL's query log. When replaying these queries, be sure to create multiple threads and replay them in parallel rather than linearly in a single thread, and create a separate thread for each connection in the log instead of assigning queries randomly across threads; the query log records which connection executed each query. (A minimal replay sketch appears later in this section.)
Even if you do not build a custom benchmark, write down a detailed test plan. You will run the test repeatedly, so the procedure must be reproducible exactly, and the next run may not be done by the same person. Document the parameters and results, and keep detailed notes for every round of testing. Remember that you will often write scripts to analyze the results, so it is better if the results can be processed without opening documents or doing other manual work.

2.3.2 How long should the benchmark run?
It is important to run the benchmark long enough. Sometimes you cannot predict how long that is, in which case you should let it run continuously and observe until you are sure the system has become stable. A simple rule is to wait until the system appears stable for at least as long as the warm-up took. A common mistake is to run only a series of short tests, for example 60-second runs, and then draw conclusions about system performance from them. If you cannot run a full-length benchmark, it is better to treat the time already spent as a sunk cost than to draw conclusions from results that ran too briefly.

(Figure: I/O performance during an extended benchmark run)
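Not from the book: a minimal sketch of the per-connection replay idea from 2.3.1, assuming the queries have already been parsed out of the log into (connection_id, sql) pairs and that the mysql-connector-python package is available; the queries and credentials shown are placeholders.

```python
# Minimal sketch: replay recorded queries with one thread per original
# connection. Assumes queries were already parsed out of the log into
# (connection_id, sql_text) pairs; the data and credentials are placeholders.
import threading
from collections import defaultdict
import mysql.connector

recorded = [
    (3, "SELECT 1"),            # (connection_id, query) -- placeholder examples
    (7, "SELECT NOW()"),
    (3, "SELECT 2"),
]

by_connection = defaultdict(list)
for conn_id, sql in recorded:
    by_connection[conn_id].append(sql)     # keep per-connection order

def replay(conn_id, queries):
    conn = mysql.connector.connect(host="127.0.0.1", user="bench",
                                   password="secret", database="test")
    cur = conn.cursor()
    for sql in queries:
        cur.execute(sql)
        cur.fetchall()                     # drain the result set (SELECTs here)
    cur.close()
    conn.close()

threads = [threading.Thread(target=replay, args=(cid, qs))
           for cid, qs in by_connection.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()
```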

2.3.3 Capturing system performance and status
When running a benchmark, collect as much information about the system under test as possible. Use shell scripts to gather data; it is better to collect too much than to miss something important. The data to record includes system status and performance counters. It is a good habit to collect all the raw data and then analyze and filter it afterwards: if data is pre-processed during collection and the later analysis turns out to need more raw detail to explain an anomaly, you are left blind. (A minimal collection sketch appears at the end of Section 2.3.)

2.3.4 Getting accurate results
The best way to get accurate results is to start by answering some basic questions about the benchmark: Did you choose the right benchmark? Are you collecting the data the question requires? Are you measuring by the wrong criteria? Then check whether the results are repeatable. Make sure the system is in the same state before every run; it may even be necessary to reboot and warm up the system again, and the warm-up must be long enough and reproducible. If the warm-up uses random queries, the results will not be repeatable. Every run must use the same test data. One way to keep the physical layout of the data on disk consistent is to quickly format and copy the disk partition before each run.
Change as few parameters as possible between runs; dependencies between parameters add complexity, so in general modify benchmark parameters iteratively rather than changing many things at once. Testing against MySQL's default configuration is meaningless. If a run produces an anomalous result, do not simply discard it as a bad data point; investigating such anomalies occasionally yields unexpected findings.

2.3.5 Running the benchmark and analyzing the results
Automated benchmarks generally produce more accurate results, so automate as much of the process as possible: loading the data, driving the load, executing the test, and recording the results. Benchmarks are usually run several times to improve the accuracy of the results. Once you have results, you still need to analyze them to answer the questions the test was designed for. How easily you can extract meaningful results from the data depends on how you collected it; analyzing the data with scripts not only reduces the analysis workload but also makes the process easier to document.

2.3.6 The importance of plotting
Collect as much detailed data as possible during the benchmark, then plot it as graphs; graphs help you spot problems quickly.
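Not from the book: a minimal sketch of the kind of collection script described in 2.3.3, assuming the mysql-connector-python package; the counters, interval, and credentials are made-up examples.

```python
# Minimal sketch: capture a few MySQL status counters once per second to a CSV
# file during a benchmark run. Counter choice, interval, and credentials are
# made-up examples; collect whatever raw data your later analysis may need.
import csv
import time
import mysql.connector

COUNTERS = ("Questions", "Threads_running", "Threads_connected", "Innodb_rows_read")

conn = mysql.connector.connect(host="127.0.0.1", user="bench", password="secret")
cur = conn.cursor()

with open("mysql-status.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(("timestamp",) + COUNTERS)
    for _ in range(300):                       # five minutes of samples
        cur.execute("SHOW GLOBAL STATUS")
        status = dict(cur.fetchall())          # Variable_name -> Value
        writer.writerow([time.time()] + [status.get(name) for name in COUNTERS])
        time.sleep(1)

cur.close()
conn.close()
```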
2.4 Benchmarking tools
2.4.1 Full-stack tools
ab: an Apache HTTP server benchmarking tool that measures how many requests per second an HTTP server can handle.
http_load: similar in concept to ab and also aimed at Web servers, but more flexible; it takes an input file with multiple URLs and picks among them at random for testing.
JMeter: a Java application that can load other applications and measure their performance. JMeter is far more complex than ab and http_load.
2.4.2 Single-component tools
mysqlslap: simulates load on the server and reports timing information. It can run concurrent connections and accepts SQL statements to execute; if none are given, it automatically generates SELECT statements against the schema.
MySQL Benchmark Suite (sql-bench): can be used for comparative tests across different database servers. It is single-threaded and mainly measures how quickly a server executes queries; the results show which kinds of operations run faster on a given server.
Super Smack: a benchmarking tool for MySQL and PostgreSQL that provides stress testing and load generation. It is a complex, powerful tool that can simulate multiple users, load test data into the database, and fill test tables with random data.
Database Test Suite: a set of test tools similar to some industry-standard benchmarks; dbt2, for example, is a free (uncertified) TPC-C OLTP test tool.
Percona's TPCC-MySQL Tool: a TPC-C-like benchmarking toolkit, parts of which were developed specifically for testing MySQL.
sysbench: a multi-threaded system stress-testing tool that can evaluate a system according to the factors that affect database server performance. It is an all-round tool that can benchmark not only MySQL but also the operating system and hardware (CPU, file I/O, and so on). (A usage sketch follows this list.)
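Not from the book: a minimal sketch of driving sysbench at several thread counts, assuming sysbench 1.0+ with its bundled oltp_read_write test, an already prepared sbtest database (i.e. the corresponding prepare step has been run), and made-up connection options; exact flags and output format may differ between sysbench versions.

```python
# Minimal sketch: run sysbench's oltp_read_write test at several thread counts
# and save the raw output for later analysis. Assumes sysbench 1.0+ and a
# prepared sbtest database; connection options below are made up.
import subprocess

COMMON = [
    "sysbench", "oltp_read_write",
    "--db-driver=mysql", "--mysql-host=127.0.0.1",
    "--mysql-user=bench", "--mysql-password=secret", "--mysql-db=sbtest",
    "--time=300",
]

for threads in (32, 64, 128):
    result = subprocess.run(COMMON + [f"--threads={threads}", "run"],
                            capture_output=True, text=True, check=True)
    with open(f"sysbench-{threads}-threads.txt", "w") as f:
        f.write(result.stdout)                 # keep the full raw output
    # Print the summary lines mentioning transactions/queries for a quick look.
    for line in result.stdout.splitlines():
        if "transactions:" in line or "queries:" in line:
            print(f"{threads:3d} threads  {line.strip()}")
```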
2.5 Benchmarking case studies (P52~)
2.6 Summary
Every MySQL user should understand some benchmarking basics. Benchmarking is not just a one-time activity for solving business problems; it is also a good way to learn. If you have never run a benchmark, it is recommended to become familiar with at least sysbench, starting with its OLTP and fileio tests. The OLTP benchmark makes it easy to compare the performance of different systems, while file-system and disk benchmarks are effective for diagnosing and isolating misbehaving components when system problems occur. If you benchmark frequently, it is worth establishing some conventions: choose appropriate tools and learn them well, and build up a library of scripts for configuring benchmarks, collecting output, recording system performance and status, and analyzing the results.
