MySQL Performance Optimization: 50% performance improvement and 60% latency reduction

When I joined Pinterest, I spent my first three weeks in the company's onboarding program at headquarters, where the newest engineers solve real production problems across the entire software stack. In this program we learn how Pinterest is built by building Pinterest, and it is not uncommon to ship code and make meaningful contributions within just a few days. New engineers at Pinterest have the flexibility to choose which team they join, and working on code across different areas during onboarding helps inform that choice. People in the program typically take on a variety of projects; mine was a deep dive into MySQL performance optimization.
Pinterest, MySQL, and AWS
Our MySQL fleet runs entirely in AWS. Even though we use a fairly high-performance instance type (with a RAID-0 array of SSDs) and have a fairly simple workload (mostly single-row lookups by primary key or simple range scans, peaking around 2,000 QPS), we were not getting anywhere near the expected level of IO performance.
Once write IOPS exceeded roughly 800, we would see unacceptable latency and replication lag. When replication falls behind, or read performance on the replicas is not enough to keep ETL and batch jobs on schedule, every team that depends on those jobs is negatively affected. The only options appeared to be moving to a larger instance type, which would roughly double our costs and wipe out our efficiency, or finding a way to make the existing systems run better.
I took over this project from my colleague Rob Wultsch, who had already made a very important discovery: when running on AWS SSDs, the Linux kernel version matters a great deal. The 3.2 kernel that ships by default with Ubuntu 12.04 performs poorly, and even the 3.8 kernel that AWS recommends as a minimum is not much better (although 3.8 is more than twice as fast as 3.2). Running sysbench 16 KB random writes on an i2.2xlarge instance (two SSDs in RAID-0) with kernel 3.2 barely reaches 100 MB/sec; upgrading to kernel 3.8 brings the same test to about 350 MB/sec, which is still far below what the hardware should deliver. Seeing such a simple change produce such a large improvement raised a host of new questions and hypotheses about inefficiencies and poor configuration choices: could we get even better performance from a newer kernel? Should we change other OS-level settings? Were there optimizations to be found in my.cnf? How fast could we make MySQL go?
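For illustration, a 16 KB random-write fileio test of the kind described here might be invoked as follows with sysbench 1.x (file count, total size, and fsync settings are assumptions for the sketch, not Pinterest's exact parameters; older sysbench releases used a slightly different syntax):

```shell
# Prepare the test files once, then run random writes against them.
# All numbers here are illustrative, not the parameters used at Pinterest.
sysbench fileio \
  --file-total-size=100G \
  --file-num=64 \
  --file-block-size=16384 \
  prepare

# Run 16 KB random writes for an hour, reporting stats every second
# so the warm-up window can be trimmed off afterwards.
sysbench fileio \
  --file-total-size=100G \
  --file-num=64 \
  --file-block-size=16384 \
  --file-test-mode=rndwr \
  --time=3600 \
  --report-interval=1 \
  run
```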
I designed roughly 60 different sysbench fileio test configurations, varying the kernel, filesystem, mount options, and RAID chunk size. Once the best candidates emerged from those experiments, I ran another 20 or so sysbench OLTP tests against different system configurations. The basic methodology was the same for every test: run for one hour, collecting data at one-second intervals; discard the first 600 seconds to account for cache warm-up; then process the remaining data. After identifying the optimal configuration, we rebuilt our largest and most important servers and rolled the changes into production.
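The trim-and-summarize step above can be sketched as follows. This is a minimal sketch under two assumptions not spelled out in the article: latencies arrive as one sample per second in a plain list, and p99 is computed by the nearest-rank method.

```python
import math

WARMUP_SECONDS = 600  # discard the first 10 minutes while caches warm up


def p99(samples):
    """99th percentile by the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.99 * len(ordered))  # 1-based rank of the p99 sample
    return ordered[rank - 1]


def summarize(per_second_latencies_ms):
    """Drop the warm-up window, then summarize the steady-state samples."""
    steady = per_second_latencies_ms[WARMUP_SECONDS:]
    return {
        "samples": len(steady),
        "mean_ms": sum(steady) / len(steady),
        "p99_ms": p99(steady),
    }
```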
From 5,000 QPS to 26,000 QPS: scaling MySQL performance without scaling the hardware
Let's look at the impact of these changes on some basic sysbench OLTP tests, measuring p99 response time and throughput at 16 and 32 threads across several different configurations.
Here is what each configuration means:

 

  • CURRENT: 3.2 kernel and stock MySQL configuration
  • STOCK: 3.18 kernel and stock MySQL configuration
  • KERNEL: 3.18 kernel plus a small amount of IO/memory sysctl fine-tuning
  • MYSQL: 3.18 kernel plus an optimized MySQL configuration
  • KERN+MYSQL: 3.18 kernel plus the fine-tuning from #3 and #4
  • KERN+JE: 3.18 kernel plus the #3 fine-tuning and jemalloc
  • MYSQL+JE: 3.18 kernel plus the #4 MySQL configuration and jemalloc
  • ALL: 3.18 kernel plus #3, #4, and jemalloc
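The article doesn't enumerate the exact sysctls behind the KERNEL (#3) configuration, but "a small amount of IO/memory fine-tuning" on SSD-backed database hosts typically looks something like this. Every value and device name below is an illustrative assumption, not Pinterest's actual setting:

```shell
# Illustrative IO/memory sysctls of the kind referred to as "#3" above.
# Exact knobs and values are assumptions; tune and test for your workload.
sudo sysctl -w vm.swappiness=1               # strongly prefer keeping the buffer pool in RAM
sudo sysctl -w vm.dirty_background_ratio=5   # start background writeback sooner
sudo sysctl -w vm.dirty_ratio=15             # throttle writers before dirty pages pile up

# Per-device queue settings for the SSDs backing the RAID-0 array
# (the device name xvdb is an assumption):
echo noop | sudo tee /sys/block/xvdb/queue/scheduler
echo 1024 | sudo tee /sys/block/xvdb/queue/nr_requests
```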

 

With all the optimizations enabled, we saw more than a 500% increase in read and write throughput at both 16 and 32 threads, along with a drop of roughly 500 ms in p99 latency for both reads and writes. On reads, we went from roughly 4,100-4,600 QPS to over 22,000-25,000 QPS, depending on concurrency. On writes, we went from roughly 1,000 QPS to 5,100-6,000 QPS. That is an enormous amount of headroom and performance gained from a handful of fairly simple changes.
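Likewise, the article doesn't publish its my.cnf, but an SSD-oriented InnoDB configuration of that era (the "#4" changes), together with the jemalloc swap used in the JE configurations, usually looks something like the fragment below. Every value is an illustrative assumption, and the jemalloc library path varies by distribution:

```ini
# Illustrative my.cnf fragment; values are assumptions, not Pinterest's settings.
[mysqld]
innodb_buffer_pool_size = 48G     # most of RAM on a dedicated database host
innodb_flush_method     = O_DIRECT
innodb_log_file_size    = 4G      # large redo logs absorb write bursts
innodb_io_capacity      = 2000    # tell InnoDB the SSDs can take more IOPS
innodb_flush_neighbors  = 0       # neighbor flushing only helps spinning disks

[mysqld_safe]
# Swap glibc malloc for jemalloc (path is distribution-dependent).
malloc-lib = /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
```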
Of course, all this synthetic benchmarking is meaningless if it doesn't translate into real-world results. The following figure shows latency on our main cluster from both the client and server perspectives, from a few days before the upgrade to a few days after; the rollout took about a week to complete.
The red line shows latency as seen by the client, and the green line shows latency as measured on the server. Client-side p99 latency dropped from a jittery 15-35 ms, with spikes above 100 ms, to a fairly stable 15 ms, with occasional outliers of 80 ms or lower. Server-measured latency likewise fell from 5-15 ms to about 5 ms, with a daily spike to around 18 ms caused by system maintenance. Moreover, since the beginning of the year our peak throughput has grown by 50%, so we are not only handling considerably more load (still within our estimated capacity), we are doing it with better, more predictable latency. And the best news for everyone who enjoys a good night's sleep: pages related to system performance or general server load dropped from roughly 300 in March to far fewer across April and May combined.