Case study: high disk I/O and excessive thread context switching during a performance load test

Source: Internet
Author: User
Tags: switches

Symptoms:

During a load test at a request rate of about 80 TPS, CPU usage was very high (on a 24-core machine, the total utilization of every CPU climbed above 80%), yet none of the configured checkpoints reported an error.

1. The top command is as follows:

2. Next, understand the backend logic. Roughly, it works like this: after the server receives a request, it requests data from a separate KV server, takes the returned data, applies personalized processing based on the user's machine code, and finally returns the result to the client. Along the way it writes some debug logs.

A check showed that the KV server was healthy, which means the problem was on the service's own server. To narrow it down, use the vmstat command to see where the anomaly is.

3. The output shows plainly that the bi, bo, in, and cs values are all very high. From experience, bi and bo relate to disk I/O (blocks read in and blocks written out), while in and cs are the interrupt and context-switch rates per second. Tackle these one at a time, starting with I/O.
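The in and cs columns can also be derived by hand, which helps when reasoning about what vmstat reports. The following is a minimal sketch, assuming a Linux host where /proc/stat exposes cumulative `intr` and `ctxt` counters; the function names and the sample values are illustrative and not from the original case.

```python
def read_counters(stat_text):
    """Return (interrupts, context_switches) totals parsed from /proc/stat text."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("intr", "ctxt"):
            # The first number after the key is the cumulative total since boot.
            counters[fields[0]] = int(fields[1])
    return counters["intr"], counters["ctxt"]


def rates(before, after, seconds):
    """Per-second rates between two (interrupts, context_switches) samples."""
    return tuple((b - a) / seconds for a, b in zip(before, after))


# Made-up samples taken one second apart (illustrative values only).
SAMPLE_T0 = "cpu  10 0 5 100\nintr 1000 3 4\nctxt 5000\n"
SAMPLE_T1 = "cpu  20 0 9 180\nintr 1500 6 8\nctxt 405000\n"

in_rate, cs_rate = rates(read_counters(SAMPLE_T0), read_counters(SAMPLE_T1), seconds=1.0)
print(f"in: {in_rate:.0f}/s  cs: {cs_rate:.0f}/s")
```

On a live system, the two samples would come from reading /proc/stat twice with a sleep in between, which is essentially what each vmstat line summarizes.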

4. Checking disk reads and writes with the iostat -x command confirmed it: the disk was so busy that it was close to being completely blocked.
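To make the "disk is saturated" reading concrete, here is a rough sketch of how a utilization percentage like the one iostat -x reports can be computed. It assumes the Linux /proc/diskstats layout, where the tenth counter after the device name is the cumulative time spent doing I/O in milliseconds. The device name and numbers are invented for illustration.

```python
def io_ticks(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing I/O."""
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13:
            # Layout: major, minor, name, then the counters. Index 12 is the
            # time the device spent doing I/O, in milliseconds.
            ticks[fields[2]] = int(fields[12])
    return ticks


def utilization(before, after, interval_ms):
    """Percentage of the interval during which each device was busy."""
    return {dev: 100.0 * (after[dev] - before[dev]) / interval_ms
            for dev in after if dev in before}


# Two made-up samples taken 1000 ms apart (illustrative values only).
SAMPLE_T0 = "   8       0 sda 100 0 200 50 300 0 400 70 0 900 120\n"
SAMPLE_T1 = "   8       0 sda 100 0 200 50 900 0 5200 990 3 1850 4000\n"

busy = utilization(io_ticks(SAMPLE_T0), io_ticks(SAMPLE_T1), interval_ms=1000)
print(busy)  # {'sda': 95.0}
```

A value that stays near 100% means the device is busy almost the whole interval, which matches the symptom described in this step.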

5. Reviewing the processing flow, only the log-writing operation could cause such frequent disk reads and writes. So logging was turned off and the load test was rerun.
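The source only says that logging was turned off; it does not say how, or what language the service is written in. As one hedged illustration of the idea, the sketch below uses Python's standard logging module to raise the log level so that per-request debug messages are discarded before they reach the output stream. The logger name and messages are hypothetical.

```python
import io
import logging


def make_logger(stream, level):
    """Build a logger that writes to `stream` and drops records below `level`."""
    logger = logging.getLogger("service_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


buf = io.StringIO()  # stands in for the log file on disk
log = make_logger(buf, logging.WARNING)

for request_id in range(3):
    # Dropped: DEBUG is below WARNING, so nothing is written per request.
    log.debug("request %d: fetched data from KV server", request_id)
log.warning("KV lookup slow")

print(buf.getvalue(), end="")  # WARNING KV lookup slow
```

Raising the level instead of deleting log statements keeps the debug output available for troubleshooting outside of load tests.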

6. bi and bo dropped back to normal, indicating that the disk problem was solved. But the number of context switches still reached 400,000 per second, which is alarming.

7. Knowing only that the number of context switches is very large, how do you find out which processes are being switched between?

A SystemTap script found on the internet counts the top 20 process switches over a given interval and prints them.

#! /usr/bin/env stap
#
# Print the top 20 context switches (previous task -> next task) every $1 seconds.

global csw_count
global idle_count

# Fires whenever a task is switched off a CPU.
probe scheduler.cpu_off {
  csw_count[task_prev, task_next]++
  idle_count += idle
}

function fmt_task(task_prev, task_next)
{
  return sprintf("%s(%d)->%s(%d)",
                 task_execname(task_prev),
                 task_pid(task_prev),
                 task_execname(task_next),
                 task_pid(task_next))
}

function print_cswtop()
{
  printf("%45s %10s\n", "Context switch", "COUNT")
  # "csw_count-" sorts by count in descending order.
  foreach ([task_prev, task_next] in csw_count- limit 20) {
    printf("%45s %10d\n", fmt_task(task_prev, task_next), csw_count[task_prev, task_next])
  }
  printf("%45s %10d\n", "idle", idle_count)

  # Reset the counters for the next interval.
  delete csw_count
  delete idle_count
}

probe timer.s($1) {
  print_cswtop()
  printf("--------------------------------------------------------------\n")
}

Save the script as cswmon.stp and run it with the command: stap cswmon.stp 5
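Where SystemTap is not available, a coarser view can be obtained from /proc. The sketch below is not the author's method. It assumes Linux, where /proc/<pid>/status contains `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` lines, and it ranks processes by how much those counters grew between two snapshots. Unlike the SystemTap script, it shows per-process totals rather than "previous task -> next task" pairs, and to my understanding the counters in /proc/<pid>/status cover only the main thread (per-thread values live under /proc/<pid>/task/). All function names are illustrative.

```python
import os


def ctxt_switches(status_text):
    """Sum voluntary and nonvoluntary context switches from a status file."""
    total = 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            total += int(value)
    return total


def snapshot(proc_root="/proc"):
    """Map pid -> cumulative context switches for every readable process."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                counts[int(entry)] = ctxt_switches(f.read())
        except OSError:
            continue  # the process exited while we were reading
    return counts


def top_switchers(before, after, limit=20):
    """Rank pids by the growth of their counters between two snapshots."""
    deltas = {pid: after[pid] - before[pid] for pid in after if pid in before}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:limit]


# Made-up snapshots (illustrative values only): pid -> cumulative switches.
print(top_switchers({1: 10, 2: 100}, {1: 500, 2: 150, 3: 9}, limit=2))
# [(1, 490), (2, 50)]
```

On a real host, `snapshot()` would be called twice, a few seconds apart, and the two results passed to `top_switchers`.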

8. The output showed that the discover process was repeatedly switching back and forth with system processes, which consumes a lot of resources.

9. An online search turned up some ways to reduce process switching:

The developers then made a change: the number of threads was doubled, and the threads were kept within a single process.

The load test was run again, and the number of context switches dropped to about 250,000.

Throughput now reached about 260 requests per second, much higher than the previous 80, and was enough to meet the requirement for going live.
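The size of the improvement can be checked with simple arithmetic. The sketch below uses only the figures quoted in this case study (80 and 260 requests per second, 400,000 and 250,000 context switches) and assumes both context-switch figures are measured over the same unit of time.

```python
before_tps, after_tps = 80, 260
before_cs, after_cs = 400_000, 250_000

speedup = after_tps / before_tps                    # throughput ratio
cs_reduction = (before_cs - after_cs) / before_cs   # fractional drop in switches

print(f"throughput: {speedup:.2f}x")               # throughput: 3.25x
print(f"context switches: -{cs_reduction:.1%}")    # context switches: -37.5%
```

So throughput more than tripled while context switches fell by a little over a third, which is consistent with the conclusion that the remaining switch rate is still worth optimizing.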

However, because the interrupt and context-switch counts are still high, further optimization is needed.
