Background:
I recently ran into a puzzling problem: the OS clearly still had plenty of free memory, yet swapping occurred, and at first I could not figure out why.
Let's first review what swap is for and cover some background.
Under Linux, swap plays the same role as "virtual memory" does under Windows: when physical memory runs low, part of the hard disk (the swap partition) is used as if it were memory, to work around the shortage of physical memory.
Swap means exchange. As the name implies, when a process requests memory from the OS and physical memory is found to be insufficient, the OS moves data that is temporarily unused out of memory and into the swap partition; this is called swap out. When the process later needs that data and the OS finds free physical memory available, the data is moved from the swap partition back into physical memory; this is called swap in. In the output of vmstat these appear as the si/so columns, as shown in Figure 1.
Figure 1
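To make the si/so columns concrete, here is a minimal sketch that sums them with awk over a captured sample (all numbers below are invented for illustration, not taken from the figure); on a live host you would pipe `vmstat 1 5` into the same awk program instead of the here-doc:

```shell
# Sum the si (swap-in) and so (swap-out) columns of vmstat output.
# NR > 2 skips the two header lines; columns 7 and 8 are si and so.
awk 'NR > 2 { si += $7; so += $8 }
     END {
       printf "total si=%d so=%d\n", si, so
       if (si + so > 0) print "swap activity detected"
     }' <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 524288 404000 120000 900000    0    0    10    20  300  500  2  1 97  0  0
 0  0 524288 403500 120000 900100   64   32     0     8  280  450  1  1 98  0  0
EOF
```

Any sustained nonzero si/so means pages are actively moving between RAM and the swap partition.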
From this we can see the most direct possible cause of swapping: a process requests memory from the OS and physical memory is found to be insufficient. If no swap is available either, the process may have to wait, or the OOM-killer mechanism may be triggered, in which the OS kills the process consuming the most memory in order to free some up. How eagerly the system swaps is governed by the kernel parameter vm.swappiness, which takes values from 0 to 100: 0 means use physical memory as much as possible and avoid swap; 100 means swap out aggressively.

On servers running database processes, we usually strongly recommend setting this value to 10 or less (i.e. vm.swappiness = 10), preferably 0. The reason is simple: for a program like a database, which needs concentrated CPU resources, large amounts of memory, and heavy I/O, performance becomes unacceptable once the swap partition stands in for real memory. It is arguably better to be killed outright by the OOM killer (the database process usually consumes the most memory and is therefore the most likely OOM-kill target) than to limp along: early or late, dead is dead, and the service will soon be restarted anyway.
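As a quick illustration (standard sysctl commands, not shown in the source), vm.swappiness can be inspected and lowered like this:

```shell
# Check the current value
cat /proc/sys/vm/swappiness

# Apply the recommended value immediately (root required)
sysctl -w vm.swappiness=0

# Persist it across reboots
echo 'vm.swappiness = 0' >> /etc/sysctl.conf
```

This is a configuration sketch; adjust the target value (0 vs 10) to your own policy as discussed above.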
That's enough background; you must be getting impatient, so let's look at the scene and the investigation.
On-site Troubleshooting
First, look at the overall status of the system. Several key pieces of information stand out:
The system load is not high; the most recent load average is 6.8;
CPU utilization is not high either; idle is 98.4%;
Memory is mostly allocated to the mysqld process, which takes 80.2%;
Although physical memory is 256G and nearly 39G is still free, swapping did occur and the swap space is exhausted.
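To confirm which process the swapped-out pages belong to, the VmSwap field of /proc/&lt;pid&gt;/status can be checked. Below is a minimal sketch over a fabricated sample (the values are assumptions for illustration); on a live host you would loop over /proc/[0-9]*/status instead of the here-doc:

```shell
# Print each process name together with how much of it sits in swap.
# VmSwap is exposed by the kernel in /proc/<pid>/status.
awk '/^Name:/   { name = $2 }
     /^VmSwap:/ { print name, "swapped:", $2, $3 }' <<'EOF'
Name:   mysqld
VmRSS:  215269376 kB
VmSwap: 15728640 kB
EOF
```

If mysqld dominates the VmSwap totals, that matches the top output showing it holding most of the memory.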
So we get the first finding: there is still plenty of free physical memory, yet swap is exhausted. As an experienced DBA, what is your first reaction to this situation? Well, no spoilers yet; keep reading.
Then run `free -gt` to look at memory and swap consumption, as shown in Figure 3.
Figure 3
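As a rough way to interpret such `free -g` output (the numbers below are assumptions chosen to resemble the situation described, not copied from Figure 3): memory truly held by processes is used minus buffers minus cached; if that figure stays high while cached is small, the memory is anonymous process memory (consistent with a leaking mysqld) rather than reclaimable page cache:

```shell
# Compute "process-held" memory from a free(1)-style Mem: line.
# Columns: total used free shared buffers cached ($2..$7).
awk '/^Mem:/ { printf "process-held: %d GB (used=%d buffers=%d cached=%d)\n",
               $3 - $6 - $7, $3, $6, $7 }' <<'EOF'
              total       used       free     shared    buffers     cached
Mem:            251        212         39          0          0          5
Swap:            15         15          0
EOF
```

A large process-held figure with negligible cache is the pattern that points toward a leak rather than normal page-cache growth.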
Seeing this, my first reflex was very direct: a memory leak has occurred.
Generally, if cached and used differ significantly in the memory statistics, it is almost certain the system has a memory leak. The corresponding approaches are:
Workaround: restart the process on a schedule, fully releasing its memory back to the OS;
Real fix: find the code causing the leak and fix it (here we are facing MySQL's own code, so alternatively file a bug with the official team, haha);
Practical fix: upgrade the program version; newer versions usually resolve known problems. This approach is recommended.
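The "scheduled restart" workaround above could be wired up with a cron entry along these lines (the service name, path, and time window are assumptions; adjust to your init system and traffic pattern):

```shell
# Hypothetical crontab entry: restart mysqld at 04:00 every Sunday,
# during an assumed low-traffic window, to return leaked memory to the OS.
# m h dom mon dow  command
0 4 * * 0 /usr/sbin/service mysql restart
```

This is only a stopgap; the real fix remains patching or upgrading, as noted above.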
Next, look at how the memory-related options in MySQL are configured:
Figure 4
Apart from innodb-buffer-pool being set a bit large, the rest look normal. A quick check showed the MySQL version is 5.6.19, so it seems worth upgrading to the latest release in the 5.6 series.
Here we get the second finding: the mysqld process has a memory leak. It is recommended to restart the process on a schedule, and to arrange an upgrade to the latest version as soon as possible.
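The text does not list the exact option values, but as a hedged sketch of the kind of arithmetic used to sanity-check such settings (every number below is an assumption, not taken from Figure 4): worst-case memory is roughly the global buffers plus max_connections times the per-thread buffers:

```shell
# Back-of-the-envelope MySQL memory budget (all values assumed).
awk 'BEGIN {
  bp   = 200   # innodb_buffer_pool_size, GB (assumed)
  conn = 512   # max_connections (assumed)
  per  = 16    # sort/join/read buffers per thread, MB (assumed)
  printf "worst-case: %.1f GB\n", bp + conn * per / 1024
}'
```

If the worst-case figure approaches or exceeds physical RAM, the buffer pool is sized too aggressively even before any leak is considered.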
But is a mysqld memory leak alone enough to explain the swapping? Not quite. Remember the point above we left unexplained: physical memory still has plenty free, yet swap is exhausted. In most cases this is caused by NUMA not being disabled. On servers running database processes it is strongly recommended to disable NUMA, which was also mentioned in the earlier, more comprehensive MySQL optimization reference shared previously. Let's look at the NUMA situation:
Figure 5
Figure 6
From Figures 5 and 6 above, the NUMA issue causes one node to have far less free memory than the other (1.8G vs 38G), so a large memory allocation on that node clearly cannot be satisfied, and swapping occurs. I won't repeat the background on NUMA here.
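Per-node free memory of the kind shown in Figures 5 and 6 comes from `numactl --hardware`. Here is a minimal sketch parsing a fabricated sample of that output (the values are assumptions chosen to mirror the 1.8G vs 38G imbalance described above); on a live host pipe the real command into the same awk:

```shell
# Extract the free-memory figure for each NUMA node from numactl output.
awk '/free:/ { free[$2] = $4 }
     END { printf "node0 free=%s MB, node1 free=%s MB\n", free[0], free[1] }' <<'EOF'
available: 2 nodes (0-1)
node 0 size: 131072 MB
node 0 free: 1843 MB
node 1 size: 131072 MB
node 1 free: 38912 MB
EOF
```

A large gap between the nodes is the telltale sign that swap can be triggered even while the machine as a whole has free memory.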
So we get the third finding: improper server hardware and OS settings left NUMA enabled, causing the swap. The suggested options are:
Disable NUMA at the BIOS setup level; the drawback is that the OS must be restarted;
Or modify the GRUB configuration file; the drawback is again an OS restart;
Upgrade MySQL to 5.6.27 or later, which adds the new option innodb_numa_interleave; only the mysqld instance needs restarting, not the OS. This option is recommended.
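Sketches of the second and third options (file paths and the RHEL-style grub2 command are assumptions, not from the source):

```shell
# Option 2 sketch: disable NUMA on the kernel command line (reboot required).
#   1. Append numa=off to GRUB_CMDLINE_LINUX in /etc/default/grub
#   2. Regenerate the grub config:
# grub2-mkconfig -o /boot/grub2/grub.cfg

# Option 3 sketch: MySQL >= 5.6.27, mysqld restart only.
# Add to my.cnf:
cat <<'EOF' >> /etc/my.cnf
[mysqld]
innodb_numa_interleave = ON
EOF
```

With innodb_numa_interleave enabled, the buffer pool is allocated interleaved across NUMA nodes, avoiding the single-node exhaustion seen above.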
At this point the problem has been analyzed and the relevant solutions given; evaluate your own situation and choose whichever option fits.
[Repost] Tracking down why MySQL went into swap