MySQL Slave triggers oom-killer: troubleshooting and solution
Recently I have been receiving frequent alarms about insufficient memory on MySQL instances. Logging on to the server, I found that mysqld had eaten 99% of the physical memory. Good grief!
If it is not handled in time, the kernel sometimes "helpfully" restarts MySQL for us, and dmesg then shows records like the following:
Mar 9 11:29:16 xxxxxx kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Mar 9 11:29:16 xxxxxx kernel: mysqld cpuset=/ mems_allowed=0
Mar 9 11:29:16 xxxxxx kernel: Pid: 99275, comm: mysqld Not tainted 2.6.32-431.el6.x86_64 #1
Mar 9 11:29:16 xxxxxx kernel: Call Trace:
The specific scenario is as follows.
Environment: operating system and MySQL versions:
OS: CentOS release 6.5 (Final), kernel 2.6.32-431.el6.x86_64 (physical machine)
MySQL: Percona 5.6.23-72.1-log (single instance)
Triggering scenario: regardless of whether the Slave has any other connections, its memory surges periodically until the kernel oom-killer is triggered.
This problem has reportedly existed for more than a year. The boss asked me to see whether I could find any clues, so I started digging:
1. First I suspected that the memory allocated to MySQL was unreasonable, so I compared innodb_buffer_pool_size with the physical memory (see the quick check below) and found that the buffer pool takes about 60% of the physical memory. That is not the cause; if it were, it would have been discovered long ago.
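For reference, a minimal sketch of that comparison, assuming the mysql client can log in from the shell on the server:

    # buffer pool size configured for MySQL (in bytes)
    mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';"
    # total physical memory on the host, for comparison
    free -g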
2. Next I checked the operating-system parameters: vm.swappiness = 1, /proc/sys/vm/overcommit_memory, and oom_adj. As a stopgap, the mysqld process's oom_adj can be set to -15 or -17 before the troubleshooting is finished, so that the kernel will never kill mysqld (see the sketch below). But this does not solve the problem at all and carries its own risk: if MySQL needs memory that cannot be allocated, will it simply hang instead of being killed? Let's just keep this method in the back of our minds.
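A sketch of those checks and of the temporary oom_adj workaround, assuming a single mysqld process on the host (adjust the pid lookup to your environment):

    # current OS memory-related settings
    cat /proc/sys/vm/swappiness
    cat /proc/sys/vm/overcommit_memory
    # stopgap only: tell the oom-killer to leave mysqld alone (-17 disables it for the process)
    echo -17 > /proc/$(pidof mysqld)/oom_adj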
3. OK, neither MySQL's startup parameters nor the operating-system parameters look misconfigured, so let's go after MySQL itself!
Since MySQL's memory keeps climbing, could it be a memory-allocation problem? A bug related to MySQL's memory allocation for table handles has been reported on the Internet, so I repeated the test in my own environment (the exact commands are sketched below): (1) record the memory used by the mysqld process; (2) record show engine innodb status; (3) execute flush tables; (4) record show engine innodb status again; (5) record the mysqld process memory again; (6) compare the two results to see whether the memory allocated by MySQL changes significantly before and after flush tables. There was no significant change, so this bug does not seem to be my problem either.
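A sketch of how that test can be run, assuming a single local instance and a passwordless client login for brevity:

    ps -o rss,vsz -p $(pidof mysqld)                      # memory before
    mysql -e "SHOW ENGINE INNODB STATUS\G" > innodb_before.txt
    mysql -e "FLUSH TABLES;"
    mysql -e "SHOW ENGINE INNODB STATUS\G" > innodb_after.txt
    ps -o rss,vsz -p $(pidof mysqld)                      # memory after, compare with the first reading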
I have also read on the official bug site about the innodb_buffer_pool_instances parameter in this version: setting innodb_buffer_pool_instances and innodb_buffer_pool_size incorrectly can lead to a MySQL OOM bug. innodb_buffer_pool_size can be set larger than the actual physical memory; for example, with 64 GB of physical memory we can set innodb_buffer_pool_size = 300 GB and innodb_buffer_pool_instances > 5, and MySQL will still start, but it then OOMs easily. See http://bugs.mysql.com/bug.php?id=79850.
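To illustrate the misconfiguration described in that report, a hypothetical my.cnf excerpt (the 300 GB / 64 GB figures are just the example values above, not a recommendation):

    [mysqld]
    # host has only 64 GB of physical memory, yet mysqld still starts with:
    innodb_buffer_pool_size      = 300G
    innodb_buffer_pool_instances = 8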
Another reported bug triggers OOM when replication filters are configured on the slave, but I have not set any of those on these instances, so I ruled that out too.
So it is neither caused by over-committing MySQL memory nor by the open table handles. Then what is the reason?
Let's think about it again. The phenomenon only appears on the Slave. Master and Slave have the same configuration, but the Master runs the production workload, while on the Slave some instances only serve queries and some run no workload at all, yet they still end up OOM. So this is most likely caused by replication on the Slave itself.
Then I picked an instance and tried it, and the result was a real surprise: I ran stop slave; start slave; the command hung for about 3 minutes, and when I checked memory usage afterwards, more than 20 GB had been released. The problem is basically pinned down here. But as we all know, the Slave has two threads: is it caused by the SQL thread or the IO thread? That will have to wait for further troubleshooting the next time the memory is about to blow up (a sketch of how to narrow it down follows).
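One way to separate the two threads next time, using standard MySQL replication syntax (which thread is actually responsible is still an open question here):

    ps -o rss= -p $(pidof mysqld)                         # memory before
    mysql -e "STOP SLAVE SQL_THREAD; START SLAVE SQL_THREAD;"
    ps -o rss= -p $(pidof mysqld)                         # did restarting the SQL thread release the memory?
    mysql -e "STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;"
    ps -o rss= -p $(pidof mysqld)                         # or is it the IO thread?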
Here is the memory monitoring information:
12:00:01 kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
02:40:01 566744 131479292 99.57 88744 618612 132384348
02:50:01 553252 131492784 99.58 83216 615068 132406792
03:00:01 39302700 92743336 70.24 95908 925860 132413308
03:10:01 38906360 93139676 70.54 109264 1292908 132407836
03:20:01 38639536 93406500 70.74 120676 1528272 132413136
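This kind of memory history can be pulled with sysstat's sar, assuming its data collection is enabled on the host (the file name depends on the day of month):

    sar -r -f /var/log/sa/sa09        # memory usage history for the 9th
    sar -r 60                         # or watch live, one sample per minute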
I have recorded something a bit more specific here, in case you cannot find anything elsewhere: https://bugs.launchpad.net/percona-server/+bug/1560304
Summary:
Symptom: Slave OOM
Temporary solution: restart replication on the Slave (stop slave; start slave)
Long-term solution: upgrade MySQL Server to a later minor release
For more systematic information on this problem, see Guo Zong's posts:
http://www.bkjia.com/article/88726.htm
http://www.bkjia.com/article/88727.htm