Last night Ops reported a production server whose MySQL was pinning a CPU at 100%: new clients could not log in, although the applications that were already connected kept working.
After logging on to the server, top -H showed one thread using 100% of a CPU while the others were almost idle. The suspect MySQL connection looked like this:
MySQL thread id 14560536, OS thread handle 0x7f1255ef1700, query id 31889137761 10.26.124.8 OSM cleaning up
top - 18:56:26 up 3:55, 3 users, load average: 1.08, 1.13, 1.18
Tasks: 1503 total, 2 running, 1501 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.9%us, 0.2%sy, 0.0%ni, 86.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  16333644k total, 15440672k used, 892972k free, 45652k buffers
Swap:  4153340k total, 0k used, 4153340k free, 1050524k cached

  PID USER      PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
15669 mysql        0 8338m 5.9g 5808 R 100.0 38.1 162:29.18 mysqld
29679 rabbitmq     0 4315m 154m 1780 S   1.3  1.0   1162:54 beam.smp
 3887 root         0 16092 2360  940 R   0.7  0.0   0:00.27 top
  426 root         0 68020 6136 4244 S   0.3  0.0   0:03.34 AliYunDun
  427 root         0 68020 6136 4244 S   0.3  0.0   0:02.62 AliYunDun
  430 root         0 68020 6136 4244 S   0.3  0.0   0:08.37 AliYunDun
  432 root         0 68020 6136 4244 S   0.3  0.0   0:05.34 AliYunDun
 1247 root         0  245m  49m 1604 S   0.3  0.3  28:40.31 AliHids
 1249 root         0  245m  49m 1604 S   0.3  0.3  10:17.96 AliHids
 3653 root         0  113m 1572  932 S   0.3  0.0   0:00.11 wrapper
12293 root         0 9722m 1.3g 5940 S   0.3  8.6  20:10.09 java
16457 root         0 8710m 166m 5772 S   0.3  1.0   1:52.67 java
16914 root         0 9864m 168m 5808 S   0.3  1.1  19:30.97 java
29680 rabbitmq     0 4315m 154m 1780 S   0.3  1.0 578:21.03 beam.smp
    1 root         0 19356            S   0.0  0.0   0:02.80 init
    2 root         0     0    0    0 S   0.0  0.0   0:00.03 kthreadd
    3 root      RT 0     0    0    0 S   0.0  0.0   0:15.65 migration/0
    4 root         0     0    0    0 S   0.0  0.0   0:32.07 ksoftirqd/0
    5 root      RT 0     0    0    0 S   0.0  0.0   0:00.00 stopper/0
    6 root      RT 0     0    0    0 S   0.0  0.0   0:05.98 watchdog/0
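As an aside, here is a minimal sketch of how a CPU-hogging OS thread can be tied back to a MySQL connection. The PID 15669 is taken from the output above; note that the THREAD_OS_ID column only exists in performance_schema.threads from MySQL 5.7.9 on, so on the 5.6 server described here the mapping has to be inferred from the processlist and the InnoDB status output instead:

# per-thread CPU view for mysqld only (15669 = mysqld PID from the top output above)
top -H -p 15669
# on MySQL 5.7.9+ the busy thread id (the PID column in top -H) maps straight to a connection;
# replace 15669 with the TID of the spinning thread
mysql -e "SELECT THREAD_OS_ID, PROCESSLIST_ID, PROCESSLIST_USER, PROCESSLIST_HOST, PROCESSLIST_STATE
          FROM performance_schema.threads WHERE THREAD_OS_ID = 15669\G"
# on 5.6, compare the busy thread's behaviour against the processlist by hand
mysql -e "SHOW FULL PROCESSLIST;"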
This problem has occurred nearly ten times. Apart from one or two occurrences on other production systems, most of them happened in the development environment, and the cause was never found. Once the offending thread is killed, the whole MySQL instance goes down and restarts automatically.
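It is not clear from the original notes whether the kill was done at the SQL level or at the OS level; as a rough sketch, the two are fundamentally different (the IDs below are taken from the output in this post and are for illustration only):

# SQL level: terminates only that connection (or just its current statement with KILL QUERY)
mysql -e "KILL 14561064;"
# OS level: signals are delivered to the whole mysqld process, not to a single thread,
# so this takes down the entire instance, which mysqld_safe (or the init system) then restarts
kill -9 15669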
Because this had happened before and was never solved, the InnoDB monitor had been left enabled on this server. Going back through its output to roughly the moment the problem appeared, the only notable thing was a single transaction that kept running for more than 20 minutes from when the problem started, as follows:
FILE I/O
--
MySQL thread id 14560536, OS thread handle 0x7f122db3c700, query id 31889113153 10.26.124.8 OSM cleaning up
---TRANSACTION 604639618, ACTIVE 1248 sec
MySQL thread id 14561064, OS thread handle 0x7f122db3c700, query id 31887588399 125.118.111.42 OSM cleaning up
--------
FILE I/O
--
MySQL thread id 14560536, OS thread handle 0x7f122db3c700, query id 31889113154 10.26.124.8 OSM cleaning up
---TRANSACTION 604639618, ACTIVE 1268 sec
MySQL thread id 14561064, OS thread handle 0x7f122db3c700, query id 31887588399 125.118.111.42 OSM cleaning up
--------
FILE I/O
--
MySQL thread id 14560536, OS thread handle 0x7f122db3c700, query id 31889113373 10.26.124.8 OSM cleaning up
---TRANSACTION 604639618, ACTIVE 1288 sec
MySQL thread id 14561064, OS thread handle 0x7f122db3c700, query id 31887588399 125.118.111.42 OSM cleaning up
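For reference, a rough sketch of how the InnoDB monitor whose output is excerpted above is usually enabled; the innodb_status_output variables only exist from MySQL/Percona Server 5.6.16 on, and the innodb_monitor table trick is the classic method on older versions (deprecated in 5.6, removed in 5.7):

# 5.6.16+: write the standard monitor output to the error log roughly every 15 seconds
mysql -e "SET GLOBAL innodb_status_output = ON; SET GLOBAL innodb_status_output_locks = ON;"
# classic method: creating this table turns the monitor on, dropping it turns it off
mysql -e "CREATE TABLE test.innodb_monitor (a INT) ENGINE=InnoDB;"
# one-off snapshot at any moment
mysql -e "SHOW ENGINE INNODB STATUS\G"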
Since all of our servers use the same unified MySQL configuration parameters and the same OS and kernel versions, the problem should not be caused by a particular version or configuration parameter.
After checking with the people involved, the only relevant detail was that a GUI client had been connected, but no heavy operations such as exporting data directly through Navicat had been run. So this looks more like the memory problem we ran into before: it refuses to appear when you are watching for it, and then pops up out of nowhere.
strace and ltrace did not show anything obviously wrong either, which was rather depressing at the time... After more poking around, mysqld somehow got killed by accident.
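For completeness, a sketch of the kind of strace/ltrace invocations that were tried; 15669 is the busy PID from the top output above (substitute the actual TID of the spinning thread), and the output file name is just an example:

# attach to the busy thread, with per-call timestamps and time spent in each syscall
strace -tt -T -p 15669 -o /tmp/mysqld_busy_thread.strace
# or just a syscall summary (Ctrl-C to stop and print the table)
strace -c -p 15669
# library-call view of the same thread
ltrace -c -p 15669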
Back at it in the morning, I kept digging. After half a day of searching on Bing I finally found, in the MySQL bug list, that versions before 5.5.23 had a performance_schema bug whose symptoms are very similar to ours:
http://bugs.mysql.com/bug.php?id=64491
http://lists.mysql.com/commits/143350?f=plain
One thread keeps calling sched_yield(), but the other threads are not ready to run, so one CPU is driven to 100%.
However, according to the developers this was fixed in 5.5.23 and 5.6.6, and we run Percona Server 5.6.31, so in theory this bug should no longer appear.
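Since bug #64491 lives in the performance_schema code path, a quick sanity check is to confirm the running version and whether performance_schema is actually enabled on the affected instance:

mysql -e "SELECT VERSION();"
mysql -e "SHOW GLOBAL VARIABLES LIKE 'performance_schema';"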
Since the scene is gone, all I can do is wait for the next occurrence and attach gdb to see what the problem thread is actually executing. Or maybe the real issue is that, not coming from a C/C++ background, I am simply not familiar enough with gdb... For this kind of performance troubleshooting you really need gdb and Wireshark at your fingertips; some strange phenomena just cannot be diagnosed with only the APIs, CLIs, and higher-level tools that the server itself provides.
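For the next occurrence, a minimal sketch of grabbing thread stacks with gdb (and, since this is Percona Server, pt-pmp from Percona Toolkit is a convenient wrapper); 15669 is the mysqld PID from the top output above:

# one-shot dump of every thread's stack; briefly pauses mysqld while the backtraces are taken
gdb -p 15669 -batch -ex "thread apply all bt" > /tmp/mysqld_stacks.txt
# pt-pmp aggregates repeated gdb samples into a "poor man's profile" of where threads spend time
pt-pmp --pid 15669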