Recording a __lll_lock_wait_private error

Yesterday a DBA colleague found that a command-line tool hung while it was running. The problem is quite interesting and worth writing down.
First, I used pstack to look at the program's call stacks. It is a multi-threaded program, and the pstack result showed that almost all threads were stuck in write-related calls. The pt-pmp output is as follows:
Tue May 27 18:30:06 CST 2014
     55 __lll_lock_wait_private,_L_lock_51,fwrite,LoadConsumer::run,CThread::hook,start_thread,clone
      1 write,_IO_new_file_write,_IO_new_file_xsputn,buffered_vfprintf,vfprintf,fprintf,LoadManager::dump,LoadProducer::load_file,LoadProducer::run,CThread::hook,start_thread,clone
My first guess was that the disk was full, but a check showed plenty of free space, and creating a file with touch worked without any problem. A backup program was also running on the machine at the time, so the I/O pressure was not small, but that should not be related to the problem itself. Googling the __lll_lock_wait_private error turned up nothing useful. The tool itself writes many of its execution results to a log file specified on the command line, and writes its execution state to stderr. Attaching gdb and looking at the specific call stacks showed that the threads were all blocked in fprintf(stderr, ...). After asking my colleague, it turned out that he was calling the tool from a Python script, with a call similar to the following:
p = subprocess.Popen(cmd,shell=True,stdout=subprocess.PIPE,stderr=subprocess.STDOUT,close_fds=True)
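cmd is the command line that invokes the program, including a series of options. Now the problem looks clear. When the tool is called from Python this way, stdout and stderr are redirected to subprocess.PIPE, but nothing reads from that pipe, so its buffer soon fills up. That is why pstack shows all the writes blocked: one thread is stuck in write() on the full pipe while holding the stdio lock on the stream, and the remaining threads wait for that lock, which is where the __lll_lock_wait_private frames come from.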
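To get a feel for how quickly such a pipe fills up, here is a minimal sketch that measures how many bytes a fresh pipe accepts when nobody reads from it. The function name pipe_capacity is made up for this illustration, and the sketch assumes a POSIX system with Python 3.5 or later (for os.set_blocking). On a typical Linux machine the default is 64 KiB, which a chatty multi-threaded tool can fill in seconds.

```python
import os

def pipe_capacity():
    """Return how many bytes fit in a fresh pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full pipe raise an error instead of hanging.
        os.set_blocking(write_fd, False)
        total = 0
        chunk = b"x" * 4096
        try:
            while True:
                total += os.write(write_fd, chunk)
        except BlockingIOError:
            pass  # the pipe is full, and nothing is reading from read_fd
        return total
    finally:
        os.close(read_fd)
        os.close(write_fd)

if __name__ == "__main__":
    print("pipe filled after", pipe_capacity(), "bytes")
```

A blocking writer, like fprintf in the tool, does not get an error at this point. It simply sleeps inside write() until a reader drains the pipe.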
I wrote a small program to reproduce the problem. The program is as follows (typed by hand ..):
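// Build as C++ (the loop uses `true`), e.g. g++ with -lpthread.
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

// Each thread writes one record to stderr and one to stdout every second.
void *thr_fn(void *arg)
{
    int i = 0;
    while (true)
    {
        i++;
        fprintf(stderr, "helloworld %d\t\t\t\t", i);
        fprintf(stdout, "kkkkkkkkkk %d\t\t\t\t", i);
        sleep(1);
    }
    return NULL;  // not reached
}

int main(void)
{
    // Start 50 writer threads, then keep the process alive.
    for (int i = 0; i != 50; i++)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, thr_fn, NULL);
    }

    sleep(100000);
}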
The Python script that calls it is as follows:

import subprocess
import os
import time

ret = {}

# stdout of a.out goes to /tmp/log; stderr still goes to the pipe.
cmd = "./a.out >/tmp/log"

p = subprocess.Popen(cmd,shell=True,stdout=subprocess.PIPE,stderr=subprocess.STDOUT,close_fds=True)

# wait() runs before anything reads the pipe, so the pipe is never drained.
ret['status'] = p.wait()
ret['msg'] = p.stdout.readlines()

time.sleep(100000000)
Run this Python script and you will find that a.out soon hangs, that is, /tmp/log stops receiving new output. The program's call stacks are as follows:
     49 __lll_lock_wait_private,_L_lock_12956,buffered_vfprintf,vfprintf,fprintf,thr_fn,start_thread,clone
      1 write,_IO_new_file_write,_IO_new_file_xsputn,buffered_vfprintf,vfprintf,fprintf,thr_fn,start_thread,clone
      1 nanosleep,sleep,main
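The hang can also be observed from the parent's side without the C program. The sketch below is only an illustration: CHILD is a hypothetical stand-in for the tool (a one-line Python child that writes 1 MiB to stderr), and child_hangs is a made-up helper name. It repeats the script's mistake of calling wait() before reading the pipe, but with a timeout so that the demonstration itself does not hang.

```python
import subprocess
import sys

# Hypothetical stand-in for the tool: writes 1 MiB to stderr, then exits.
CHILD = "import sys; sys.stderr.write('x' * (1 << 20))"

def child_hangs(timeout=3.0):
    """Return True if the child is still stuck after `timeout` seconds."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # stderr shares the pipe nobody reads
    )
    try:
        p.wait(timeout=timeout)  # same order as the script: wait, then read
        return False
    except subprocess.TimeoutExpired:
        return True  # the child is blocked writing into the full pipe
    finally:
        p.kill()
        p.wait()
        p.stdout.close()

if __name__ == "__main__":
    print("child hung:", child_hangs())
```

Because 1 MiB is larger than the pipe buffer, the child can never finish its write while the parent only waits, so the timeout expires.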
How to solve this problem: 1) change the arguments passed to subprocess.Popen, or the way its pipe is read, so that the output is actually consumed; 2) or redirect stdout/stderr inside cmd; 3) when writing a command-line program, keep in mind that it may well be called from scripts. Therefore, try to standardize how logs are written, and do not output too much to stdout/stderr.
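As a rough sketch of options 1) and 2) on the Python side, the following shows two ways to call a chatty child without blocking it. CHILD is again a hypothetical stand-in for the real tool, and the function names are made up for this example. The first variant keeps the pipe but uses communicate(), which reads the output while waiting. The second assumes nobody needs the output and sends it to subprocess.DEVNULL (Python 3.3+), so no pipe exists at all.

```python
import subprocess
import sys

# Hypothetical stand-in for the tool: writes 1 MiB to stderr, then exits.
CHILD = "import sys; sys.stderr.write('x' * (1 << 20))"

def run_and_drain():
    """Keep the pipe, but let communicate() read it while waiting."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = p.communicate()  # drains the pipe until the child exits
    return p.returncode, len(out)

def run_discarding():
    """If the output is not needed, do not create a pipe in the first place."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return p.wait()  # safe here: there is no pipe that could fill up

if __name__ == "__main__":
    print(run_and_drain())
    print(run_discarding())
```

Note that communicate() holds all of the output in memory, so for a tool that prints a great deal, redirecting to a file or to DEVNULL is probably the safer choice.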
Colleagues in the group added the following about common scenarios in which MySQL runs into __lll_lock_wait_private:
"In MySQL, the most typical scenario for this function call is with cgroups enabled, where it is encountered often, for example under memcpy, memory allocation, free, mutex lock/unlock ...."