Problem Description:
On 2010-02-25 we found that the videos could not be accessed. The Nginx error log showed "worker process 28541 exited on signal 11"
alerts, and each worker process died again after being respawned:
# more error.log
2010/02/25 15:35:48 [alert] 28537#0: worker process 28541 exited on signal 11
2010/02/25 15:35:49 [alert] 28537#0: worker process 28540 exited on signal 11
2010/02/25 15:35:49 [alert] 28537#0: worker process 28538 exited on signal 11
The output of dmesg also showed errors:
# dmesg
[8899569.894983] nginx[31582]: segfault at 1 ip 080727a8 sp bf8c1d00 error 4 in nginx[8048000+83000]
[8919934.402677] nginx[31604]: segfault at 1 ip 080727a8 sp bf8c1d00 error 4 in nginx[8048000+83000]
[8919935.259635] nginx[31225]: segfault at 1 ip 080727a8 sp bf8c1cc0 error 4 in nginx[8048000+83000]
Cause of the "worker process 20437 exited on signal 11" error: a segmentation fault. Most errors of this kind are caused by bugs in the application.
SIGSEGV: segmentation violation (invalid memory reference); the default action is to terminate the process and create a core file.
SIGSEGV is raised when a process attempts to access memory that is not allocated to it, or attempts to write to a memory address for which it has no write permission.
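The dmesg lines can be decoded a little further. This is only a minimal sketch, assuming an x86 kernel where the number after "error" is the page-fault error code (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The variable names are my own; the values are copied from the first dmesg line above.

```shell
#!/bin/sh
# Values copied from the first dmesg line; variable names are illustrative.
fault_ip=0x080727a8   # instruction pointer when the fault happened
map_base=0x08048000   # start of the nginx mapping, from "nginx[8048000+83000]"
err_code=4            # page-fault error code

# Name of signal 11 on this system (SEGV on Linux).
sig_name=$(kill -l 11)

# Offset of the faulting instruction inside the nginx mapping.
offset=$(printf '0x%x' $(( $fault_ip - $map_base )))

# Decode the low three bits of the error code (assumed x86 layout).
if [ $(( err_code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
if [ $(( err_code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
if [ $(( err_code & 4 )) -ne 0 ]; then mode="user"; else mode="kernel"; fi
fault_kind="${mode}-mode ${access}, ${cause}"

echo "signal 11: ${sig_name}"
echo "offset in nginx mapping: ${offset}"
echo "error ${err_code}: ${fault_kind}"
```

Under these assumptions the crash is a user-mode read of an unmapped page at address 1, at offset 0x2a7a8 in the binary, which looks like a near-NULL pointer dereference inside nginx. If the binary was built with debug symbols, something like addr2line -e <path to nginx> 0x080727a8 may map the instruction pointer to a source line.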
Solution Ideas:
1. Reproduce the error by manually simulating the same conditions.
2. Debug with GDB, using the core file generated by the system at the time of the error.
3. According to the log, two search engine spiders were accessing the site when the error occurred, so it may be related to spiders.
Problem Solving Process:
1. First I restarted Nginx and killed some pop3-login processes. The site was accessible again, but the next day the logs showed the same errors.
2. At first I thought the segmentation fault was caused by insufficient memory. Linux manages memory differently from Windows, so I freed memory manually, but after some time the error came back. Posts online said this was a bug in the program and that upgrading would fix it, so I upgraded Nginx from 0.7.64 to 0.7.65, which had just been released on February 1. The 0.7.65 changelog also lists segmentation fault bugfixes:
*) Bugfix: a segmentation fault might occur in a worker process, if
   limit_rate is used in HTTPS server.
   Thanks to Maxim Dounin.
*) Bugfix: a segmentation fault might occur in a worker process while
   $limit_rate logging.
   Thanks to Maxim Dounin.
I thought this would solve it. Unfortunately, the problem remained.
3. Since the version was not the cause, I looked at the build. This Nginx was compiled by ourselves, with the secure_download (anti-hotlinking) module and the MP4 seeking module added at compile time. I removed the modules one at a time. After removing the MP4 module the problem was still there. After also removing the anti-hotlinking module, no problem appeared in more than a day of observation, so the problem could basically be pinned on the anti-hotlinking module.
4. Since it was a problem in the program, the program had to be debugged, and on Debian that of course means GDB. I learned how to use GDB but never managed to debug the problem (I am not good at programming, and even worse at debugging), which was a little discouraging. Under an expert's guidance I moved the test environment to the intranet and, by accident, reproduced the "worker process 28541 exited on signal 11" error: with autoindex enabled on the intranet, clicking into a directory protected by the anti-hotlinking module triggered the error.
5. The final finding was that secure_download_fail_location /fld; (which redirects a failed request to an error page) had not been set in the anti-hotlinking configuration, and this caused the crash. It also explains the spiders: when a search engine spider crawled into the protected section, it did not have a valid path and was not redirected to an error page.
6. Because this is a file server and does not need to be crawled by spiders, I added a robots.txt file in the directory to keep search engine spiders out.
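Step 6 can be sketched roughly as follows. The docroot path is a hypothetical placeholder, and the rules block all well-behaved crawlers from the whole site, which fits a file server that does not need indexing. Note that crawlers generally request robots.txt from the root of the site, so that is where the file is written here.

```shell
#!/bin/sh
# Hypothetical document root; replace with the real root of the file server.
docroot="${DOCROOT:-./htdocs}"
mkdir -p "$docroot"

# Ask all crawlers that honor robots.txt to stay away from the entire site.
cat > "$docroot/robots.txt" <<'EOF'
User-agent: *
Disallow: /
EOF

echo "wrote $docroot/robots.txt"
```

This only keeps out spiders that respect robots.txt. It does not replace setting the fail location from step 5.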
Cause of the problem: an oversight in my own work led to this issue. Fortunately it did not cause any loss to the company's business. The lesson this time: in operations work, you can never be too careful.
Summary: what I learned from this error:
1. Linux signals and what they do.
2. How Linux manages memory.
3. After a program crashes, a core file can be generated (not by default; it has to be enabled manually), and the program can be debugged from this file.
4. Basic usage of GDB.
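Points 3 and 4 can be sketched as two small shell helpers. This is a rough outline under assumptions, not the exact commands used at the time: the binary and core file paths are hypothetical, and the function names are my own. The core limit must be raised in the shell that starts Nginx. Nginx also has the worker_rlimit_core and working_directory directives for controlling worker core dumps from its configuration.

```shell
#!/bin/sh
# Hypothetical paths; adjust to the actual installation and core location.
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx/sbin/nginx}"
CORE_FILE="${CORE_FILE:-/tmp/core}"

# Allow core files for processes started from this shell, then show
# where the kernel will write them.
enable_core_dumps() {
    ulimit -c unlimited
    cat /proc/sys/kernel/core_pattern
}

# Print a full backtrace from a core file without entering interactive GDB.
print_backtrace() {
    gdb -batch -ex 'bt full' "$NGINX_BIN" "$CORE_FILE"
}

# Usage (not executed here):
#   enable_core_dumps     # run before starting nginx from the same shell
#   print_backtrace       # run after a worker has crashed and left a core file
```

The backtrace is far more useful if Nginx was compiled with debug symbols. Without them GDB mostly shows raw addresses.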