Analyzing why programs started by crontab do not dump core on Linux

Source: Internet
Author: User

A colleague reported that a program had crashed again, but no core file could be found. A crash like this should produce a core, so something on the system side had to be wrong.

First, I checked things at the system level:

1. Find the program's PID, then run cat /proc/PID/limits | grep core. It returned "Max core file size 0 unlimited bytes". That is a big problem: the soft limit is 0. Keep looking for the reason.

2. Run ulimit -c as the user that owns the program. It returns unlimited, so the system settings are fine.

3. Run sysctl -a | grep core_. The core_pattern value is core.%p.%e, so there is no missing-directory or permission problem.

4. Run ssh XX.XX.XX.XX "ulimit -c" to execute the command through a remote login. It returns unlimited, which rules out the SSH-related no-core problem I described in an earlier article.

The basic checks all looked fine, yet no core was produced. Where was the problem? Puzzled, I went through everything once more.
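The check in step 1 can also be scripted. The following is a minimal sketch in Python, assuming a Linux /proc file system; the helper names core_limits and core_limits_of are made up for this illustration and are not part of any tool mentioned in the article.

```python
def core_limits(limits_text):
    """Return (soft, hard) from the 'Max core file size' line of a limits file."""
    for line in limits_text.splitlines():
        if line.startswith("Max core file size"):
            # Fields: four words of the name, then soft limit, hard limit, units.
            fields = line.split()
            return fields[4], fields[5]
    raise ValueError("no 'Max core file size' line found")


def core_limits_of(pid="self"):
    """Read /proc/<pid>/limits (Linux only) and return its core limits."""
    with open(f"/proc/{pid}/limits") as f:
        return core_limits(f.read())


# The line reported in step 1: soft limit 0, hard limit unlimited.
sample = "Max core file size        0                    unlimited            bytes"
print(core_limits(sample))
```

A soft value of 0 with a hard value of unlimited is exactly the combination seen in step 1: the process is allowed to raise the limit, but as it stands it will not write a core file.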

Still nothing, although some other servers and programs were reported to behave normally ... (some hard work omitted). The earlier article described running ulimit -c unlimited first and then restarting sshd, so that SSH connections get this setting by default. In other words, a limit of 0 is most likely inherited from the parent process as well. I looked at what happened around the time the program was started, checking the bash history and the cron logs. Nobody had manually set the core ulimit and then started the game; only cron had restarted the game. So was crontab the problem? I ran ps -ef | grep crond to find the PID of crond, then looked at its limits file under /proc. The value there really was 0. crond had been set to not dump core, so any program it launches will not dump core either unless the limit is raised explicitly. A simple verification:
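The inheritance described here is easy to reproduce. The sketch below is an illustration in Python, not what crond itself does: it lowers the soft core limit in a child just before it is executed, then asks that child what limit it sees. The function names are hypothetical.

```python
import resource
import subprocess
import sys


def lower_soft_core_limit():
    # Runs in the child between fork and exec: keep the hard limit,
    # set the soft limit to 0, as the article found for crond.
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))


def soft_core_limit_seen_by_child():
    # The child does nothing to the limit; it only reports what it inherited.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CORE)[0])"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        preexec_fn=lower_soft_core_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(soft_core_limit_seen_by_child())
```

The child reports a soft limit of 0 even though the parent's own limit is untouched, which is the same situation as a program launched by a crond whose limit is 0.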

I added a scheduled task to crontab that runs /lib64/libss.so.2. This is a library file; running it directly crashes and dumps core, but when it was run from crontab no core was produced. That confirmed it: crontab was the cause of the missing cores.

But why? Why is crond set to not dump core? Logically it should inherit the unlimited value from bash unless something else sets it. I remembered that /etc/crontab can set variables for crond, so I specified the unlimited value there by hand. No effect. Looking at the /etc/init.d/crond startup script, I found that it also loads the /etc/sysconfig/crond settings file, so I specified the value there as well. Still no effect. I then set "breakpoints" at several places in /etc/init.d/crond by adding cat /proc/$$/limits | grep core. Every one of them returned unlimited, which was even more confusing.

Continuing with the "breakpoints", I found that the key moment is when daemon starts crond. So I looked for the source of daemon, which is the file /etc/rc.d/init.d/functions. It contains the definition of the daemon function, and line 239 holds the key command: corelimit="ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0}". The variable DAEMON_COREFILE_LIMIT is not set anywhere, so the soft limit falls back to 0. Borrowing the approach I had used earlier for an SNMP logging problem, I added DAEMON_COREFILE_LIMIT=unlimited to /etc/sysconfig/crond and restarted crond. That was enough.

This fix only covers programs started by crontab. We also often have programs that are started with service, and service scripts load /etc/rc.d/init.d/functions by default. To completely eliminate the case where programs started as system services do not dump core, the line corelimit="ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0}" can simply be commented out.
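To make the effect of ${DAEMON_COREFILE_LIMIT:-0} concrete, here is a small Python sketch of the same defaulting rule. It is only a model of the shell expansion, not code from the functions file; the function name soft_core_limit is hypothetical, and the 1024-byte block size assumes bash outside POSIX mode.

```python
import resource

BLOCK = 1024  # assumption: bash's ulimit -c counts 1024-byte blocks outside POSIX mode


def soft_core_limit(env):
    """Model ${DAEMON_COREFILE_LIMIT:-0}: an unset or empty variable falls back to 0."""
    value = env.get("DAEMON_COREFILE_LIMIT") or "0"
    if value == "unlimited":
        return resource.RLIM_INFINITY
    return int(value) * BLOCK


# Nothing set, as on the affected server: the soft limit becomes 0.
print(soft_core_limit({}))
# After adding DAEMON_COREFILE_LIMIT=unlimited to /etc/sysconfig/crond.
print(soft_core_limit({"DAEMON_COREFILE_LIMIT": "unlimited"}) == resource.RLIM_INFINITY)
```

The point of the model is that the default is 0 rather than "leave the limit alone", so any service started through daemon loses core dumps unless the variable is set.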

After resetting crontab and running /lib64/libss.so.2 again, the crash produced a core. Programs started by crontab could now dump core, so the problem seemed to be solved.

But another question came up: why do some projects on some servers also use crontab to restart their programs, yet not have this problem? So I kept looking ... (business details omitted). The cause: some programs are supervised by a parent guardian process. For those games, the crontab restart simply kills the program, and the guardian then starts it again automatically. Because the guardian itself was started manually, it can dump core normally, and so can the programs it starts. That explains most of the cases, but there was one exception: a project that is not started by a guardian and still does not have this bug. With no explanation for it, I kept looking.

I went through the whole process once more and still had no explanation. Then an idea came back to me. At the very beginning I had suspected that the developers set the ulimit value manually inside the program, but they had said no. If they did not, nothing else could explain it, so I decided to do a simple analysis of the binary myself.

On the server I used the strings command to see whether text such as ulimit or core appears in the binary. It does: there is text about setting the corefile size. That had to be it. I went back to the project developers, who confirmed that the program calls the setrlimit function, which sets the upper bound on the resources a process may use. That was the answer. Because the process changes the limit only for itself and its children, an ordinary user can set this value too, raising the soft limit as far as the hard limit allows.
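The project's own code is not shown in the article, so the following is only a Python equivalent of the kind of setrlimit call the developers described: a process raising its own soft core limit to the hard limit. The function name enable_core_dumps is hypothetical.

```python
import resource


def enable_core_dumps():
    """Raise this process's soft core limit to its hard limit and return the result."""
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may move its soft limit anywhere up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = enable_core_dumps()
print(soft == hard)
```

With the limits seen in step 1 (soft 0, hard unlimited), a call like this inside the program is enough to restore core dumps no matter what crond passed down.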

OK, all three core-dump cases are now explained. A short summary:

1. When crond starts, it gets a core limit of 0 from the system functions file, so none of the programs it launches dump core.

2. The guardian process has core dumps enabled, and the programs it starts inherit this setting directly.

3. A program can set its own core limit.

In addition, to see whether a running program can produce a core, do not make it crash. Check the limits file in its PID directory under the proc file system instead. This was the focus of a previous article. Powered by Liu.
