Recently, our servers have been restarting on their own from time to time, mostly the database servers in our production environment. We asked Dell to troubleshoot, and no hardware problem was found. The system restarted without logging any shutdown information, as if the power had been cut directly, yet the iDRAC web management interface stayed reachable, which shows that network and power were not the problem. The iDRAC log did not record any error messages either, so the misery went on for several days...
Server model: Dell R430
Operating system: CentOS 6.5
Kernel version: 2.6.32-431.el6.x86_64
After digging through a lot of logs, I finally found the following errors:
Sep 29 19:06:22 data-1 kernel: ACPI Error: No handler for Region [SYSI] (ffff88047a7cf390) [IPMI] (20090903/evregion-319)
Sep 29 19:06:22 data-1 kernel: ACPI Error: Region IPMI(7) has no handler (20090903/exfldio-295)
Sep 29 19:06:22 data-1 kernel: ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PMI0._GHL] (Node ffff88047a7ced30), AE_NOT_EXIST
Sep 29 19:06:22 data-1 kernel: ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PMI0._PMC] (Node ffff88047a7ce1a0), AE_NOT_EXIST
Sep 29 19:06:22 data-1 kernel: ACPI Exception: AE_NOT_EXIST, Evaluating _PMC (20090903/power_meter-759)
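To check whether another machine logs the same messages, a small filter over the syslog lines may help. The following is only a sketch: the function name, the regular expression, and the sample lines are assumptions made for illustration, and the message wording may differ on other kernel versions.

```python
import re

# Loose pattern for the ACPI/IPMI messages quoted above. This is an
# assumption based on those five lines, not an official message format.
ACPI_IPMI_PATTERN = re.compile(
    r"kernel: ACPI (?:Error|Exception).*(?:IPMI|PMI0|_PMC|AE_NOT_EXIST)",
    re.IGNORECASE,
)


def find_acpi_ipmi_errors(lines):
    """Return the log lines that look like the ACPI/IPMI errors above."""
    return [
        line.rstrip("\n")
        for line in lines
        if ACPI_IPMI_PATTERN.search(line)
    ]


# In-memory sample: one line from the log above, one made-up unrelated line.
sample = [
    "Sep 29 19:06:22 data-1 kernel: ACPI Error: Region IPMI(7) has no handler (20090903/exfldio-295)\n",
    "Sep 29 19:06:23 data-1 sshd[1234]: Accepted publickey for root\n",
]
matches = find_acpi_ipmi_errors(sample)
print(len(matches), "matching line(s)")
```

On a CentOS 6 host the same function could be fed an open file object for /var/log/messages instead of the sample list, assuming the kernel messages are written there.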
This is a kernel bug. The trouble is that it does not cause a reboot right away. Instead, the server restarts automatically once it has been running for more than 200 days without a reboot, like a time bomb.
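Since the trigger is uptime, it can be useful to know how close a machine is to the 200-day mark. The sketch below converts the contents of /proc/uptime into days and compares the result with a warning threshold. The function names and the 180-day threshold are hypothetical choices for illustration, and the sample string is made up.

```python
SECONDS_PER_DAY = 86400.0

# Hypothetical safety margin below the roughly 200 days mentioned above.
WARN_AFTER_DAYS = 180


def uptime_days(proc_uptime_text):
    """Convert the contents of /proc/uptime into days of uptime."""
    # /proc/uptime holds two numbers: seconds since boot and idle seconds.
    seconds_since_boot = float(proc_uptime_text.split()[0])
    return seconds_since_boot / SECONDS_PER_DAY


def needs_attention(proc_uptime_text, warn_after_days=WARN_AFTER_DAYS):
    """Return True when the uptime has reached the warning threshold."""
    return uptime_days(proc_uptime_text) >= warn_after_days


# Made-up sample: 17280000 seconds is exactly 200 days.
example = "17280000.00 12345.67"
print(uptime_days(example), needs_attention(example))
```

On a live Linux host, the text would come from reading /proc/uptime, for example with open("/proc/uptime").read().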
Solution:
Upgrade the kernel.
Upgrading to 2.6.32-642.4.2.el6.x86_64 solved the problem for me.
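To find hosts that still run a kernel older than the one used here, the release strings can be compared numerically. This is a rough sketch that assumes RHEL/CentOS-style release strings such as the two quoted in this post. The helper names are hypothetical, and the reference release is simply the version upgraded to above, not necessarily the first release that contains the fix.

```python
import re

# The release this post upgraded to; used only as a reference point.
UPGRADED_RELEASE = "2.6.32-642.4.2.el6.x86_64"


def kernel_release_key(release):
    """Turn '2.6.32-431.el6.x86_64' into (2, 6, 32, 431) for comparison."""
    # Keep only the numeric fields before the distribution tag (e.g. '.el6').
    numeric_part = re.split(r"\.el\d", release, maxsplit=1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", numeric_part))


def is_older_than(release, reference=UPGRADED_RELEASE):
    """Return True if `release` sorts before `reference`."""
    return kernel_release_key(release) < kernel_release_key(reference)


print(is_older_than("2.6.32-431.el6.x86_64"))
```

On a running system the current release string can be obtained with platform.release() from the Python standard library.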
This article comes from the "Billy98 blog". Please keep this source link when reposting: http://billy98.blog.51cto.com/1285143/1858190
Linux restarts automatically after running for more than 200 days