[sh] <defunct> and other zombie processes making the system very slow (ORA-00445)


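During today's lunch break I received a request: a system was running very slowly, and top showed that most of its processes were zombies.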

The Oracle version is 11.2.3.x.

The OS is Linux 5.6 x86-64.


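On seeing zombie processes, my first suspicion was a problem in cron.

I checked with crontab -l and crontab -u oracle -l

and found no jobs at all.

For this kind of problem, you can append > /dev/null 2>&1 to each line in the crontab.

Reference: http://blog.sina.com.cn/s/blog_6226824c01014ee8.html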
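
As a minimal sketch of what that redirection does: cron mails whatever a job prints to the job's owner, so presumably the idea is that a job which prints nothing leaves cron with nothing to deliver. The function name noisy_job and the crontab path and schedule in the comment are made up for illustration.

```shell
# Hypothetical job that writes to both stdout and stderr.
noisy_job() {
    echo "progress message"
    echo "warning message" >&2
}

# Send stdout to /dev/null, then point stderr at the same place.
# The command substitution captures only what is left on stdout.
captured=$(noisy_job > /dev/null 2>&1)
echo "captured: [${captured}]"

# The equivalent crontab entry (path and schedule are made up):
# 0 2 * * * /path/to/job.sh > /dev/null 2>&1
```

The order matters: 2>&1 has to come after > /dev/null so that stderr follows stdout to its new destination.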

Next I ran pstree -ap

and saw that the zombie processes had been spawned by processes such as psp0, cjq0, mmon and smco.
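
pstree shows the whole tree. For a flat list of zombies and their parent PIDs, something like the following can also be used. This is only a sketch: list_zombies is a made-up helper name, and the ps format string in the usage comment assumes the procps ps found on Linux.

```shell
# Reads "PID PPID STAT COMMAND" lines on stdin and prints
# "PID PPID COMMAND" for every process whose state starts with Z (zombie).
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Usage (the trailing "=" suppresses each column header):
# ps -eo pid=,ppid=,stat=,comm= | list_zombies
```

The second column of the output is the parent PID, which is the process that has not yet reaped the zombie.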

I searched metalink for "defunct processes" and compared the results with the information in the alert log, which led me to this document:

ORA-00445: Background Process "xxxx" Did Not Start After 120 Seconds (Doc ID 1345364.1)
Last modified: 2014-2-9  Type: PROBLEM

In this Document

  Symptoms
  Changes
  Cause
  Solution


APPLIES TO

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.1.0.1 [Release 11.2 to 12.1]
CRM On Demand - Version N/A to N/A
IBM: Linux on System z
Linux x86-64
Linux x86

SYMPTOMS

Errors are seen in the alert log relating to spawning of processes such as:

@ Checked for relevance on 17th Jan 2012
ORA-00445: background process "m001" did not start after 120 seconds
Incident details in: /opt/u01/app/oracle/diag/rdbms/incident/incdir_3721/db1_mmon_7417_i3721.trc
ERROR: Unable to normalize symbol name for the following short stack (at offset 2):
Tue Jun 21 03:03:06 2011
ORA-00445: background process "J003" did not start after 120 seconds


or

Waited for process W002 to initialize for 60 seconds



The system appears to be running very slowly and defunct processes can appear.
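
To see which background processes are affected and how often, the alert log can be summarised with a small filter. This is a sketch that is not part of the Oracle note: ora445_processes is a hypothetical helper name, and it matches only the ORA-00445 message format shown above.

```shell
# Reads alert log text on stdin and prints "PROCESS COUNT" for every
# background process named in an ORA-00445 message.
ora445_processes() {
    sed -n 's/.*ORA-00445: background process "\([^"]*\)" did not start.*/\1/p' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Usage (the log path is a placeholder):
# ora445_processes < /path/to/alert.log
```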

CHANGES

REDHAT 5 kernel 2.6.18-194.el5 #1 SMP Tue Mar 16
Oracle 11.2.0.2 Single Instance
IBM: Linux on System z

CAUSE

Recent Linux kernels have a feature called Address Space Layout Randomization (ASLR).
ASLR is activated by default on some of the newer Linux distributions.
It is designed to load shared memory objects at random addresses.
In Oracle, multiple processes map a shared memory object at the same address across the processes.

With ASLR turned on Oracle cannot guarantee the availability of this shared memory address.
This conflict in the address space means that a process trying to attach a shared memory object to a specific address may not be able to do so, resulting in a failure in the shmat subroutine.

However, on subsequent retry (using a new process) the shared memory attachment may work.
The result is a "random" set of failures in the alert log.

 

SOLUTION

It should be noted that this problem has only been positively diagnosed in Redhat 5 and Oracle 11.2.0.2.
It is also likely, as per unpublished BUG:8527473, that this issue will reproduce on generic Linux platforms running any Oracle 11.2.0.x or 12.1.0.x release on Redhat/OEL kernels which have ASLR.

This issue has been seen in both Single Instance and RAC environments.

ASLR also exists in SLES 10 and SLES 11 kernels and is turned on by default. To date no problem has been seen on SuSE servers running Oracle, but Novell confirms that ASLR may cause problems. Please refer to:

http://www.novell.com/support/kb/doc.php?id=7004855 mmap occasionally infringes on stack

You can verify whether ASLR is being used as follows:

 # /sbin/sysctl -a | grep randomize
kernel.randomize_va_space = 1

If the parameter is set to any value other than 0 then ASLR is in use.
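
The same check can be scripted by reading the value from /proc. The following is a sketch, not part of the Oracle note: aslr_status is a hypothetical helper name, and it applies the rule stated above that any value other than 0 means ASLR is in use.

```shell
# Interprets a kernel.randomize_va_space value passed as $1.
aslr_status() {
    case "$1" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

# Usage on a Linux host:
# aslr_status "$(cat /proc/sys/kernel/randomize_va_space)"
```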

To permanently disable ASLR on Redhat 5, add or modify these parameters in /etc/sysctl.conf:

kernel.randomize_va_space=0
kernel.exec-shield=0

You need to reboot for kernel.exec-shield parameter to take effect. 

Note that both kernel parameters are required for ASLR to be switched off.
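
Before rebooting, it can be worth double-checking that both settings really made it into the file. The following is a minimal sketch, not part of the Oracle note: conf_disables_aslr is a hypothetical helper name, and it assumes that when a key appears more than once the last uncommented assignment is the one that counts.

```shell
# Succeeds only if the sysctl.conf-style file given as $1 sets both
# kernel.randomize_va_space and kernel.exec-shield to 0.
conf_disables_aslr() {
    conf_file=$1
    for key in kernel.randomize_va_space kernel.exec-shield; do
        # Strip blanks around key and value, then keep the last match.
        # Commented-out lines do not match because of the leading "#".
        value=$(awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { v = $2; gsub(/[ \t]/, "", v); found = v }
            END { print found }' "$conf_file")
        [ "$value" = "0" ] || return 1
    done
    return 0
}

# Usage:
# conf_disables_aslr /etc/sysctl.conf && echo "both parameters are set to 0"
```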

 

There may be other reasons for a process failing to start; however, by switching ASLR off, you can quickly rule out ASLR as the problem. More and more issues are being identified when ASLR is in operation.





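By setting the two kernel parameters I turned off the Linux feature called ASLR, and then rebooted the server. The system is now back to normal, though it still needs further observation.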
