/limits.conf: an Oracle bug that leaves CRS short of processes


Tags: rac grid oracle

While checking SMIDB today, I found many errors in the CRS alert log, specifically:

    2015-08-19 17:12:21.745:
    [/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
    2015-08-19 17:13:09.986:
    [/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
    2015-08-19 17:13:21.758:
    [/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"

Tracing further into the agent log file referenced above showed:

    2015-08-19 17:14:09.993: [ora.LISTENER.lsnr][1342174976]{1:63186:26462} [check] clsn_agent::check: Exception SclsProcessSpawnException
    2015-08-19 17:14:21.744: [ora.asm][1342174976]{0:21:2} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0
    2015-08-19 17:14:21.759: [ora.asm][1342174976]{0:21:2} [check] AsmProxyAgent::check clsagfw_res_status 0
    2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] Utils:execCmd action = 3 flags = 38 ohome = (null) cmdname = lsnrctl.
    2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:)Utils:execCmd scls_process_spawn() failed 1
    2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:) category: -2, operation: fork, loc: spawnproc28, OS error: 11, other: forked failed [-1]
    2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] clsnUtils::error Exception type=2 string=CRS-5013: Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"


The ONS log:

    [[email protected] logs]$ tail ons.out
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    pthread_create() Resource temporarily unavailable
    [2015-05-07T03:09:22+08:00] [ons] [TRACE:2] [] [internal] ONS worker process stopped (0)


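These errors mean that processes cannot be started because a system resource is exhausted: on Linux, OS error 11 is EAGAIN ("Resource temporarily unavailable"), which fork and pthread_create typically return when the per-user process limit has been reached. So the next step was to check the ulimit settings: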

    [[email protected] logs]$ ulimit -u
    10240

limits.conf:

    # End of file
    grid soft nproc 10240
    grid hard nofile 65536
    oracle soft nproc 10240
    oracle hard nofile 65536

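The limits.conf configuration has some problems: neither hard nproc nor soft nofile is configured for either user. This will be corrected before the restart next Monday.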
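The gap described above can also be checked mechanically. The following is only a minimal sketch: `missing_limits` is a hypothetical helper name, and it assumes that only lines naming the user directly matter (wildcard and @group entries, and files under limits.d, are ignored).

```shell
# Sketch only: list the soft/hard nproc and nofile entries that a user lacks
# in a limits.conf-style file. missing_limits is a hypothetical helper name.
# Assumption: only lines naming the user directly are considered; wildcard (*)
# and @group entries, and files under limits.d, are ignored.
missing_limits() {
    user="$1"
    file="$2"
    for kind in soft hard; do
        for item in nproc nofile; do
            # A type of "-" sets both the soft and the hard limit.
            if ! awk -v u="$user" -v k="$kind" -v i="$item" '
                $1 == u && ($2 == k || $2 == "-") && $3 == i { found = 1 }
                END { exit !found }' "$file"
            then
                echo "$user $kind $item"
            fi
        done
    done
}
```

Called as `missing_limits grid /etc/security/limits.conf` against the entries shown above, it would report `grid soft nofile` and `grid hard nproc`, which matches the two gaps noted by hand.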

    [[email protected] pam.d]$ cat login
    #%PAM-1.0
    auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
    auth       include      system-auth
    account    required     pam_nologin.so
    account    include      system-auth
    password   include      system-auth
    # pam_selinux.so close should be the first session rule
    session    required     pam_selinux.so close
    session    required     pam_loginuid.so
    session    optional     pam_console.so
    # pam_selinux.so open should only be followed by sessions to be executed in the user context
    session    required     pam_selinux.so open
    session    required     pam_namespace.so
    session    optional     pam_keyinit.so force revoke
    session    include      system-auth
    -session   optional     pam_ck_connector.so
    [[email protected] pam.d]$


The /etc/pam.d/login file does not load the resource-limits module itself; the following line should be added here:

session required /lib64/security/pam_limits.so
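A quick way to confirm whether a PAM service file loads the module is sketched below. `has_pam_limits` is a hypothetical helper name, and the check is deliberately narrow: it inspects only the file it is given and does not follow `include` lines, so a file such as system-auth may still load pam_limits.so on its behalf.

```shell
# Sketch only: succeed if a PAM service file contains an uncommented
# "session" line that loads pam_limits.so.
# has_pam_limits is a hypothetical helper name. It does not follow
# "include" lines, so included files (for example system-auth) are not checked.
has_pam_limits() {
    grep -Eq '^[[:space:]]*-?session[[:space:]]+[^#]*pam_limits\.so' "$1"
}
```

For example, `has_pam_limits /etc/pam.d/login && echo present || echo missing` prints which case applies; running it against /etc/pam.d/system-auth as well covers the included file.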

Searching online turned up a document on Oracle MOS that matches our situation exactly:

The processes and resources started by CRS (Grid Infrastructure) do not inherit the ulimit setting for "max user processes" from /etc/security/limits.conf setting (Doc ID 1594606.1)


Verification confirmed it: although ulimit -u for our grid user is set to 10240, the processes started by CRS actually run with a limit of 1024.
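The difference between the login shell's limit and the limit of an already running process can be read from /proc. This is a minimal sketch; `max_processes_soft` is a hypothetical helper name, and the commented usage assumes that `pgrep -f pmon` finds the process of interest on the host.

```shell
# Sketch only: print the soft "Max processes" limit from a file in
# /proc/<pid>/limits format. max_processes_soft is a hypothetical helper name.
max_processes_soft() {
    # Columns are: "Max" "processes" <soft limit> <hard limit> <units>
    awk '/^Max processes/ { print $3 }' "$1"
}

# Example usage (assumption: "pgrep -f pmon" matches the process to inspect):
#   pid=$(pgrep -f pmon | head -n 1)
#   echo "running process: $(max_processes_soft "/proc/$pid/limits")"
#   echo "login shell:     $(ulimit -u)"
```

If the two numbers differ, the running process did not inherit the limit configured for the user's login sessions.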

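This is Oracle Bug 17301761. Our database version is 11.2.0.4, which falls within the range affected by this bug.

There are two ways to resolve it:

1. Apply the patch.

2. Work around it using the method given by MOS, as follows: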

The ohasd script needs to be modified to set the ulimit explicitly for all grid and database resources that are started by the Grid Infrastructure (GI).

1) go to GI_HOME/bin

2) make a backup of ohasd script file

3) in the ohasd script file, locate the following code:

    Linux)
        # MEMLOCK limit is for Bug 9136459
        ulimit -l unlimited
        if [ "$?" != "0" ]
        then
            $CLSECHO -phas -f crs -l -m 6021 "l" "unlimited"
        fi
        ulimit -c unlimited
        if [ "$?" != "0" ]
        then
            $CLSECHO -phas -f crs -l -m 6021 "c" "unlimited"
        fi
        ulimit -n 65536

In the above code, insert the following line just before the line with "ulimit -n 65536":

       ulimit -u 16384

4) Recycle CRS manually so that ohasd picks up the new ulimit setting.

After the database is started, issue "ps -ef | grep pmon" and get the pid of the pmon process.
Then issue "cat /proc/<pid of the pmon process>/limits | grep process" and check whether Max processes is set to 16384.
Setting the number of processes to 16384 should be enough for most servers, since having 16384 processes normally means the server is very heavily loaded. A smaller number such as 4096 or 8192 should also suffice for most users.
In addition to the above, the ohasd template needs to be modified to ensure that the new ulimit setting persists even after a patch is applied.

1) go to GI_HOME/crs/sbs

2) make a backup of crswrap.sh.sbs

3) in crswrap.sh.sbs, insert the following line just before the line "# MEMLOCK limit is for Bug 9136459"

       ulimit -u 16384

Finally, although the above setting has been used successfully to increase the process limit, please test it on a test server first before setting the ulimit in production.



Reference: http://blog.csdn.net/weiwangsisoftstone/article/details/42460585

