HP-UNIX System Crash



During this morning's routine database inspection I found that node 2 was down. I first checked the alert log on node 2 and found nothing, and all CRS resources were offline as well, so the most likely explanation is that the operating system itself crashed and therefore left no record.

The system login log shows that the machine was rebooted:

    # last | grep Dec
    root     pts/1        Mon Dec 17 10:08   still logged in
    root     pts/0        Mon Dec 17 09:33   still logged in
    reboot   system boot  Sun Dec 16 08:16   still logged in
    reboot   system boot  Sat Dec 15 23:59 - 08:16  (08:16)

However, I am not sure whether all of the reboot information is recorded here, and the third record still shows "still logged in". That question has to be left to the HP engineers.

Checking /etc/shutdownlog turned up the following new entries:

    00:03  Sun Dec 16 2012.  Reboot after panic: MCA, IIP:0xe0000000008a1a40 IFA:0xc000000006dae000
    08:18  Sun Dec 16 2012.  Reboot after panic: MCA, IIP:0xe000000000d650a0 IFA:0x20000000777db0cc

The alert log on node 1 contains the following:

    Sat Dec 15 23:55:30 2012
    Errors in file /opt/oracle/product/admin/xxx/udump/xxx1_ora_4074.trc:
    Sat Dec 15 23:55:31 2012
    Errors in file /opt/oracle/product/admin/xxx/udump/xxx1_ora_4074.trc:
    Sat Dec 15 23:55:34 2012
    Reconfiguration started (old inc 100, new inc 102)
    List of nodes: 0

The CRS log shows:

    2012-12-15 23:55:16.183[cssd(4229)]CRS-1612:node xxx2 (0) at 50% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:23.183[cssd(4229)]CRS-1611:node xxx2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:24.181[cssd(4229)]CRS-1611:node xxx2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:28.183[cssd(4229)]CRS-1610:node xxx2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:29.180[cssd(4229)]CRS-1610:node xxx2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:30.183[cssd(4229)]CRS-1610:node xxx2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    2012-12-15 23:55:30.682[cssd(4229)]CRS-1607:CSSD evicting node xxx2. Details in /opt/oracle/product/crs/log/xxx1/cssd/ocssd.log.
    [cssd(4229)]CRS-1601:CSSD Reconfiguration complete. Active nodes are xxx1 .
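When several panics pile up, it can help to pull the reboot entries out of /etc/shutdownlog programmatically so they can be lined up against the cluster logs. The following is only a minimal sketch: the entry layout is inferred from the two lines quoted in this article, the function name `parse_panic_reboots` is made up for illustration, and date parsing assumes an English locale.

```python
import re
from datetime import datetime

# Assumed entry layout, inferred only from the two lines quoted in this article:
#   HH:MM  Day Mon DD YYYY.  Reboot after panic: <type>, IIP:<hex> IFA:<hex>
PANIC_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2})\s+"
    r"(?P<date>\w{3} \w{3} +\d{1,2} \d{4})\.\s+"
    r"Reboot after panic:\s*(?P<kind>[^,]+),\s*"
    r"IIP:(?P<iip>0x[0-9a-fA-F]+)\s+IFA:(?P<ifa>0x[0-9a-fA-F]+)"
)


def parse_panic_reboots(lines):
    """Return (timestamp, panic type, IIP, IFA) for each line in the assumed layout."""
    events = []
    for line in lines:
        m = PANIC_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a panic reboot entry
        when = datetime.strptime(
            f"{m['date']} {m['time']}", "%a %b %d %Y %H:%M"
        )
        events.append((when, m["kind"].strip(), m["iip"], m["ifa"]))
    return events


# The two entries quoted in the article, used here as sample input.
SAMPLE = [
    "00:03  Sun Dec 16 2012.  Reboot after panic: MCA, "
    "IIP:0xe0000000008a1a40 IFA:0xc000000006dae000",
    "08:18  Sun Dec 16 2012.  Reboot after panic: MCA, "
    "IIP:0xe000000000d650a0 IFA:0x20000000777db0cc",
]

if __name__ == "__main__":
    for when, kind, iip, ifa in parse_panic_reboots(SAMPLE):
        print(when.isoformat(sep=" "), kind, iip, ifa)
```

On a real system the list of lines would come from reading /etc/shutdownlog itself. Lines in any other layout are silently skipped rather than guessed at.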
The cssd log shows:

    [    CSSD]2012-12-15 23:55:16.183 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 50 2.000000e+00artbeat fatal, eviction in 14.489 seconds
    [    CSSD]2012-12-15 23:55:16.183 [18] >TRACE:   clssnmPollingThread: node xxx2 (2) is impending reconfig, flag 1037, misstime 15511
    [    CSSD]2012-12-15 23:55:16.183 [18] >TRACE:   clssnmPollingThread: diskTimeout set to (27000)ms impending reconfig status(1)
    [    CSSD]2012-12-15 23:55:23.183 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 75 2.000000e+00artbeat fatal, eviction in 7.489 seconds
    [    CSSD]2012-12-15 23:55:24.181 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 75 2.000000e+00artbeat fatal, eviction in 6.490 seconds
    [    CSSD]2012-12-15 23:55:28.183 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 90 2.000000e+00artbeat fatal, eviction in 2.489 seconds
    [    CSSD]2012-12-15 23:55:29.180 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 90 2.000000e+00artbeat fatal, eviction in 1.491 seconds
    [    CSSD]2012-12-15 23:55:30.183 [18] >WARNING: clssnmPollingThread: node xxx2 (2) at 90 2.000000e+00artbeat fatal, eviction in 0.489 seconds

(The text "2.000000e+00artbeat" is reproduced as it appears in the log; judging by the matching CRS log messages, it appears to stand for "% heartbeat".)

This shows that the cluster was being reconfigured at that moment and that node 2 was evicted from it. Once the storage had been set to active, the cluster started automatically on node 2 and normal production resumed.

/var/adm/syslog/syslog.log and the old log confirm that the node's operating system rebooted. Strangely, there are no log messages at all from before the reboot, so the only option was to package the system crash information under the /var/adm/crash directory (the crash files can be roughly inspected with q4) and send it to HP technical support.
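One way to sanity-check the cssd warnings is to add each warning's "eviction in N seconds" countdown to its own timestamp and see whether they all project the same eviction time. The sketch below does that for three of the WARNING lines quoted above; the regular expression is inferred from those lines only, and the helper name `projected_evictions` is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Pattern inferred from the WARNING lines quoted in this article; other
# clusterware versions may format these messages differently.
WARN_RE = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}).*"
    r"node (?P<node>\S+) \(\d+\) at (?P<pct>\d+).*"
    r"eviction in (?P<left>\d+\.\d+) seconds"
)


def projected_evictions(lines):
    """Return (node, percent, projected eviction time) per heartbeat warning."""
    rows = []
    for line in lines:
        m = WARN_RE.search(line)
        if m is None:
            continue  # TRACE lines and anything else are ignored
        seen = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        remaining = timedelta(seconds=float(m["left"]))
        rows.append((m["node"], int(m["pct"]), seen + remaining))
    return rows


# Three warnings and one TRACE line copied from the cssd log in the article.
SAMPLE = [
    "[    CSSD]2012-12-15 23:55:16.183 [18] >WARNING: clssnmPollingThread: "
    "node xxx2 (2) at 50 2.000000e+00artbeat fatal, eviction in 14.489 seconds",
    "[    CSSD]2012-12-15 23:55:16.183 [18] >TRACE:   clssnmPollingThread: "
    "node xxx2 (2) is impending reconfig, flag 1037, misstime 15511",
    "[    CSSD]2012-12-15 23:55:23.183 [18] >WARNING: clssnmPollingThread: "
    "node xxx2 (2) at 75 2.000000e+00artbeat fatal, eviction in 7.489 seconds",
    "[    CSSD]2012-12-15 23:55:30.183 [18] >WARNING: clssnmPollingThread: "
    "node xxx2 (2) at 90 2.000000e+00artbeat fatal, eviction in 0.489 seconds",
]

if __name__ == "__main__":
    for node, pct, when in projected_evictions(SAMPLE):
        print(node, pct, when.strftime("%H:%M:%S.%f")[:-3])
```

For these lines every warning projects an eviction at about 23:55:30.672, which is close to the CRS-1607 eviction message logged at 23:55:30.682 in the CRS log.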
