Oracle RAC Database 11.1.0.6 Listener Failure Case


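I received a call: one of the customer's core Oracle RAC databases could not be reached, and connection attempts failed with a "no listener" error. The environment was Oracle RAC 11.1.0.6 on AIX 6.1.05 with IBM HACMP 5.5.0.8.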

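When I connected remotely, I found that node 2 no longer had any processes running as the oracle user, the concurrent VG was not activated, and the HACMP services were offline.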

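On the other node the Oracle instance was healthy and some server processes were still working, but the local listener had failed, so new connections could not reach the instance. crs_stat -t showed the listeners for both instances as OFFLINE.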

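No LISTENER process was found on this node. After all server processes had been killed manually, starting the listener as the oracle user returned the following error: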

$ lsnrctl start listener_cdfy740a

LSNRCTL for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production on 20-NOV-2014 20:09:09

Copyright (c) 1991, 2007, Oracle.  All rights reserved.

Starting /oracle/app/oracle/product/11.1.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production
System parameter file is /oracle/app/oracle/product/11.1.0/db_1/network/admin/listener.ora
Log messages written to /oracle/app/oracle/diag/tnslsnr/cdfy740a/listener_cdfy740a/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.107.64.1)(PORT=1521)))
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.107.64.1)(PORT=1521)(IP=FIRST)))
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
  IBM/AIX RISC System/6000 Error: 67: Address already in use

Listener failed to start. See the error message(s) above...


10.107.64.1 is this node's VIP address. The hosts configuration for the RAC environment is as follows:

10.107.64.1    vip1
10.107.64.2    vip2
10.107.64.3    cdfy740a
10.107.64.4    cdfy740b
172.201.201.1  prv1
172.201.201.2  prv2

Manually stop the nodeapps service on this node:

cdfy740a@root[/oracle/app/11.1.0/crs/bin]./srvctl stop nodeapps -n cdfy740a

After it stopped successfully, the VIP was gone at the host level:

cdfy740a@root[/oracle/app/11.1.0/crs/bin]ifconfig -a | more
en0: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.200.200.1 netmask 0xffffff00 broadcast 172.200.200.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.201.201.1 netmask 0xffffff00 broadcast 172.201.201.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en4: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.107.64.3 netmask 0xffffff00 broadcast 10.107.64.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

Start the nodeapps service on the node again:

cdfy740a@root[/oracle/app/11.1.0/crs/bin]./srvctl start nodeapps -n cdfy740a
CRS-1006: No more members to consider
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:LSNRCTL for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production on 20-NOV-2014 20:13:07
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Copyright (c) 1991, 2007, Oracle.  All rights reserved.
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Starting /oracle/app/oracle/product/11.1.0/db_1/bin/tnslsnr: please wait...
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:TNSLSNR for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:System parameter file is /oracle/app/oracle/product/11.1.0/db_1/network/admin/listener.ora
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Log messages written to /oracle/app/oracle/diag/tnslsnr/cdfy740a/listener_cdfy740a/alert/log.xml
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.107.64.1)(PORT=1521)))
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.107.64.1)(PORT=1521)(IP=FIRST)))
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:TNS-12542: TNS:address already in use
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr: TNS-12560: TNS:protocol adapter error
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  TNS-00512: Address already in use
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  IBM/AIX RISC System/6000 Error: 67: Address already in use
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Listener failed to start. See the error message(s) above...
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:LSNRCTL for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production on 20-NOV-2014 20:13:08
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Copyright (c) 1991, 2007, Oracle.  All rights reserved.
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=vip1)(PORT=1521)))
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:TNS-12541: TNS:no listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr: TNS-12560: TNS:protocol adapter error
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  TNS-00511: No listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  IBM/AIX RISC System/6000 Error: 79: Connection refused
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.107.64.1)(PORT=1521)(IP=FIRST)))
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:TNS-12541: TNS:no listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr: TNS-12560: TNS:protocol adapter error
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  TNS-00511: No listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  IBM/AIX RISC System/6000 Error: 79: Connection refused
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.107.64.3)(PORT=1521)(IP=FIRST)))
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:TNS-12541: TNS:no listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr: TNS-12560: TNS:protocol adapter error
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  TNS-00511: No listener
cdfy740a:ora.cdfy740a.LISTENER_CDFY740A.lsnr:  IBM/AIX RISC System/6000 Error: 79: Connection refused
CRS-0215: Could not start resource 'ora.cdfy740a.LISTENER_CDFY740A.lsnr'.


Earlier, checking the listener with lsnrctl status listener_cdfy740a had also returned a Connection refused error.
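
"Connection refused" on port 1521 suggests that nothing is actually listening there, which makes the "address already in use" error at startup look like a conflict inside the listener's own configuration rather than another process holding the port. A minimal sketch of such a probe follows; it was not part of the original troubleshooting, and the function name probe_tcp and the 3-second timeout are illustrative assumptions.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP endpoint as 'listening', 'refused', or 'unreachable'."""
    try:
        # A completed handshake means some process accepts connections there.
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        # The host answered, but no socket is listening on that port.
        return "refused"
    except OSError:
        # Timeouts, routing failures, name resolution errors, and so on.
        return "unreachable"
```

For this case one would call probe_tcp("10.107.64.1", 1521) and probe_tcp("10.107.64.3", 1521) from the node itself; a "refused" result matches the TNS-12541 output above.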

At the host level, the VIP address has been bound successfully:
cdfy740a@root[/oracle/app/11.1.0/crs/bin]ifconfig -a | more
en0: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.200.200.1 netmask 0xffffff00 broadcast 172.200.200.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.201.201.1 netmask 0xffffff00 broadcast 172.201.201.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en4: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.107.64.3 netmask 0xffffff00 broadcast 10.107.64.255
        inet 10.107.64.1 netmask 0xffffff00 broadcast 10.107.64.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

Try starting the local listener manually again:

cdfy740a@root[/]su - oracle
$ lsnrctl start listener_cdfy740a

LSNRCTL for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production on 20-NOV-2014 20:18:37

Copyright (c) 1991, 2007, Oracle.  All rights reserved.

Starting /oracle/app/oracle/product/11.1.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production
System parameter file is /oracle/app/oracle/product/11.1.0/db_1/network/admin/listener.ora
Log messages written to /oracle/app/oracle/diag/tnslsnr/cdfy740a/listener_cdfy740a/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.107.64.1)(PORT=1521)))
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.107.64.1)(PORT=1521)(IP=FIRST)))
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
  IBM/AIX RISC System/6000 Error: 67: Address already in use

Listener failed to start. See the error message(s) above...

The start still fails.
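
Note that the output reports "Listening on" for 10.107.64.1:1521 first and only then "Error listening on" for the same address and port. The operating system rejects a second bind to an endpoint that is already bound, even when both attempts come from the same process. The sketch below reproduces that behaviour on the loopback interface; it is an illustration under that assumption, not the listener's actual code, and the function name is hypothetical.

```python
import socket


def bind_same_address_twice() -> int:
    """Bind an endpoint, bind it again, and return the errno of the second
    attempt (0 if the second bind unexpectedly succeeds)."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen(1)
        endpoint = first.getsockname()
        try:
            second.bind(endpoint)  # same IP and port as the first socket
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()
```

On typical systems the returned value equals errno.EADDRINUSE. The numeric value is platform specific; the AIX output above reports it as error 67.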

Check the listener configuration file:
$ cat listener.ora
# listener.ora.cdfy740a Network Configuration File: /oracle/app/oracle/product/11.1.0/db_1/network/admin/listener.ora.cdfy740a
# Generated by Oracle configuration tools.

LISTENER_CDFY740A =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = vip1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.107.64.1)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.107.64.3)(PORT = 1521)(IP = FIRST))
    )
  )

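In the listener configuration file, vip1 and 10.107.64.1 are the same address listed twice: according to the hosts file, vip1 resolves to 10.107.64.1. After the line containing 10.107.64.1 was removed manually, the listener started normally.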
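
This kind of duplicate is easy to miss by eye because one entry uses a hostname and the other an IP. A minimal sketch of an automated check follows, assuming hosts-file style name resolution and that each ADDRESS writes HOST immediately before PORT, as in the file above. The helper names parse_hosts and duplicate_endpoints are hypothetical, and the sample data is copied from this case.

```python
import re
from collections import defaultdict

# Matches "(HOST = name)(PORT = 1521)" with optional whitespace.
ADDRESS_RE = re.compile(
    r"\(\s*HOST\s*=\s*([^)\s]+)\s*\)\s*\(\s*PORT\s*=\s*(\d+)\s*\)",
    re.IGNORECASE,
)

HOSTS = """\
10.107.64.1    vip1
10.107.64.2    vip2
10.107.64.3    cdfy740a
10.107.64.4    cdfy740b
"""

LISTENER_ORA = """\
(ADDRESS = (PROTOCOL = TCP)(HOST = vip1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.107.64.1)(PORT = 1521)(IP = FIRST))
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.107.64.3)(PORT = 1521)(IP = FIRST))
"""


def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        for name in fields[1:]:
            mapping[name.lower()] = fields[0]
    return mapping


def duplicate_endpoints(listener_ora, hosts):
    """Return {(ip, port): [hosts as written]} for endpoints listed twice or more."""
    seen = defaultdict(list)
    for host, port in ADDRESS_RE.findall(listener_ora):
        ip = hosts.get(host.lower(), host)  # literal IPs pass through unchanged
        seen[(ip, int(port))].append(host)
    return {ep: names for ep, names in seen.items() if len(names) > 1}
```

With the data from this case, duplicate_endpoints(LISTENER_ORA, parse_hosts(HOSTS)) returns {('10.107.64.1', 1521): ['vip1', '10.107.64.1']}, which points at the two conflicting lines.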

After that, the HACMP services on node 2 were restored, and Oracle RAC returned to normal.

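In addition, the customer's listener log had grown very large, to roughly 1.6 GB. An oversized listener log can also make the listener unstable, so the listener logs on both nodes were renamed.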
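
The article does not say how the rename was carried out. The following is a minimal sketch of a size-triggered rename, assuming a 1 GiB threshold and a timestamp suffix; both are illustrative choices, as is the function name. Because a running listener may keep the old file open, logging is commonly switched off with lsnrctl set log_status off before the rename and switched back on afterwards.

```python
import os
import time


def rename_if_large(path, limit_bytes=1 << 30):
    """Rename path to path.<timestamp> when it is larger than limit_bytes.

    Returns the new file name, or None when nothing was renamed.
    """
    if not os.path.isfile(path) or os.path.getsize(path) <= limit_bytes:
        return None
    new_path = f"{path}.{time.strftime('%Y%m%d%H%M%S')}"
    os.rename(path, new_path)
    return new_path
```

A scheduled job could call this against the listener log on each node so the file never reaches a size like the one seen here.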
