Oracel DG技術本身,是藉助redo log的傳遞和應用,確保在standby端資料和primary端保持一致資料。在這個過程中,Redo Transport和Redo Apply是兩個核心動作。Redo Transport是將Redo Log資訊傳遞到Standby端,等待進行Apply。而Redo Apply就是將這些日誌應用執行,更改Standby端的資料,來實現一致。
下面實驗,就是利用alert log來觀察一對Primary和Standby在啟動過程、工作過程中傳遞日誌的情況。從而證明Oracle DG的工作特點和機制。
相關參考:
Oracle Data Guard 重要配置參數
基於同一主機配置 Oracle 11g Data Guard
探索Oracle之11g DataGuard
Oracle Data Guard (RAC+DG) 歸檔刪除策略及指令碼
Oracle Data Guard 的角色轉換
Oracle Data Guard的日誌FAL gap問題
Oracle 11g Data Guard Error 16143 Heartbeat failed to connect to standby 處理方法
1、環境介紹
我們在Oracle 11g上進行實驗,版本為11.2.0.4。由於環境限制,筆者Primary和Physical Standby在相同伺服器上。Primary執行個體名稱為ora11g,Standby執行個體名為ora11gsy。
監聽程式首先關閉,來查看資料庫行為。
[oracle@SimpleLinux ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 27-APR-2014 13:40:15
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SimpleLinux)(PORT=1521)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 111: Connection refused
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 111: Connection refused
主庫Primary日誌。
[root@SimpleLinux ~]# su - oracle
[oracle@SimpleLinux ~]$ cd /u01/app/diag/rdbms/ora11g/ora11g/trace/
[oracle@SimpleLinux trace]$ ls -l | grep alert
-rw-r-----. 1 oracle oinstall 176813 Apr 21 21:58 alert_ora11g.log
2、Primary端啟動過程
首先啟動資料庫到nomount狀態,此時pmon是進行工作的。
[oracle@SimpleLinux ~]$ env | grep ORACLE_SID
ORACLE_SID=ora11g
[oracle@SimpleLinux ~]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Sun Apr 27 13:54:15 2014
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 372449280 bytes
Fixed Size 1364732 bytes
Variable Size 331353348 bytes
Database Buffers 33554432 bytes
Redo Buffers 6176768 bytes
這個階段日誌是沒有什麼額外特殊的資訊的,只有正常的後台執行個體啟動。
Sun Apr 27 13:54:58 2014
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
NUMA status: non-NUMA system
cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
Grp 0:
(篇幅原因,有省略……)
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Sun Apr 27 13:55:07 2014
MMNL started with pid=16, OS id=1776
starting up 1 shared server(s) ...
ORACLE_BASE from environment = /u01/app
切換到mount狀態。
SQL> alter database mount;
Database altered.
日誌中,定位到mount狀態。
Sun Apr 27 14:03:18 2014
alter database mount
Sun Apr 27 14:03:23 2014
Successful mount of redo thread 1, with mount id 4242195174
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount
之後啟動資料庫,到open狀態。在mount之前,資料庫是不會產生和執行redo相關的動作的。從mount到open階段,是需要進行一個instance recovery過程的,也就是日誌前滾後滾的動作。在mount和mount之前,是不會有Redo Transport過程的。
Sun Apr 27 14:24:56 2014
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 78 KB redo, 26 data blocks need recovery
Started redo application at
Thread 1: logseq 32, block 47
Recovery of Online Redo Log: Thread 1 Group 1 Seq 32 Reading mem 0
Mem# 0: /u01/app/oradata/ORA11G/onlinelog/o1_mf_1_9mnjwtj9_.log
Mem# 1: /u01/app/fast_recovery_area/ORA11G/onlinelog/o1_mf_1_9mnjwvdm_.log
Completed redo application of 0.02MB
Completed crash recovery at
Thread 1: logseq 32, block 203, scn 815633
26 data blocks read, 26 data blocks written, 78 redo k-bytes read
Sun Apr 27 14:24:58 2014
LGWR: STARTING ARCH PROCESSES
Sun Apr 27 14:24:58 2014
Fatal NI connect error 12541, connecting to:
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.137.88)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=ora11gsy)(CID=(PROGRAM=oracle)(HOST=SimpleLinux.localdomain)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.4.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Time: 27-APR-2014 14:25:03
Tracing not turned on.
Tns error struct:
ns main err code: 12541
TNS-12541: TNS:no listener
ns secondary err code: 12560
nt main err code: 511
TNS-00511: No listener
nt secondary err code: 111
nt OS err code: 0
Check whether the listener is up and running.
FAL[server, ARC3]: Error 12541 creating remote archivelog file 'ora11gsy'
FAL[server, ARC3]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance ora11g - Archival Error. Archiver continuing.
Starting background process QMNC
Sun Apr 27 14:25:11 2014
QMNC started with pid=25, OS id=2129
Sun Apr 27 14:25:25 2014
Completed: alter database open
Sun Apr 27 14:25:28 2014
Starting background process CJQ0
Sun Apr 27 14:25:28 2014
CJQ0 started with pid=28, OS id=2149
Setting Resource Manager plan SCHEDULER[0x32DF]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Apr 27 14:25:34 2014
Starting background process VKRM
Sun Apr 27 14:25:34 2014
VKRM started with pid=29, OS id=2151
Open過程完成,但是資料庫日誌不斷報錯。主要體現在兩個方面,一個是監聽器故障,Primary在傳遞日誌的時候,archive_log_dest配置兩種類型,location和service。其中service就存放對應服務名稱。這個服務名稱是藉助Oracle NET實現,Oracle NET外表體現就是監聽器。另一種類型錯誤就是心跳機制,Oracle Primary在不斷的訪問standby。
在之後過程,日誌中不斷報錯。說明這個過程是一個周期性確認的過程。
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.137.88)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=ora11gsy)(CID=(PROGRAM=oracle)(HOST=SimpleLinux.localdomain)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.4.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Time: 27-APR-2014 14:30:00
Tracing not turned on.
Tns error struct:
ns main err code: 12541
TNS-12541: TNS:no listener
ns secondary err code: 12560
nt main err code: 511
TNS-00511: No listener
nt secondary err code: 111
nt OS err code: 0
Error 12541 received logging on to the standby
Check whether the listener is up and running.
PING[ARC1]: Heartbeat failed to connect to standby 'ora11gsy'. Error is 12541.
心跳12504錯誤,資訊為監聽器沒有啟動。
[oracle@SimpleLinux ~]$ oerr ora 12504
12504, 00000, "TNS:listener was not given the SERVICE_NAME in CONNECT_DATA"
// *Cause: The listener was not configured with a default service and
// SERVICE_NAME was missing from the CONNECT_DATA received by the listener.
// *Action: Possible solutions are:
// - Configure DEFAULT_SERVICE parameter in LISTENER.ORA with a valid service
// name. Reload the listener parameter file using reload [<listener_name>].
// This should be done by the LISTENER administrator.
// - If using a service name, Check that the connect descriptor corresponding
// to the service name in TNSNAMES.ORA has a SERVICE_NAME or SID component
// in the CONNECT_DATA.
更多詳情見請繼續閱讀下一頁的精彩內容: