Physical Standby Database Failover
1. When physical standby failover switches to failover, the production database must be moved to the DR site for unplanned events that occur when some bad events occur. Interestingly, at this time people are usually busy trying to figure out what happened and what needs to be done to restore the database. The customer called to ask the system for verification before recovery. Senior management staff lingers in the corridor to find out who was wrong. For DBA, nothing is more terrible than the standby site in time and space, and we do not know whether it works before we need the standby site. If the Failover fails, DBAs will lose their bonuses or even interrupt their careers. Therefore, at the initial stage of system establishment, we need to have a well-developed support plan. Once a problem occurs, you are not busy with your hands. It's time to show off your skills. However, it was important to have hands-on with detailed experiments! Step 1. redo anything that has not been sent from the primary database to the standby database. If the primary database cannot be Mounted, archive and current logs that have not been sent from the primary database may be sent to the standby database. If this operation succeeds, zero data loss is possible even if it is not in the max protection mode. Make sure that redo apply is active on the target backup database. Mount. Do not open the primary database. If the primary database cannot be mounted, jump to step 2. run the following SQL statement in primary database: SQL> ALTER SYSTEM FLUSH REDO TO target_db_name; Target_db_name is the DB_UNIQUE_NAME of the target backup database, and it will be washed over and redone. This statement is only supported in 11G R2. If the statement is completed but there is no error, jump to step 5. if an error occurs during the execution or it does not stop, because you cannot wait for a long time until the execution of this statement is complete, continue step 2. step 2. Verify the archive log files recently sent from the primary database in the backup database. Query the V $ ARCHIVED_log view and obtain the maximum serial number archived by each thread instance in the target backup database. For example, SQL> SELECT UNIQUE THREAD # AS THREAD, MAX (SEQUENCE #) over (partition by thread #) as last from v $ archived_log; thread last ---------- 1 517 2 412 if possible, copy the archived redo logs of each primary database to the backup database if those logs are not in the backup database. And register it. It must be done on every redo thread. For example: SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE '<from your list>'; Step 3 identify and process any archived redo log interval (archived redo log gaps) query the V $ archive_gap view on the target backup database to determine whether there is any redo interval. For example: SQL> SELECT THREAD #, LOW_SEQUENCE #, HIGH_SEQUENCE # FROM V $ ARCHIVE_GAP; THREAD # LOW_SEQUENCE # HIGH_SEQUENCE # ---------- ----------- ------------ 1 90 92 you can see that there is a redo interval, logs numbered 90, 91, and 92 need to be archived by thread 1. If possible, copy these missing archive redo files to the target redo database. Then register them on each thread node. SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1'; Step 4 repeat Step 3 until the redo interval is fixed Step 3 is only the highest interval. After the resolution interval, you must repeat the query until the interval does not exist. If the interval is not resolved, some data loss will be lost during the Failover. Step 5 stop Redo Apply execute the following SQL statement on the target backup DATABASE: SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; step 6: execute the following SQL statement on the target backup DATABASE for all applications that have completed data redo: SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH; if the statement has no errors, move to Step 7. if an error occurs, some accepted redo data is not applied. Try to solve the cause of these errors and re-determine the status before proceeding to the next step. If the error cannot be solved, failover can also be performed (along with some data loss). Run the following SQL statement in standby databse: SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE; in this case, a forced switchover is performed, and the switch goes to Step 9. When the activation status is complete. Step 7 confirm that the target backup DATABASE is ready to become a master data. Execute the following statement to confirm the DATABASE status: SQL> SELECT SWITCHOVER_STATUS FROM V $ DATABASE; SWITCHOVER_STATUS-----------------TOPRIMARY1 row selected if this value is TOPRIMARY or sessions active indicates that the master database can be switched to Standby role. If it is another value, the SWITCHOVER cannot be performed. Step 8 switches the physical backup database to the master DATABASE role SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN; step 9 OPEN the new primary database SQL> ALTER DATABASE OPEN; Step 10 back up the new primary DATABASE Oracle It is recommended to make a full DATABASE backup for the new primary DATABASE. Step 11 restart Redo ApplyFor example on the physical standby database: SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE-> DISCONNECT FROM SESSION; Step 12 when available, the master database fails to be rebuilt. After failover, the source primary database may be converted to a physical standby database or rebuilt from the backup of the new primary database. Once the source Primary database is running in Standby Role, you can re-build it back to Primary role after a switchover. 2. FAILOVER switched to Real-Time SQL> alter system flush redo to 'hainance'; alter system flush redo TO 'haina' * ERROR at line 1: ORA-16442: ALTER SYSTEM FLUSH REDO to STANDBY succeeded, but some redo has notbeen applied. the preceding steps take N long. View the alert logs on the master and slave databases: the master database contains a large number of the following logs: Fri Jul 12 11:23:22 2013ARC4: archiving logs is a potential flush redo targetLOG_ARCHIVE_DEST_2 is a potential flush redo logs is a potential flush redo target slave database has the following large volumes of logs: fri Jul 12 11:23:30 2013 Standby switchover readiness check: Checking whether recoveryapplied all redo .. physic Al Standby applied all the redo from the primary. standby switchover readiness check: Checking whether recoveryapplied all redo .. physical Standby applied all the redo from the primary. it can be seen that it takes a lot of time for the master database to verify each archive of the slave database. Therefore, at the initial stage of System Establishment, two important information should be considered: the recovery point objective (RPO) and recovery time objective (RTO) that the company needs to achieve ). Although saw the succeeded words, but still according to the error processing, the official website to search the ORA-16442, there is no result, the New still has shortcomings, go down. SQL> SELECT UNIQUE THREAD # AS THREAD, MAX (SEQUENCE #) over (partition by thread #) as last from v $ archived_log; thread last ---------- 2 415 1 519 SQL> SELECT THREAD #, LOW_SEQUENCE #, HIGH_SEQUENCE # FROM V $ ARCHIVE_GAP; no rows selected query on the master database, get the same result, it indicates that all logs are applied. Because I have applied standby log and enabled the real time application, the probability of data loss is greatly reduced, which is also recommended by oracle. SQL> SELECT UNIQUE THREAD # AS THREAD, MAX (SEQUENCE #) over (partition by thread #) as last from v $ archived_log; thread last ---------- 2 415 1 519 perform the following operations on the slave Database: SQL> SELECT THREAD #, LOW_SEQUENCE #, HIGH_SEQUENCE # FROM V $ ARCHIVE_GAP; no rows selected SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; Database altered. SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH; Databa Se altered. SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE; Database altered. SQL> SELECT SWITCHOVER_STATUS from v $ DATABASE; SWITCHOVER_STATUS--------------------NOT allowed SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN; ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN * ERROR at line 1: ORA-01109: database not open SQL> select status from V $ ins Tance; STATUS---mounted SQL> alter database open; Database altered. SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN; ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN * ERROR at line 1: ORA-16109: failed to apply log data from previous primary SQL> select status from V $ instance; SQL> SELECT SWITCHOVER_STATUS FROM V $ DATABASE; SWITCHOVER_STATUS ----- Failed destination select nation, OPEN_MODE, DATABASE_ROLE, PROTECTION_MODE from v $ DATABASE; BASE; using OPEN_MODE DATABASE_ROLE PROTECTION_MODE when there are too many failed destination read write primary maximum performance view the initial is, after switching, it becomes failed destination, but the database can be written and primary. Restart the database: SQL> startup forceORACLE instance started. total System Global Area 943669248 bytesFixed Size 2234000 bytesVariable Size 264243568 bytesDatabase Buffers 671088640 bytesRedo Buffers 6103040 bytesDatabase mounted. database opened. SQL> select status from v $ instance; STATUS--OPEN View alert Log: Fri Jul 12 14:05:14 2013 PING [ARC5]: heartbeat failed to connect to standby 'beijing '. error is 160 09. PING [ARC5]: Heartbeat failed to connect to standby 'beijing '. error is 16009. PING [ARC5]: Heartbeat failed to connect to standby 'beijing '. error is 16009.Fri Jul 12 14:06:14 2013 PING [ARC5]: Heartbeat failed to connect to standby 'beijing '. error is 16009. PING [ARC5]: Heartbeat failed to connect to standby 'beijing '. error is 16009. with a lot of ORA-16009 errors, it keeps trying to connect to beijing this standby database because I'm doing failov In this case, standby is disabled. SQL> show parameter log_archive_dest_2 NAME TYPE VALUE =----------- export log_archive_dest_2 string SERVICE = beijing ASYNC VALID_FO R = (ONLINE_LOGFILES, PRIMARY_ROL E) DB_UNIQUE_NAME = beijingSQL> alter system set log_archive_dest_2 = ''scope = spfile; System altered. SQL> startup forceSQL> show parameter log_archive_dest_2 NAME TYPE VALUE -------- ---------------------------- ----------- -------------------------------- Log_archive_dest_2 string cancels the remote archiving path and eliminates errors.