Data warehouse environment: Oracle RAC, about 100 TB of data, roughly 5 TB of archived logs generated per day (data that does not need to be backed up is loaded with NOLOGGING to reduce archive generation). How should the backup and recovery scheme be designed?
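For the data that is never backed up, NOLOGGING plus direct-path loads are what keep archive generation down. A minimal sketch (the table names are made up for illustration):

    -- Data that never needs to be backed up: a NOLOGGING table plus a
    -- direct-path load generates minimal redo and hence little archive
    CREATE TABLE stage_orders NOLOGGING
      AS SELECT * FROM orders WHERE 1 = 0;

    INSERT /*+ APPEND */ INTO stage_orders SELECT * FROM orders_src;
    COMMIT;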
Scenario One: Data Guard
Data Guard is the most cost-effective backup and disaster recovery option, but once archive generation exceeds a certain scale, recovery on the standby becomes the bottleneck: the daily archive volume cannot be applied in time. We tried many tuning methods, including parallel recovery, and none of them solved it. The bottleneck is not storage throughput but the way the standby recovers. Recovery means applying archived log files; the archives generated by all RAC nodes must be applied on a single node, and they must be applied in a fixed order, which severely limits the parallelism of recovery.
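For reference, the standby-side apply looks roughly like this; a minimal sketch using the 10g-era syntax (the PARALLEL degree is illustrative, and the clause only raises apply parallelism, it does not remove the ordering constraint):

    -- On the standby: start managed recovery in the background
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
      PARALLEL 16 DISCONNECT FROM SESSION;

    -- Check how far the standby has fallen behind
    SELECT name, value FROM v$dataguard_stats
    WHERE  name IN ('apply lag', 'transport lag');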
Scenario Two: Traditional RMAN backup
The traditional RMAN backup, sent to a high-throughput virtual tape library: a full backup every week plus daily archived log backups. Most of the time, when we design a backup solution, we only consider the backup and not the recovery. The biggest problem with this scenario is that the cost of recovery is very high; if something goes wrong with the database, it can take days to restore. In addition, extra backup equipment has to be purchased.
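A minimal RMAN sketch of that schedule (the channel count and the absence of a media manager configuration are illustrative assumptions, not our actual setup):

    # Weekly full backup to the virtual tape library
    RUN {
      ALLOCATE CHANNEL t1 DEVICE TYPE sbt;
      ALLOCATE CHANNEL t2 DEVICE TYPE sbt;
      BACKUP DATABASE TAG 'weekly_full';
    }

    # Daily archived log backup, removing archives once backed up
    RUN {
      ALLOCATE CHANNEL t1 DEVICE TYPE sbt;
      BACKUP ARCHIVELOG ALL DELETE INPUT;
    }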
Scenario Three: Storage mirroring
The database runs in NOARCHIVELOG mode, with ASM mirroring the data across two sets of storage. This is not a backup solution; it is proposed to eliminate the storage single point of failure, and it is effectively RAID 1 across two different arrays. The biggest problem with this scenario is that it cannot fix logical errors in the database, such as accidentally deleted data. And because the primary and the mirror copy are implemented through storage mirroring, off-site backup and disaster recovery are not possible.
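In ASM terms this is normal-redundancy mirroring with one failure group per array; a minimal sketch run in the ASM instance (disk group name and disk paths are hypothetical):

    -- NORMAL redundancy keeps one mirror copy in each failure group,
    -- i.e. RAID 1 across the two storage frames
    CREATE DISKGROUP dw_data NORMAL REDUNDANCY
      FAILGROUP array_a DISK '/dev/mapper/arrayA_lun*'
      FAILGROUP array_b DISK '/dev/mapper/arrayB_lun*';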
Scenario Four: Storage-level replication
With storage-level replication, the storage vendors have their own solutions, such as EMC SRDF; VERITAS also has a similar product, VERITAS Volume Replicator (VVR). The principle is to capture the I/O at the underlying storage layer and synchronize it to the backup system over the network. If a storage vendor's solution is adopted, the primary and replica storage must use that vendor's products, and we have not validated whether it can withstand 4.5 TB of data changes per day. In addition, the software licenses are expensive.
Some people say: a problem that can be solved with money is not a problem. But the problem is that there is no money! Although Alibaba is not short of money, our goal is to spend little and accomplish a lot. I personally do not recommend the storage vendor solutions, and not just because of the money: such a scheme is basically a black box, and we prefer a simpler, more open solution.
Since Oracle DG is the most cost-effective backup and disaster recovery option, we wanted to solve the problem with DG. The advantage of DG is that the standby can be opened at any time to verify that the backup is valid, and a standby with delayed apply can also recover from logical errors and guard against possible damage from Oracle software bugs. The core of the solution, then, is to fix the slow recovery on the standby.
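Delayed apply can be configured on the primary's redo transport destination; a minimal sketch (the 240-minute delay and the dw_stby service/unique name are illustrative, not values from this article):

    -- On the primary: ship redo to the standby but delay its apply by
    -- 240 minutes; the delay is honored as long as managed recovery
    -- on the standby is not started with NODELAY
    ALTER SYSTEM SET log_archive_dest_2 =
      'SERVICE=dw_stby DELAY=240 DB_UNIQUE_NAME=dw_stby' SCOPE=BOTH;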
Scenario Five: Oracle DG + block-level incremental backup/recovery + archived logs
Starting with 10g, Oracle provides block change tracking: a bitmap file records which blocks have changed, so incremental backups can read only those blocks instead of scanning the whole database, which greatly improves incremental backup efficiency. The specific scheme is: first create a level 0 backup of the database (the standby), then apply level 1 incremental backups to the level 0 copy. This is equivalent to a recovery, but it is much faster than applying archived logs. Why? Because the incremental backup contains only changed blocks and applying it simply overwrites the old blocks, there is none of the ordering constraint of log apply, so recovery can run at a very high degree of parallelism and make full use of the storage throughput.

In addition, when a block is changed many times, the incremental backup only keeps its latest image and recovery overwrites the old block just once, so regular incremental backups actually reduce the space needed for backups. Redo, by contrast, records every block change, so applying redo replays each change in turn, and every archive file must be kept for the recovery to succeed. Of course, after a level 1 backup is applied the standby cannot be opened, because its blocks are not consistent (the incremental backup runs for a long time, so different blocks are captured at different points in time); the archived logs are then used to push the standby to a consistent state before it is opened.
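Enabling block change tracking and taking the level 0 / level 1 backups looks roughly like this (the tracking file location and the tags are illustrative assumptions):

    -- SQL*Plus: record changed blocks in a bitmap tracking file
    ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
      USING FILE '+DATA/dw/bct.f';

    SELECT status, filename FROM v$block_change_tracking;

    # RMAN: one level 0 backup, then level 1 incrementals that read
    # only the blocks flagged in the tracking file
    BACKUP INCREMENTAL LEVEL 0 DATABASE TAG 'lvl0_base';
    BACKUP INCREMENTAL LEVEL 1 DATABASE TAG 'lvl1_weekly';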
Our current scheme: build a standby database, take an incremental backup every week, apply the incremental backup first, then apply the archived log files to push the database to a consistent state, open the database to verify that the backup is valid, and cycle the archived log files out to the tape library; the whole process is automated with scripts. This scheme uses incremental backups plus archived logs to recover the standby, the standby can be opened to verify that the backup is valid, and on a failure we can switch over to the standby directly, which greatly shortens recovery time. Moreover, the scheme is built on the existing hardware, with essentially no additional hardware or software licenses to purchase: spend little and do big things.
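A condensed sketch of the kind of scripted steps involved, assuming the standby is rolled forward with RMAN's "incremental from SCN" approach (the SCN, backup path, and the exact sequencing are placeholders, not the production script):

    -- SQL*Plus on the standby: find the SCN to roll forward from
    SELECT current_scn FROM v$database;

    # RMAN on the primary: back up only blocks changed since that SCN
    # (block change tracking keeps this fast)
    BACKUP INCREMENTAL FROM SCN 1234567890 DATABASE
      FORMAT '/backup/stby_roll_%U';

    # RMAN on the standby: catalog the pieces and apply them; changed
    # blocks are simply overwritten, no redo is applied at this step
    CATALOG START WITH '/backup/stby_roll_';
    RECOVER DATABASE NOREDO;

    -- SQL*Plus on the standby: apply archives to reach a consistent
    -- point, then open read-only to verify the backup is usable
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    ALTER DATABASE OPEN READ ONLY;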
Postscript: I once asked Oracle's technical experts about this problem at OOW. They did not have a good solution either. They suggested we buy two sets of Exadata (I did not understand why Exadata would recover the archives faster: is it just that the hardware is more capable, or has the way Oracle does recovery changed?), or give up database-level backup and have the application write multiple copies of the data. So Oracle has not really considered the backup problems of such a large data environment; Oracle could consider promoting our solution.