These are reading notes for Chapter 7, "Business continuity and resiliency for SAP HANA," of In-memory Computing with SAP HANA on Lenovo X6 Systems.
Overview of Business Continuity options
There are different levels of business continuity, and the level adopted depends on the requirements.
Developing a business continuity plan depends heavily on the type of business a company does, and it differs (among other factors) by country, regulatory requirements, and company size.
Objectives of business continuity:
* Recovery Time Objective (RTO) defines the maximum tolerated time to get a system online again.
* Recovery Point Objective (RPO) defines the maximum tolerated time span to which data must be restored; in other words, the amount of data, measured in time, that may be lost. An RPO of zero means the system must be designed not to lose data in any of the considered events.
* Recovery Consistency Objective (RCO) defines the level of consistency of business processes and data spread across multi-tier environments.
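As a rough illustration of how these objectives constrain a design (my own sketch, not from the book; the figures and function are hypothetical), one can compare a design's worst-case recovery time and data-loss window against the RTO/RPO targets:

```python
# Illustrative sketch: checking a simple recovery design against
# RTO/RPO targets expressed in minutes. All numbers are hypothetical.

def meets_objectives(detect_min, failover_min, backup_interval_min,
                     rto_min, rpo_min):
    """Return (rto_ok, rpo_ok) for a simple recovery design.

    Worst-case recovery time = failure detection + failover work.
    Worst-case data loss     = time since the last replicated state,
    approximated here by the backup/replication interval.
    """
    worst_rto = detect_min + failover_min
    worst_rpo = backup_interval_min
    return worst_rto <= rto_min, worst_rpo <= rpo_min

# Synchronous replication: the interval is effectively 0, so RPO 0 is met.
print(meets_objectives(5, 10, 0, rto_min=30, rpo_min=0))     # (True, True)
# Daily backups cannot meet an RPO of one hour.
print(meets_objectives(5, 10, 1440, rto_min=30, rpo_min=60)) # (True, False)
```

The point of the sketch: RTO is driven by how fast a standby can take over, while RPO is driven by how fresh the replicated or backed-up data is.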
HA and DR are different:
HA covers a hardware failure (for example, one node becomes unavailable because of a faulty processor, memory DIMM, storage, or network failure).
HA is implemented by introducing standby nodes. During normal operation, these nodes do not actively participate in processing data, but they do receive data that is replicated from the worker nodes. If a worker node fails, a standby node takes over and continues data processing.
DR covers the event when multiple nodes in a scale-out configuration fail, or a whole data center goes down because of a fire, flood, or other disaster, and a secondary site must take over the SAP HANA system.
HANA HA/DR can be implemented at two levels:
1. Infrastructure layer: underlying data replication, such as storage replication based on the General Parallel File System (GPFS).
2. Application layer: the same operations are executed at both ends; implemented through SAP HANA System Replication (SSR), which does not support automatic failover.
GPFS based storage replication
All of Lenovo's HANA solutions are based on GPFS.
In the HA scenario, data is kept in two replicas; in the DR scenario, in three replicas. All data replication is synchronous.
SAP HANA System Replication
SSR is application-based replication that supports both synchronous and asynchronous log shipping; however, the secondary applies the logs only asynchronously.
If the primary fails, failover must be performed manually.
Cascading replication is supported.
The principle:
Every SAP HANA process running on the primary system's worker nodes must have a corresponding process on a secondary worker node to which it replicates its activity.
The only difference between the primary and secondary systems is that one cannot connect to the secondary HANA installation and run queries on that database. They can also be called the active and passive systems.
Upon start of the secondary HANA system, each process establishes a connection to its primary counterpart and requests the data held in main memory, which is called a snapshot.
After the snapshot is transferred, the primary system continuously sends log information to the secondary system, which runs in recovery mode. At the time of this writing, SSR does not support replaying the logs immediately as they are received; therefore, the secondary system only acknowledges and persists the logs. To avoid having to replay hours or days of transaction logs upon a failure, SSR asynchronously transmits a new incremental data snapshot periodically.
With SSR replication, the standby node can host non-production applications.
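The persist-without-replay behavior described above can be modeled with a small sketch (my own illustration, not SAP code): the secondary buffers log entries, and each periodic snapshot discards logs it already covers, bounding how much must be replayed on takeover.

```python
# Sketch of the SSR behavior described above: the secondary persists
# log entries without replaying them, and a periodic snapshot bounds
# the amount of log replay needed on takeover. Names are illustrative.

class Secondary:
    def __init__(self):
        self.snapshot = []      # last data snapshot received
        self.pending_logs = []  # persisted but not yet replayed

    def receive_snapshot(self, data):
        self.snapshot = list(data)
        self.pending_logs.clear()  # logs before the snapshot are obsolete

    def receive_log(self, entry):
        self.pending_logs.append(entry)  # persist only; no replay yet

    def take_over(self):
        state = list(self.snapshot)
        for entry in self.pending_logs:  # replay only logs since snapshot
            state.append(entry)
        return state

sec = Secondary()
sec.receive_snapshot(["row1", "row2"])
sec.receive_log("row3")
sec.receive_log("row4")
sec.receive_snapshot(["row1", "row2", "row3", "row4"])  # periodic snapshot
sec.receive_log("row5")
print(sec.take_over())  # only 'row5' needed replaying
```

Without the periodic snapshots, `take_over` would have to replay every log entry since the initial snapshot, which is exactly the hours-or-days problem the text mentions.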
Special considerations for DR and long-distance HA setups
Latency must be considered.
Over long distances, synchronous replication is generally not an option.
HA and DR for Single-node SAP HANA
First, the single-node case:
High availability (HA) scenarios for SAP Business Suite on SAP HANA are supported, but restricted to the simplest case: two servers, one being the worker node and one acting as a standby node. The database is not partitioned, and the entire database is on a single node. This configuration is sometimes also referred to as a single-node HA configuration. Because of these restrictions with regard to scalability, SAP decided to allow configurations with a higher memory-per-core ratio specifically for this use case.
Single node means there is only one worker node, that is, no scale-out. Physically, there can be two to three nodes.
Note that:
1. All HA schemes fail over automatically; all DR schemes must be switched over manually.
2. In all HA scenarios, the standby node cannot accept workloads; in the DR solutions, it can.
3. All HA scenarios use one GPFS cluster; the DR solutions use two.
4. HA replication is synchronous; DR replication can be synchronous or asynchronous.
High availability (by using GPFS)
A single data center with three physical nodes: a worker (active) node, a standby node, and a quorum node.
The worker node accepts all workloads; the standby node exists only for takeover and cannot handle workloads. The quorum node is used to prevent split brain.
Storage uses server-local storage.
With synchronous replication, data is kept in two replicas. No manual intervention is required for failover.
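The role of the quorum node can be sketched in a few lines (my own illustration of the general majority-vote idea, not GPFS internals): a partition may stay active only if it can reach a strict majority of the cluster, so the two sides of a network split can never both serve.

```python
# Sketch of why a third quorum node prevents split brain: a node may
# continue serving only if it can reach a strict majority of the
# three-node cluster (worker, standby, quorum).

def may_serve(reachable_nodes, total_nodes=3):
    """A partition may stay active only with a strict majority."""
    return reachable_nodes > total_nodes // 2

# Worker loses contact with the standby but still sees the quorum node:
print(may_serve(2))   # True  -> the worker side stays active
# Standby isolated on its own:
print(may_serve(1))   # False -> the standby must not take over
```

With only two nodes and no quorum, both sides of a split would see exactly one reachable node and could not distinguish "peer died" from "network cut," which is the split-brain hazard.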
Stretched high availability (by using GPFS)
Compared with single-node HA, the distance between the nodes is longer; everything else is the same.
This is called stretched HA.
The quorum node should be placed at a third site; if no third site is available, it is placed at the primary site.
Disaster recovery (by using GPFS)
Data replication is synchronous.
The quorum node should be placed at a third site; if no third site is available, it is placed at the primary site.
Note that the figure for this scenario is very similar to the previous two, except that here the HANA DB runs on only one worker node, whereas in the first two figures the HANA DB spans the worker node and the standby node.
And because this is DR rather than HA, failover cannot be automatic. (All HA scenarios fail over automatically; all DR scenarios do not.)
But the advantage is that the standby node can accept workloads such as development and testing.
In fact, the difference between HA and DR is similar to that between Oracle RAC and ADG.
Disaster recovery (by using SAP HANA System Replication)
The previous scenario used one GPFS cluster; in this scenario, the coupling of the two nodes is implemented at the application level, not the GPFS layer. Therefore, two separate GPFS clusters are required, as shown in the figure.
Failover must be done manually, and replication can be synchronous or asynchronous.
HA plus DR (by using GPFS)
This GPFS scheme uses only one GPFS cluster.
The data is kept in three replicas. HA provides local protection first, and DR provides site protection.
HA (by using GPFS) plus DR (by using SSR)
Two GPFS clusters, local and remote, with three copies of the data. The two local HA copies are implemented via GPFS, and the third copy, for disaster recovery, is SSR-based.
SSR replication can be synchronous or asynchronous, depending on the distance.
HA and DR for Scale-out SAP HANA
Scale-out SAP HANA installations can implement two levels of redundancy to keep their database instance from going offline. The first step is to add a server node to the scale-out cluster that acts as a hot-standby node. The second step is to set up another scale-out cluster in a distinct data center that takes over operation if there is a disaster at the primary site.
Replication is still implemented via GPFS or SSR.
HA by using GPFS storage replication
GPFS file-system replication is used (in the HA case, two copies of the data); because this is a scale-out setup, the GPFS FPO version is used.
DR by using GPFS storage replication
In the DR scenario, GPFS keeps a total of three copies of the data.
Only one GPFS cluster is used; the data copies for HA are replicated synchronously, and the data copy for DR can be replicated asynchronously.
The quorum node prevents split brain caused by a network interruption between the primary and backup sites.
In this scenario, the configuration of the DR site is somewhat expensive.
Failover is manual.
The DR site can host non-production applications, such as QA or training environments.
HA by using the GPFS replication plus DR by using SAP HANA replication
A single-node failure can be handled by the standby node at the primary site taking over (HA), and a multi-node failure can be switched to the DR site.
Replication can be synchronous or asynchronous. Both the primary and backup sites have their own GPFS cluster and HANA DB instance.
HA and DR for SAP HANA on Flex System
Flex System is an all-in-one platform; the other concepts are the same, so details are omitted here.
Backup and Restore
Basic operating system backup and recovery
Back up at the operating-system partition level.
Basic database Backup and Recovery
Saving the savepoints and the database logs at a consistent point in time is technically impossible while the system is running, and thus does not constitute a consistent backup from which the database can be recovered. Therefore, a simple file-based backup of the persistency layer of SAP HANA is insufficient.
Backups are initiated from SAP HANA Studio or the SAP HANA SQL interface; at the time of writing, HANA supports only full backups, not incremental backups.
The backup files are saved to a defined staging area that might be on internal disks, an external disk on an NFS share, or a directly attached SAN subsystem. In addition to the data backup files, the SAP HANA configuration files and backup catalog files must be saved to allow recovery. For point-in-time recovery, the log area must also be backed up.
Configuration files need to be backed up in addition to the data.
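The components a recoverable backup set must contain can be summarized in a small check (my own sketch; the component names are hypothetical labels, not real HANA file paths):

```python
# Hedged sketch: a recoverable backup set needs the data backup, the
# configuration files, and the backup catalog; point-in-time recovery
# additionally needs the log backups. Component names are illustrative.

REQUIRED = {"data_backup", "config_files", "backup_catalog"}

def backup_is_recoverable(components, point_in_time=False):
    needed = set(REQUIRED)
    if point_in_time:
        needed.add("log_backups")
    return needed <= set(components)

base_set = {"data_backup", "config_files", "backup_catalog"}
print(backup_is_recoverable(base_set))                       # True
print(backup_is_recoverable(base_set, point_in_time=True))   # False
```

The second call fails because, as the text notes, point-in-time recovery also requires the log area to be backed up.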
File-based backup tool integration
Database backups by using GPFS snapshots
Principle:
GPFS supports a snapshot feature with which you can take a consistent and stable view of the file system that can then be used to create a backup (similar to enterprise storage snapshot features). While the snapshot is active, GPFS stores any changes to files in a temporary delta area. After the snapshot is released, the delta is merged with the original data and any further changes are applied to this data.
Taking only a GPFS snapshot does not ensure that you have a consistent backup that can be used to perform a restore. SAP HANA must be instructed to flush out any pending changes to disk to ensure a consistent state of the files in the file system.
Indeed, snapshots at the storage tier must be coordinated with the application to ensure data consistency. All database backups work this way, similar to freeze and thaw.
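The snapshot/delta behavior described above can be modeled in a few lines (my own model, not GPFS internals): while a snapshot is active, writes land in a delta area so the snapshot view stays stable; releasing the snapshot merges the delta back.

```python
# Sketch of the GPFS snapshot/delta behavior described above: writes
# made while a snapshot is active go to a delta area, and the delta is
# merged back when the snapshot is released. This is a toy model.

class FileSystem:
    def __init__(self, data):
        self.data = dict(data)
        self.delta = None        # None means no snapshot is active

    def create_snapshot(self):
        self.delta = {}          # further writes are buffered here

    def write(self, key, value):
        if self.delta is not None:
            self.delta[key] = value   # snapshot view stays stable
        else:
            self.data[key] = value

    def snapshot_view(self):
        return dict(self.data)   # consistent view for the backup tool

    def release_snapshot(self):
        self.data.update(self.delta)  # merge delta into original data
        self.delta = None

fs = FileSystem({"a": 1})
fs.create_snapshot()
fs.write("a", 2)                  # change lands in the delta area
print(fs.snapshot_view())  # {'a': 1} -- backup sees the stable view
fs.release_snapshot()
print(fs.data)             # {'a': 2} -- delta merged after release
```

Note that the model only guarantees a stable view of whatever was on disk; the flush step from the text (telling HANA to write pending changes to disk before `create_snapshot`) is still what makes that view application-consistent.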
Backup tool integration with Backint for SAP HANA
HANA provides an API, Backint, that integrates with third-party backup tools; it can be thought of as roughly analogous to RMAN in Oracle DB.
See http://scn.sap.com/docs/DOC-34483
Currently certified tools include Symantec NetBackup, EMC NetWorker, IBM Tivoli Storage Manager, and CommVault.
Tivoli Storage Manager for ERP 6.4
(Details omitted.)
Symantec NetBackup 7.5 for SAP HANA
(Details omitted.)
Backup and restore as a DR strategy
Using backup and restore as a DR solution is a basic approach; depending on the RPO, it might be a viable way to achieve DR. The basic concept is to back up the data on the primary site regularly (at least daily) to a defined staging area, which might be an external disk on an NFS share or a directly attached SAN subsystem (this subsystem does not need to be dedicated to SAP HANA). After the backup is done, it must be transferred to the secondary site, for example, by a simple file transfer (which can be automated) or by using the replication function of the storage system that holds the backup files.
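A back-of-the-envelope sketch (my own numbers, purely illustrative) shows why this strategy only suits relaxed RPOs: in the worst case, a disaster strikes just before the next backup finishes transferring to the secondary site.

```python
# Illustrative sketch: the worst-case RPO of a backup-and-transfer DR
# strategy. Numbers are hypothetical.

def worst_case_rpo_hours(backup_interval_h, transfer_h):
    # Data created after the last backup started is lost, and a backup
    # is unusable at the secondary site until its transfer completes.
    return backup_interval_h + transfer_h

print(worst_case_rpo_hours(24, 2))  # 26 -> roughly a day of data at risk
```

So daily backups with a two-hour transfer imply accepting up to about a day of data loss, which is fine for some workloads and unacceptable for others; that trade-off is what "depending on the RPO" means above.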
That concludes the notes for this chapter. Thanks for your time, and enjoy reading!