First. Data protection
In the era of cloud computing and big data, the massive growth in data volume has brought new challenges to data storage and protection. As traditional, familiar IT architectures give way to newer technologies such as cloud architecture, virtualization, and hyper-convergence, the technical means of data protection must evolve just as quickly.
1. The importance of data protection
Data is an important means of production for an enterprise, and the loss of key data can be a fatal blow. For example, according to a widely repeated account, after the September 11 attacks the Bank of New York was forced into bankruptcy and liquidation several months later because of lost data.
Why are the consequences so serious? Data is the reason computer systems exist and the foundation they rest on, and it is often irreplaceable. Once data is lost, the company is in serious trouble: customer records, technical documents, financial accounts, and other customer, transaction, and production data may be damaged beyond recovery.
2. The possibility of data loss
In summary, data loss can occur at three levels. The first is logical errors, including software bugs, virus attacks, and corrupted data blocks; the second is physical damage, including server and disk failures; and the third is the destruction of a data center by a natural disaster.
Data incidents happen all the time. Accidents such as an employee deleting a database and fleeing, backdoor vulnerabilities, system vulnerabilities, cloud service provider failures, configuration mistakes, and data center fires are among the most painful lessons in data loss.
3. Data replication technology
To limit the damage caused by data loss, data must be replicated and protected, and the more an enterprise depends on IT, the more important the corresponding recovery measures become. On its way from production to storage, data generally passes through applications, middleware, databases, operating systems, storage or disk drives, server hardware, networks, and storage switches before it reaches the storage device. Building on traditional backup and recovery, data replication technology maintains multiple copies of the data and keeps those copies available, thereby protecting the data.
From a technical point of view, replication falls into three categories: middleware and application layer replication, database layer replication, and host operating system and storage layer replication.
Middleware and application layer replication: Data is dual-written at the middleware or application layer. According to business requirements, the application architecture is designed to update both the primary and the replica of the data; strong, weak, or eventual consistency is implemented as needed to ensure consistency, completeness, and timeliness between the primary and the replica.
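The dual-write idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the class name DualWriter, the in-memory dictionaries standing in for the primary and replica stores, and the strong flag are all hypothetical and do not come from any particular middleware.

```python
from collections import deque


class DualWriter:
    """Hypothetical application-layer dual write to a primary and a replica store."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # replica updates that have not been applied yet

    def drain(self):
        """Apply queued updates to the replica in order; return how many were applied."""
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value
            applied += 1
        return applied

    def write(self, key, value, strong=True):
        self.primary[key] = value
        if strong:
            # Strong consistency: flush older queued updates first so ordering is
            # preserved, then update the replica before the write returns.
            self.drain()
            self.replica[key] = value
        else:
            # Eventual consistency: queue the update; the replica lags until drain().
            self.pending.append((key, value))
```

With strong=True the replica matches the primary as soon as write() returns; with strong=False the replica converges only after drain() runs, which is the trade-off between consistency and write latency that the application design has to make.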
Database layer replication: Both open-systems databases and mainframe databases provide replication software that replicates database data physically or logically. The two main approaches are logical replication and physical replication. Logical replication uses the database's redo logs and archive logs: it transfers the logs from the primary site to the replica site and replicates the data by re-executing the SQL. Logical replication is asynchronous only, so it offers eventual consistency between primary and replica but cannot guarantee real-time consistency. Physical replication persists the redo logs or archive logs at the replica site, synchronously or asynchronously, and then applies them (Redo Apply); meanwhile, the data at the replica site can be served read-only.
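As a rough illustration of logical replication, the sketch below records every statement executed on a primary database and replays the not-yet-applied statements on a replica. It uses Python's built-in sqlite3 module with in-memory databases; the Primary and Replica classes and the plain Python list standing in for the redo or archive log are assumptions made for the example, not the mechanism of any specific database product.

```python
import sqlite3


class Primary:
    """Primary site: executes SQL and appends each statement to a log."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.log = []  # stand-in for the redo/archive log, in commit order

    def execute(self, sql, params=()):
        self.db.execute(sql, params)
        self.db.commit()
        self.log.append((sql, params))


class Replica:
    """Replica site: replays shipped log entries by re-executing the SQL."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.applied = 0  # number of log entries already replayed

    def apply(self, log):
        """Replay entries not yet applied; return how many were replayed."""
        new_entries = log[self.applied:]
        for sql, params in new_entries:
            self.db.execute(sql, params)
        self.db.commit()
        self.applied += len(new_entries)
        return len(new_entries)
```

Because apply() runs separately from execute(), the replica lags behind the primary between calls, which mirrors the asynchronous, eventually consistent behaviour described above.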
Host operating system layer and storage layer replication: Replication works on the system's I/O, the underlying physical volumes, and data blocks, using technologies such as storage hardware, backup and recovery, and storage virtualization, independently of upper-layer applications and logic. The main approaches include disk mirroring, volume-manager-based replication at the operating system layer, storage virtualization at the storage layer, optimized backup and recovery, centralized network storage management, and Info2soft's proprietary kernel-level, byte-level data replication technology.
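Block-level replication, which ignores what the upper-layer application stored, can be sketched as comparing two volumes block by block and copying only the blocks that differ. The function name replicate_blocks, the tiny block size, and the use of byte strings as stand-ins for volumes are assumptions chosen to keep the example readable; real products track changed blocks in the I/O path rather than scanning.

```python
BLOCK_SIZE = 4  # deliberately tiny for the example; real volumes use far larger blocks


def replicate_blocks(source: bytes, target: bytearray, block_size: int = BLOCK_SIZE) -> list:
    """Copy only the changed blocks from source into target.

    Returns the indices of the blocks that were copied.
    """
    if len(source) != len(target):
        raise ValueError("source and target volumes must be the same size")
    copied = []
    for offset in range(0, len(source), block_size):
        block = source[offset:offset + block_size]
        if target[offset:offset + block_size] != block:
            target[offset:offset + block_size] = block
            copied.append(offset // block_size)
    return copied
```

For instance, replicating b"AAAABBBBCCCC" onto a target holding b"AAAAxxxxCCCC" copies only block 1, and a second pass copies nothing.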
Second. Disaster recovery and backup
Backup and disaster recovery are actually two independent concepts. Backup is not the same as disaster recovery: backup protects data, while disaster recovery ensures business continuity. Since the emergence of all-in-one disaster recovery appliances, the functions of both are often bundled into a single product. As a result, some users confuse backup products with disaster recovery products when purchasing pure software products, leaving vendors unsure whether the user needs a backup product, a disaster recovery product, or a combined backup and disaster recovery product.
1. Backup
A backup is a copy of a predefined data set and a fundamental method of data protection. It captures the static state of that data set at a particular moment. Backup files are the backbone of every data protection architecture, and the purpose of backup is to be able to restore.
Two common misconceptions about backup are worth correcting: first, dual-machine hot standby is not backup; second, hardware backup is not data backup.
In terms of architecture, backup has evolved through four stages: local backup, network backup, LAN-Free (SAN) backup, and Server-Free (offline) backup.
Local backup: The advantages are fast backup speed and a simple structure; the disadvantage is that it does not suit multi-host environments, where managing the backups of many hosts becomes complicated.
Network backup: The advantages are centralized backup, centralized management, and full use of tape library resources; the disadvantage is that it consumes network resources, and the network becomes an obvious bottleneck when large amounts of data are backed up.
LAN-Free (SAN) backup: The advantages are fast backup speed, no bottleneck from the conventional network, and suitability for high-speed backup of large amounts of data; the disadvantage is the relatively high price.
Server-Free (offline) backup: The advantage is that backup places no additional overhead on the production server, so the performance of the production system is not reduced; the disadvantage is that special equipment is required.
In terms of technology, backup has evolved from scheduled backup to snapshot backup to real-time backup.
Scheduled backup: The advantages are broad software and hardware support and suitability for long-term retention; the disadvantages are that open files are difficult to back up and that files changing during the backup cause inconsistencies. In addition, a dedicated backup window is required, and the RPO is very large.
Snapshot backup: It addresses the problems of open files and of files changing during backup; the disadvantages are compatibility problems, a relatively large impact on the performance of the production system, and an RPO that is still large.
Real-time replication (such as CDP): It addresses the problems of open files, files changing during backup, and the limited compatibility of snapshots, and it allows recovery to any point in time with RPO≈0. Info2soft's CDP technology, based on byte-level real-time data protection, is one representative product.
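The "any point in time" property of CDP can be illustrated with a simple write journal: every byte-level write is recorded with a timestamp, and a restore replays the journal up to the requested moment. The CdpJournal class, its integer timestamps, and the zero-filled starting image are assumptions for this sketch and do not describe how any vendor's CDP product is implemented.

```python
class CdpJournal:
    """Hypothetical continuous data protection journal for one fixed-size file."""

    def __init__(self, size):
        self.size = size
        self.entries = []  # (timestamp, offset, data) in write order

    def record(self, timestamp, offset, data):
        """Journal one byte-level write as it happens."""
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the protected range")
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        self.entries.append((timestamp, offset, bytes(data)))

    def restore(self, point_in_time):
        """Rebuild the file as it was at point_in_time (inclusive)."""
        image = bytearray(self.size)  # assumed to start zero-filled
        for timestamp, offset, data in self.entries:
            if timestamp > point_in_time:
                break
            image[offset:offset + len(data)] = data
        return bytes(image)
```

Because every write is journaled as it occurs, the data lost in a disaster is limited to writes that had not yet reached the journal, which is why the RPO approaches zero; a scheduled backup or snapshot, by contrast, can only restore to the moments at which it was taken.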
2. Disaster recovery
Backup protects data, while disaster recovery builds on backup to ensure the business continuity of the enterprise. From this perspective, disaster recovery is generally divided into data-level disaster recovery and application-level disaster recovery.
Data-level disaster recovery means establishing a remote data system that holds a real-time replica of the key local application data.
Application-level disaster recovery builds on data-level disaster recovery by establishing a complete standby application system equivalent to the local production system. In the event of a disaster, the standby system quickly takes over and the business continues to run.
Third. Key terms: RPO, RTO
RPO (Recovery Point Objective) is the point in time before a disaster to which the disaster recovery system can restore data. It indicates how much production data an enterprise will lose after a disaster and can be described simply as the maximum amount of data loss the enterprise can tolerate.
RTO (Recovery Time Objective) is the time from the moment a disaster brings the system down and the business stops to the moment the system is restored, can support the business departments, and the business resumes operation. It can be described simply as the longest recovery time the enterprise can tolerate.
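A short worked example may make the two terms concrete. The function names and the incident timeline below are hypothetical, chosen only to show how the achieved data-loss window and downtime would be compared against the RPO and RTO targets.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_recovery_point: datetime, disaster_time: datetime) -> timedelta:
    """Data-loss window: time between the last recoverable copy and the disaster."""
    return disaster_time - last_recovery_point


def achieved_rto(disaster_time: datetime, service_restored: datetime) -> timedelta:
    """Downtime: time between the disaster and the resumption of business."""
    return service_restored - disaster_time


def meets_objectives(rpo_actual, rto_actual, rpo_target, rto_target) -> bool:
    """True when both the data-loss window and the downtime are within target."""
    return rpo_actual <= rpo_target and rto_actual <= rto_target


# Hypothetical incident: nightly backup at 02:00, disaster at 09:30, service back at 11:00.
last_backup = datetime(2024, 1, 10, 2, 0)
disaster = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 11, 0)

rpo = achieved_rpo(last_backup, disaster)  # 7 hours 30 minutes of data lost
rto = achieved_rto(disaster, restored)     # 1 hour 30 minutes of downtime
```

In this scenario a nightly scheduled backup would satisfy a 24-hour RPO target but not a 1-hour one, which is the kind of gap that pushes an enterprise toward snapshot or real-time replication.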