What should you do when recovery data grows so large that traditional disaster recovery methods can no longer meet your goals? Newer technologies such as deduplication, storage tiering, and data management policies can reduce the high cost of disaster recovery while still meeting the expected recovery time objective (RTO).
In a previous article, we described a company that neglected to update its disaster recovery plan while its data storage kept expanding. Four years later, data growth had made it impossible for the tape restore process to meet the RTO. Once the problem was discovered, the fix was to change the recovery technology from linear tape to asynchronous replication, with the replica located more than 400 miles from the primary data center. That solution costs considerably more.
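A rough calculation shows why growth alone can break an RTO. The throughput and data figures in this Python sketch are illustrative assumptions, not numbers from the case above:

```python
# Back-of-the-envelope restore time; all figures here are assumptions.
TAPE_RESTORE_MBPS = 160      # sustained single-drive restore throughput, MB/s
DATA_TB = 50                 # total data that must come back within the RTO

restore_hours = DATA_TB * 1_000_000 / TAPE_RESTORE_MBPS / 3600
print(f"Restoring {DATA_TB} TB at {TAPE_RESTORE_MBPS} MB/s takes ~{restore_hours:.0f} h")
# ~87 hours -- a 24-hour RTO that once held comfortably is now far out of reach,
# and the gap only widens as the data keeps growing.
```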
In today's economy, few IT organizations budget for changes like this. Moreover, restoring from tape is always slower than backing up to it. The industry has long focused on the tape backup window, so restore time is often overlooked. When have you ever seen a vendor publish the restore time of its technology? I can't think of any that do.
A number of relatively new technologies on the market, especially in combination, can help you meet your RTO without resorting to expensive solutions. Let's look at the following:
Data Deduplication
Vendors have been talking up data deduplication lately; if you are not familiar with it, you should be. Deduplication is a more sophisticated form of compression that identifies duplicate files in a file system. The technique can also be applied at the block level to find duplicate blocks within a disk volume. What is the benefit? It finds duplicate files or blocks in a storage volume, deletes them, and replaces them with pointers to a "primary" file or block, greatly reducing the amount of data stored. It can also be combined with traditional file compression to shrink the primary file itself.
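To make the pointer mechanism concrete, here is a minimal Python sketch of block-level deduplication using content hashes. The fixed block size and in-memory store are simplifying assumptions; real products use variable-size chunking, reference counting, and persistent indexes:

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products vary this

def dedupe_blocks(data: bytes):
    """Split data into fixed-size blocks; store each unique block once and
    represent the volume as a list of pointers (content hashes)."""
    store = {}      # hash -> block bytes (the "primary" copy)
    pointers = []   # per-block references into the store
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # keep only the first (primary) copy
        pointers.append(digest)
    return store, pointers

data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096   # three of four blocks repeat
store, pointers = dedupe_blocks(data)
print(f"{len(pointers)} blocks referenced, {len(store)} stored uniquely")
# -> 4 blocks referenced, 2 stored uniquely
```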
Can deduplication help with all the data in a data center? No. The technology is especially useful for unstructured data, such as file servers holding employees' office productivity files. Consider what happens when the human resources department sends a benefits plan to all employees and most of them save it: the number of copies of the same information approaches the number of employees, and worse, you have to back up and restore every copy during disaster recovery. What about structured data? Some databases consist of a few large files, so file-level deduplication does not help. Block-level deduplication can reduce the data, though the reduction may not be large; even a small reduction helps, though. Databases containing many identical blocks deduplicate well, while databases with little duplicate information will show far less benefit.
Most recent backup and storage products include deduplication, so consider updating your products to take advantage of these features, especially on unstructured data volumes. The less data there is, the shorter the recovery time in a disaster.
Storage Tiering
Storage tiering can also counter data growth by moving old data to a secondary storage tier. Some products can recover the tiers separately, which helps meet the RTO. The process involves classifying data: the information on the primary tier is the most important and is what the RTO applies to, while the information on the secondary tier matters less and can be restored later. Old data is not needed for day-to-day business, but it is still needed eventually, so it can wait. Consider an example: on a typical unstructured data volume, only about 20% of the data is in active use, while the other 80% was last accessed six months ago or longer. If only the primary 20% must be restored first, the critical data comes back roughly five times faster, making the business RTO easy to meet. But don't forget the secondary tier still exists; it must be restored too, just later.
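As a sketch of the classification step, the following Python fragment splits files into hot and cold tiers by last-access time. The directory path and the 180-day cutoff are hypothetical, and real tiering products weigh far more than access time:

```python
import time
from pathlib import Path

CUTOFF_DAYS = 180  # assumed policy: ~six months, per the example above

def classify_tiers(root: str):
    """Split files into a primary (hot) and secondary (cold) tier by
    last-access time. Note that atime is unreliable on volumes mounted
    with noatime; real products track access through other means."""
    cutoff = time.time() - CUTOFF_DAYS * 86400
    hot, cold = [], []
    for path in Path(root).rglob("*"):
        if path.is_file():
            (hot if path.stat().st_atime >= cutoff else cold).append(path)
    return hot, cold

hot, cold = classify_tiers("/srv/fileshare")  # hypothetical share path
print(f"tier 1 (restore first): {len(hot)} files, tier 2 (restore later): {len(cold)}")
```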
Unfortunately, only a limited number of tiered-storage solutions allow tier 1 and tier 2 data to be recovered independently, so verify with your tiering vendor that tier 1 data can be restored on its own, without tier 2. Expect more such solutions to appear in the near future.
Data Management Policy
A company's data management policies can also help meet the RTO. Conceptually, a data management policy resembles storage tiering: it includes rules for purging old data, under which aged data can be deleted from active systems once it has been archived to tape, DVD, or other media. The policy controls the size of the active data set by removing old data; records older than a defined age are deleted automatically. Policies must state definitively how each kind of data is handled, based on its type and importance. Although many financial records must be retained permanently, most records do not need to stay on active storage for more than three years; an archived copy is sufficient. End users' office productivity data typically stays on active storage for less than 18 months. These policies depend heavily on your line of business and regulatory requirements, so they may not apply in every situation.
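As an illustration, a retention policy can be reduced to a simple lookup. The data types and retention periods in this Python sketch are hypothetical placeholders for whatever your business and regulators actually require:

```python
from datetime import datetime, timedelta

# Hypothetical retention rules keyed by data type; actual periods depend on
# your line of business and regulatory requirements, as noted above.
RETENTION = {
    "financial_record": None,                    # None = retain indefinitely
    "business_record": timedelta(days=3 * 365),  # ~3 years on active storage
    "office_document": timedelta(days=540),      # ~18 months on active storage
}

def disposition(data_type: str, last_modified: datetime) -> str:
    """Decide whether a record stays on active storage or is archived and removed."""
    limit = RETENTION.get(data_type)
    if limit is None or datetime.now() - last_modified <= limit:
        return "keep-active"
    return "archive-then-delete"    # copy to tape/DVD/etc., then purge

example = datetime.now() - timedelta(days=700)   # a record ~23 months old
print(disposition("office_document", example))   # -> archive-then-delete
```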
The drawback of a data management policy is that it requires ongoing administration and auditing. Deleting archived data can cause problems: if data is subject to a legal hold, the Federal Rules of Civil Procedure prohibit its deletion. In addition, search engine technology can make all of your data appear recently accessed, because the search engine must open and read every file in full to build its index. Make sure your search technology keeps a record of the files it has indexed; otherwise, files that have not genuinely been accessed in a long time can never be identified as such.
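One way an indexer can avoid disturbing access times is to record a file's original timestamps and restore them after reading, as in this Python sketch. The helper and its index structure are hypothetical; some indexers instead open files with O_NOATIME where the operating system supports it:

```python
import os

def index_file(path: str, index: set) -> str:
    """Read a file for indexing, then restore its original access time so the
    crawl does not make cold data look recently used."""
    st = os.stat(path)
    with open(path, "rb") as f:
        content = f.read()        # this read updates the file's atime
    os.utime(path, (st.st_atime, st.st_mtime))  # put the old timestamps back
    index.add(path)               # record that this file has been indexed
    return content.decode("utf-8", errors="replace")
```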
If data is growing so fast that these measures buy only a little time, you will need to implement other technologies to ensure the RTO can still be met.