Interest in data archiving has grown steadily over the past few years. The explosive growth of corporate data, the need to retain data for long periods to meet regulatory requirements, and the pressure to reduce costs have all made archiving more important. Archiving has now evolved to include cloud-based solutions.
It is commonly estimated that approximately 20–30% of the data on a network is archival data: static or inactive data that is rarely changed and rarely accessed. Keeping this inactive data on first-tier storage is expensive and inefficient. Nevertheless, it is often necessary or desirable to retain it for future reference or to comply with regulatory requirements. It makes sense to keep it on a cheap, usable medium, provided that data security and compliance are ensured.
The usual way to meet these requirements is through archiving. Unlike a backup, an archive moves inactive data from primary storage to a separate, easily accessible, lower-priced second tier of storage and then deletes it from local disk. This reduces costs by freeing space on expensive primary storage, shortening backup windows, making operations more efficient, and providing reliable long-term data protection. A viable, efficient data archive should offer:
• Scalability
• Cost-effectiveness
• Usability
• Long-term data protection
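The move-then-delete behaviour that distinguishes an archive from a backup can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production tool: the function names `archive_inactive` and `sha256_of` are hypothetical, "inactive" is approximated by the file's last-modified time, and only the top level of the primary directory is scanned.

```python
import hashlib
import os
import shutil
import time


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_inactive(primary_dir, archive_dir, max_idle_days, now=None):
    """Move files not modified for max_idle_days from primary_dir to archive_dir.

    Returns the sorted list of archived file names. Subdirectories are
    ignored in this sketch.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for name in sorted(os.listdir(primary_dir)):
        source = os.path.join(primary_dir, name)
        if not os.path.isfile(source):
            continue
        if os.stat(source).st_mtime > cutoff:
            continue  # still active, stays on primary storage
        target = os.path.join(archive_dir, name)
        shutil.copy2(source, target)  # copy data and timestamps
        # Delete the primary copy only after the archived copy checks out.
        if sha256_of(source) == sha256_of(target):
            os.remove(source)
            archived.append(name)
    return archived
```

Comparing checksums before deleting the primary copy is a design choice of this sketch: the archive becomes the only copy of the data, so the original should not be removed until the archived copy is known to be intact.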
In this article, we'll take a closer look at how the data archiving process has evolved and at the different methods of data archiving:
1. Traditional tape archiving
2. Disk-based data archiving
3. Cloud storage archiving
Traditional tape archiving
The traditional archive is a tape-based backup archive. As part of the backup process, backup software or a dedicated appliance writes data from disk to tape or to an automated tape library. These tapes, and the data they contain, are kept separate from normal backups and assigned a lengthy retention period, usually anywhere from 10 years to indefinitely. Once the data has been removed from the server's disk drives, the archive tapes are sent to an offsite storage facility for permanent storage.
The advantage of this approach is that tapes are relatively inexpensive, easy to manage, long-lived, and very reliable, which makes them efficient for storing large amounts of data. To increase storage capacity, you simply add more tapes. You can also add redundancy by duplicating the data, or the primary data cartridge, with a backup device.
The downside of tape is that you have to wait for the tapes to be brought back from offsite before you can retrieve anything, and then it takes time to scan the tape, find the data that needs to be recovered, and restore it from tape to disk. In addition, tape struggles to keep up with explosive data growth and shrinking backup windows, its retention capabilities are limited, and there is no practical way to verify the integrity of the tape media or of the data over its storage life.
Disk-based data archiving
Over the past few years, the volume of data has grown dramatically, and so has the demand for storing and accessing large amounts of archived data in today's business environment. New compliance requirements, such as SOX, are one driver. Globalization and the breakup of traditional corporate structures are another: companies no longer have one or two centralized locations but multiple offices dispersed across different regions, all of which need quick and easy access to large amounts of archival information for research and other business collaboration. These pressures gave rise to disk-based archiving systems.
The new requirements include more efficient access to data and room to store rapidly growing volumes of new data, including e-mail and databases. Companies began looking for other kinds of storage that could meet these needs while keeping costs under control.
The first step in the evolution and modernization of archiving was the deployment of disk-based solutions built from inexpensive off-the-shelf hardware, SATA drives, and inexpensive NAS devices. These let companies keep their data on their own sites with convenient, fast access. However, these early implementations did not fully meet the unique requirements of an archive, such as scaling to petabytes to accommodate large data growth, supporting lengthy retention periods, and protecting data and ensuring its integrity beyond basic RAID 6 in order to meet compliance requirements. They also lacked the ability to automate the management of archiving processes.
This led to the next step: the introduction of purpose-built archiving systems that manage the data archiving process. These systems provide fast, manageable storage that scales easily in capacity, along with the tools and software needed to manage the archiving process. They also provide data protection beyond RAID 6, data retention management, data integrity validation, and WORM (write once, read many) capabilities.
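To make retention, integrity validation, and WORM behaviour concrete, here is a toy in-memory sketch in Python. It is an illustration only and does not describe how any particular archiving product works; the names `WormArchive` and `RetentionError` are hypothetical, and a real system would enforce these rules in storage hardware or firmware rather than in application code.

```python
import hashlib
from datetime import datetime, timedelta


class RetentionError(Exception):
    """Raised when an operation would violate WORM or retention rules."""


class WormArchive:
    """Toy archive: write once, read many, delete only after retention ends."""

    def __init__(self):
        self._records = {}

    def put(self, name, data, retention_days, now=None):
        # WORM: a name that has been written can never be overwritten.
        if name in self._records:
            raise RetentionError(f"{name} already written; overwrite refused")
        now = now or datetime.now()
        self._records[name] = {
            "data": bytes(data),
            "sha256": hashlib.sha256(data).hexdigest(),
            "retain_until": now + timedelta(days=retention_days),
        }

    def read(self, name):
        return self._records[name]["data"]

    def verify(self, name):
        """Recompute the digest and compare it with the one stored at write time."""
        record = self._records[name]
        return hashlib.sha256(record["data"]).hexdigest() == record["sha256"]

    def delete(self, name, now=None):
        # Retention: deletion is refused until the retention period has passed.
        now = now or datetime.now()
        if now < self._records[name]["retain_until"]:
            raise RetentionError(f"{name} is still under retention")
        del self._records[name]
```

For example, a record stored with `retention_days=3650` can be read and verified at any time, but a call to `delete` within those ten years raises `RetentionError`.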
The advantage of this archiving approach is that copies of the data are stored online and can be accessed quickly and easily. This eliminates the hassle of retrieving tapes from offsite and removes the need for specific hardware or backup software to restore data from tape. It is also very easy to search for and retrieve specific data on disk, and data can be moved from one site to another through simple replication. Capacity is easy to expand as data grows. The greatest benefit is the reduced demand on primary storage, which avoids frequent purchases of additional, expensive primary storage.
The disadvantage of a disk-based archive is the size of the initial purchase: the enterprise has to buy approximately 50 TB of disk at the outset. For many businesses this is neither practical nor cost-effective, because they may initially need only 1 to 2 TB. In effect, you pay for disk that may not be fully used for years. There are also the attendant costs of power, cooling, management, maintenance, and upgrades for these systems and their supporting infrastructure.
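The effect of buying capacity far ahead of need can be worked out with simple arithmetic. The sketch below uses the 50 TB purchase and the 1 to 2 TB initial need from the text; the price per terabyte is a hypothetical figure chosen only for illustration, and the function names are likewise assumptions.

```python
def utilization(needed_tb, purchased_tb):
    """Fraction of the purchased capacity that is actually in use."""
    return needed_tb / purchased_tb


def cost_per_used_tb(price_per_tb, needed_tb, purchased_tb):
    """Effective price of each terabyte in use when all capacity is paid for up front."""
    return price_per_tb * purchased_tb / needed_tb


PURCHASED_TB = 50        # upfront purchase size from the text
PRICE_PER_TB = 1000.0    # hypothetical price, for illustration only

for needed_tb in (1, 2):  # initial need from the text
    print(
        f"need {needed_tb} TB: "
        f"{utilization(needed_tb, PURCHASED_TB):.0%} utilized, "
        f"{cost_per_used_tb(PRICE_PER_TB, needed_tb, PURCHASED_TB):,.0f} per used TB"
    )
```

Whatever the actual price, an initial need of 1 to 2 TB means that only 2 to 4% of the purchased capacity is in use, so each terabyte in use effectively costs 25 to 50 times the nominal price per terabyte.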
Cloud storage archiving
Faced with smaller budgets, fewer employees, and a growing demand for data storage, companies are looking for other, more cost-efficient ways to expand their storage capacity. Today, businesses are looking at recent developments such as cloud computing and storage as a service as a possible way to meet their growing storage needs while also reducing spending on people, hardware, and infrastructure.
Service providers in this new area of cloud storage offer the enterprise virtually unlimited, scalable storage at a fixed price based on usage. This allows organizations to extend their storage as needed without the usual costs of expanding a disk environment, such as building out more infrastructure, hiring and training more people to manage the additional storage, and paying more for cooling and power. Among other benefits, this service model also provides a geographically distributed, multi-site architecture, which lets businesses with sites in different regions access their data over the network at any time. All of this access is transparent to users. These solutions provide secure connections for all data transfers and are often easy to integrate with the enterprise's existing architecture and applications.
These services also provide the ability to store data in accordance with data retention regulations. The main vendors of cloud storage archiving are Iron Mountain with its Virtual File Store (VFS), Nirvanix, Rackspace Hosting's Mosso cloud division, VaultScape, and Amazon with its S3 service. Iron Mountain has the most experience in storing, protecting, and archiving information.
The benefit of this type of service is that you can expand your storage almost instantaneously without the capital expenditure of buying hardware, extending the network architecture, or hiring and training more people to manage the extra storage. You also avoid the cost of upgrading and replacing storage hardware over time.
The disadvantage of this type of service is that the company's data is stored on someone else's systems rather than on the company's own. When you turn to the cloud, you therefore need to analyze carefully what type of data is archived there and how that data is protected. It is important to find a provider with a good record in archiving and secure data storage.
Improved access to archived data, together with greater reliability, makes users more willing to migrate data off primary storage. This reduces not only the cost of primary storage but also the cost of data protection and disaster recovery.