Analysis: Data integrity monitoring techniques in cloud storage


Cloud storage is an attractive service for outsourcing day-to-day data management, but once data is lost, the consequences are borne by the company that owns the data, not by the hosting provider. With that in mind, it is important to understand why data gets lost, how much responsibility the cloud service provider actually carries, how to use cloud storage safely, and which integrity monitoring methods and standards apply regardless of whether data is stored locally or in the cloud.

Integrity monitoring is essential in cloud storage services, and data integrity is a core task of every data center. Data corruption can occur at any level of the storage stack and on any type of media. Bit rot (weakened or lost data on the storage medium), controller failures, deduplication errors, and tape failures are the main causes of corruption across media types. Metadata corruption is a direct result of such failures, and metadata is also highly susceptible to software faults over and above hardware error rates. Unfortunately, one side effect of deduplication is that a corrupted file, block, or byte affects every piece of metadata associated with it.

In fact, corruption can arise in any part of the storage path. Migrating data between platforms, including migrating it to the cloud, can easily introduce corruption. A cloud storage system is itself a data center of hardware and software, vulnerable to failures and attacks that can corrupt data, as the widely reported Amazon cloud outage demonstrated. Many enterprises were not only hit by the long downtime; roughly 0.07% of customer data was lost outright. The reported cause was that Amazon EBS volumes were restored from inconsistent data snapshots, which means data inside Amazon's systems had been corrupted and customer data was lost as a result.

Whenever data is lost, especially important data, people tend to blame each other and shirk responsibility. In the IT industry this often leads to staff being dismissed, heavy financial losses for the company, and in the worst case corporate bankruptcy. The key, therefore, is to understand the legal liability of the cloud service provider and whether the service level agreement (SLA) actually commits it to measures that secure data and prevent loss. As legal documents, SLAs tend to favor the provider rather than the client. Many cloud providers offer different levels of data protection, but no storage vendor accepts liability for data integrity.

Cloud SLAs include provisions protecting the cloud provider that make it clear data loss or corruption is an anticipated scenario. Amazon's customer agreement for its web services, for example, stipulates that "we ... make no representations or warranties of any kind ... that the service offerings or third-party content will be uninterrupted, error-free, or free of harmful components, or that any content ... will be secure, not lost, or undamaged." The agreement even suggests that customers "frequently archive" their data. As mentioned earlier, responsibility for data integrity, whether in a data center, a private cloud, a hybrid cloud, or a public cloud, always rests with the company that actually owns the data.

A few common best practices let companies take advantage of the flexibility and accessibility of the cloud without compromising data security. Spread the risk: in the context of data protection, that means minimizing the possibility of data loss. Even if you store data in the cloud, it makes sense to keep a master copy and an on-site backup, so that access to the data does not depend on network performance or connectivity. Adhere to these basic practices, understand the details of the cloud provider's SLA, and put mechanisms in place to proactively monitor the integrity of the data, whether it lives in the cloud or locally.

One way to verify the integrity of a set of data is based on its hash value. A hash value is a short, effectively unique value computed from the data in a predefined way. Because the hash is derived from the data itself, if the hashes of two replicas are not identical, at least one of the replicas has been changed or corrupted.
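As a minimal illustration of this property, the following Python sketch (SHA-256 chosen here purely as an example algorithm) shows that identical data always yields the same digest, while even a one-byte difference between two replicas yields a completely different one:

```python
import hashlib

# Two replicas of the "same" data; the second differs by a single byte.
original = b"quarterly-report.tar contents"
corrupted = b"quarterly-report.tar content5"

# Identical inputs always produce identical digests ...
assert hashlib.sha256(original).hexdigest() == hashlib.sha256(original).hexdigest()

# ... while any change, however small, produces a completely different one.
print(hashlib.sha256(original).hexdigest())
print(hashlib.sha256(corrupted).hexdigest())
```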

Ensure that the cloud provider can supply a hash of the stored data, and compare it against the hash of your second copy, regardless of when and where that copy was made.
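In practice that comparison can be scripted. The sketch below assumes the provider-reported hash has already been obtained out of band (how depends entirely on the vendor's API; the file name and hash string in the usage note are placeholders), and simply recomputes the digest of the local replica in chunks for comparison:

```python
import hashlib

def verify_local_copy(path: str, provider_hash: str, algorithm: str = "sha256") -> bool:
    """Recompute the digest of a local replica and compare it with the
    hash reported by the cloud provider (obtained out of band)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in fixed-size chunks so arbitrarily large files fit in memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest().lower() == provider_hash.lower()

# Hypothetical usage with placeholder values:
# if not verify_local_copy("backup.tar", "9f86d081884c7d65..."):
#     print("Replica mismatch: at least one copy is changed or corrupted.")
```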

Manual data monitoring at this level would be cumbersome. Fortunately, other methods are available, including automated header checks. Spectra Logic and other members of the Active Archive Alliance provide data integrity tools for automated monitoring systems.

While active archiving is one way to monitor data integrity, a widely adopted cloud standard is still needed to support integrity monitoring and interoperability. Interoperability between different storage devices is critical because not all data centers or cloud hosting infrastructures use the same equipment. The Cloud Data Management Interface (CDMI) standard, published by the Storage Networking Industry Association (SNIA) in 2010, addresses this: a CDMI-compliant system can query another compliant system for the hash value of an object to verify that two copies of the data are identical. By monitoring the integrity of master replicas and backup copies in this way, and checking the hash values frequently, businesses can verify whether the copies stored in the cloud have been corrupted. Industry standards such as CDMI not only ensure interoperability between heterogeneous compliant systems but also provide a convenient mechanism for data integrity monitoring.
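As a hypothetical sketch of how such a cross-system check might look, the snippet below queries two CDMI endpoints over CDMI's RESTful HTTP interface and compares the reported object hashes. The header and metadata names (X-CDMI-Specification-Version, cdmi_hash, cdmi_value_hash) come from the CDMI specification, but the URLs, container paths, and omission of authentication are assumptions for illustration only:

```python
import requests  # third-party HTTP library: pip install requests

# Assumed CDMI endpoints for the master copy and the cloud replica.
PRIMARY = "https://cloud-a.example.com/cdmi/backups/dataset.bin"
REPLICA = "https://cloud-b.example.com/cdmi/backups/dataset.bin"

HEADERS = {
    "X-CDMI-Specification-Version": "1.0.2",
    "Accept": "application/cdmi-object",
}

def fetch_cdmi_hash(url: str) -> str:
    """Return the stored hash of a CDMI data object.

    Per the CDMI spec, when the data system metadata item cdmi_value_hash
    (e.g. "SHA256") is set on an object, the server exposes the computed
    digest in the storage system metadata item cdmi_hash.
    """
    # "?metadata" asks the server to return only the object's metadata field.
    response = requests.get(url + "?metadata", headers=HEADERS)
    response.raise_for_status()
    return response.json()["metadata"]["cdmi_hash"]

if fetch_cdmi_hash(PRIMARY) == fetch_cdmi_hash(REPLICA):
    print("Master copy and cloud replica agree.")
else:
    print("Hash mismatch: at least one copy has been changed or corrupted.")
```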

It has been hard to miss the cloud industry in the media lately, especially after Iron Mountain discontinued its basic cloud storage service and the Amazon outage discussed earlier. The purpose of this article, however, is not to debate the wisdom of cloud storage platforms, but to argue that researching and implementing a cloud strategy involves more factors than the storage cost per gigabyte. Implemented correctly, cloud storage offers many benefits to any enterprise; eliminating its disadvantages requires an intelligent data management strategy. No matter where or how data is stored, its accessibility and recoverability when needed are absolutely critical, and guaranteeing that is the core task of all data integrity monitoring and validation.
