Managing petabyte-scale storage is fundamentally different from managing a traditional big data infrastructure. Online photo-sharing site Shutterfly currently manages some 30 petabytes of data, and here shares its experience of taming the "data beast".
Everyone is talking about big data analytics and the business intelligence it can deliver. But before companies can take advantage of that data, they have to figure out how to store it, and managing storage at petabyte scale or beyond is fundamentally different from managing traditional large data sets.
Shutterfly is an online photo-sharing site. It places no limit on the number of photos a user can store, keeps every photo at its original resolution rather than downscaling it, and promises never to delete a photo.
Shutterfly senior vice president and chief technology officer Neil Day said: "Our photo archive has passed 30 petabytes, and the storage pool is growing much faster than the user base. When we win a new customer, the first thing they do is upload a large batch of photos, and once they come to like us, they upload even more."
To put that in perspective: one petabyte equals 1,000 terabytes, or one million gigabytes. The archive of 20 years of observations from NASA's Hubble Space Telescope comes to just over 45 terabytes, and a single terabyte can hold roughly 17,000 hours of audio compressed at 128 Kbps.
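As a quick sanity check, the arithmetic behind these figures can be reproduced in a few lines. This is a rough sketch using decimal units; the archive sizes quoted above come from the article, not from this calculation.

```python
# Back-of-the-envelope unit conversions (decimal units assumed).
PETABYTE_TB = 1_000          # 1 PB = 1,000 TB
PETABYTE_GB = 1_000_000      # 1 PB = 1,000,000 GB

# Hours of 128 Kbps audio that fit in one terabyte.
bitrate_bits_per_sec = 128_000                 # 128 Kbps
bytes_per_hour = bitrate_bits_per_sec / 8 * 3600
terabyte_bytes = 1_000_000_000_000
hours_per_tb = terabyte_bytes / bytes_per_hour

print(f"1 PB = {PETABYTE_TB:,} TB = {PETABYTE_GB:,} GB")
print(f"~{hours_per_tb:,.0f} hours of 128 Kbps audio per TB")  # ~17,361 hours
```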
The characteristics of a petabyte-scale infrastructure
"A petabyte-scale infrastructure is completely different from traditional storage," Day said. "It is very difficult to build and maintain. The management pressures of a petabyte infrastructure are nothing like those of a traditional large data set; the difference is like working with data on a laptop versus working with data on a RAID array."
Day joined Shutterfly in 2009, when storage had become the company's biggest expense and was still growing rapidly. The cost was climbing not just with raw capacity, but with the number of people needed to manage it.
"Every additional n petabytes of storage meant another storage administrator and the corresponding physical and logical infrastructure," Day said. In stores this large, the volume of data keeps hitting new highs, and the people who run the archives are constantly troubleshooting hardware. Everyone wants to solve that basic problem, because everyone knows a failed drive can mean a service outage. So how do you keep the data continuously available without letting performance fall?
Scaling RAID is a tricky problem
The standard answer to failure is redundancy through replication, usually in the form of a RAID array. At large scale, though, RAID can create more problems than it solves. In a traditional RAID setup, each piece of data is mirrored onto separate disk arrays to preserve integrity and availability, which means every stored item drives a multi-fold expansion in the capacity required. And as the disks in a RAID array grow larger, the time needed to rebuild a failed disk onto its replacement keeps growing too.
"In fact, we had no operational problems with RAID," Day said. "The problem is that as disk capacity grows, the time to recover from a failed component grows with it; the time to regenerate parity is proportional to the size of the data set. We are now deploying 1 TB and 2 TB disks in our infrastructure, which makes the window of reduced redundancy even longer. The trend is not moving in the right direction."
Reliability and availability are critical to Shutterfly, which is why the company relies on enterprise-class storage. But fast-growing storage costs made commodity-based alternatives increasingly attractive, Day said. While Day and his team were investigating technologies that could help Shutterfly control costs, they came across a technique called erasure coding, and it caught their interest.
Creating next-generation storage using erasure codes
Reed-Solomon codes were originally used as forward error correction (FEC) codes for sending data over unreliable channels, such as data transmission from deep-space probes. The technique is also widely used on CDs and DVDs to cope with dust, scratches, and other disc damage. A number of storage vendors are now building erasure codes into their products. With erasure coding, a piece of data is broken into many small fragments, each of which is useless on its own, and the fragments are dispersed across different hard drives or servers. Only a subset of the fragments is needed to recover the data, so even if a drive failure wipes out several fragments at once, the original data can still be reassembled in full. In other words, you no longer need to keep multiple copies of the data: a single instance can guarantee its integrity and availability.
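To sketch the k-of-n idea (any threshold-sized subset of fragments reconstructs the data), here is a toy polynomial-based erasure code in Python. It illustrates the principle only; it is not the Reed-Solomon implementation used by Shutterfly or any vendor mentioned here, and the field and fragment layout are chosen purely for brevity.

```python
# A toy (k, n) erasure code over the prime field GF(257): any k of the n
# fragments reconstruct the original k data bytes.
PRIME = 257  # small prime > 255 so every byte value fits in the field

def _lagrange_eval(points, x):
    """Evaluate the unique degree-(k-1) polynomial through `points` at x, mod PRIME."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

def encode(data, n):
    """Split k data bytes into n fragments; fragment x carries p(x), where p(i) = data[i]."""
    base = list(enumerate(data))                      # p(0..k-1) is the data itself
    return [(x, _lagrange_eval(base, x)) for x in range(n)]

def decode(fragments, k):
    """Rebuild the original k bytes from any k surviving fragments."""
    subset = fragments[:k]
    return bytes(_lagrange_eval(subset, i) for i in range(k))

original = b"photo"                       # k = 5 data symbols
frags = encode(original, n=8)             # width 8, threshold 5
survivors = [frags[1], frags[3], frags[4], frags[6], frags[7]]  # any 3 fragments lost
assert decode(survivors, k=len(original)) == original
```

The key property is that the encoded object occupies n/k times its original size (here 1.6x) rather than the 2x or more that full copies would require, yet it survives the loss of any n minus k fragments.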
Chicago-based Cleversafe is one of the earlier companies to build products around erasure codes. The company has gone a step further by adding storage-location information to the coding, a dispersal technique that lets users keep data blocks, or slices, in geographically separate locations such as multiple data centers.
Because each slice is mathematically useless on its own, the data remains private and secure. And because the technique stores only a single instance of the data rather than the multiple copies RAID requires, the company says it can cut storage costs by up to 90% while keeping the overhead needed to ensure integrity and availability to a minimum.
Russ Kennedy, Cleversafe's vice president of product strategy, marketing, and customer solutions, said: "You don't need every block of data when you put the pieces back together. The number of slices you generate is what we call the width; the minimum number needed to recover the data is what we call the threshold. The difference between the two determines how reliable the data is: you can lose nodes and drives and still recover the original data. The best you can get with RAID is dual parity, where you can lose two drives. In our solution, you can lose up to six."
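The width/threshold trade-off can be expressed in a couple of lines. The specific values below are assumptions chosen to match the "lose up to six drives" figure in the quote, not published Cleversafe parameters.

```python
# The width/threshold relationship Kennedy describes, with assumed example values.
width = 16        # fragments written per object (assumed for illustration)
threshold = 10    # fragments required to read the object back (assumed)

fault_tolerance = width - threshold   # drives or nodes that can be lost
expansion = width / threshold         # raw-capacity overhead vs. the data size

print(f"can lose up to {fault_tolerance} fragments")   # 6, matching the quote
print(f"storage expansion factor: {expansion:.1f}x")   # 1.6x vs. 2x+ for mirroring
```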
Erasure coding is a software-based technology, which means it can run on commodity hardware, driving costs down further as the system scales.