Today, big data is becoming an important tool for many enterprises, and as data volumes keep accelerating, how users deploy storage and data management matters more than ever. Faced with challenges such as deploying analysis tools and managing very large data files, users also need to find storage solutions better suited to these workloads.
Using metadata and policy management
Policy management is another important function, with metadata used to implement or drive many of its features. This provides a flexible structure for unstructured data and removes the constraints associated with structured data management.
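As a minimal sketch of what metadata-driven policy management can look like, the snippet below maps a file's metadata to a storage action. The field names, thresholds, and tier names are illustrative assumptions, not part of any particular product:

```python
from dataclasses import dataclass

@dataclass
class FileMetadata:
    name: str
    size_bytes: int
    days_since_access: int  # illustrative metadata field

def apply_policy(meta: FileMetadata) -> str:
    """Map a file's metadata to a storage action via simple policy rules."""
    if meta.days_since_access > 365:
        return "archive"           # cold data moves to archive storage
    if meta.days_since_access > 30:
        return "capacity-tier"     # warm data moves to high-capacity disk
    return "performance-tier"      # hot data stays on fast storage

print(apply_policy(FileMetadata("log.csv", 10_000, 400)))  # archive
```

Because the rules operate only on metadata, they apply equally well to unstructured files, without requiring a fixed schema.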
Find the right medium
Finding the right storage medium can help users meet their needs. Hard disk drives (HDDs) have long been popular because they balance performance, capacity, storage density, and cost for many applications. This trend will continue as users need to keep more data for longer periods.
Big data can also benefit from today's solid-state drive (SSD) solutions, which use dynamic random access memory (DRAM) or NAND flash memory, or a combination of both, to meet bandwidth requirements. SSDs can be used to store metadata and other frequently accessed data. The old veteran, tape, will also play several roles in big data, including migrating large amounts of data, archiving, and providing backup for data on disk.
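The division of labor described above can be sketched as a simple placement table: SSD for metadata and hot data, HDD for general capacity, tape for archive and backup. The class names and the default are illustrative assumptions:

```python
# Illustrative mapping of data class to storage medium, following the
# roles described in the text (assumed class names, not a real product).
MEDIUM_BY_CLASS = {
    "metadata": "ssd",    # small, frequently accessed
    "hot":      "ssd",    # bandwidth-sensitive working set
    "general":  "hdd",    # balanced capacity and cost
    "archive":  "tape",   # long-term retention
    "backup":   "tape",   # disk-to-tape protection copies
}

def place(data_class: str) -> str:
    """Return the medium for a data class, defaulting to HDD."""
    return MEDIUM_BY_CLASS.get(data_class, "hdd")

print(place("metadata"), place("archive"), place("general"))
```

In a real system this decision would be made by the policy engine described earlier, using access metadata rather than a static label.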
Reduce resource consumption for big data
Data deduplication is not always an effective way to maximize big data capacity. Users should consider other tools and techniques to ease the pressure of storing and protecting growing datasets.
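For reference, the core idea behind deduplication is to store identical blocks only once, keyed by a content hash. The sketch below is a toy in-memory version; the block size and data structures are illustrative assumptions:

```python
import hashlib

def dedupe(blocks):
    """Store each unique block once; return the store and per-block refs."""
    store = {}   # digest -> block contents (stored exactly once)
    refs = []    # ordered references into the store, one per input block
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is kept
        refs.append(digest)
    return store, refs

blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]  # one duplicate block
store, refs = dedupe(blocks)
print(len(refs), len(store))  # 3 blocks referenced, 2 stored
```

Whether this saves space depends entirely on how much of the dataset actually repeats, which is why deduplication alone is not always effective for big data.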
Rethinking how, when, where, and why data is protected is one way to reduce storage consumption. Data compression (real-time or asynchronous), which applies different algorithms to shrink storage requirements, is another.
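The trade-off behind real-time versus asynchronous compression is largely CPU cost versus space saved. The sketch below uses zlib purely as a stand-in for whatever algorithm a storage system applies (an illustrative choice, not a recommendation):

```python
import zlib

# Repetitive sample data; real savings depend on the data's redundancy.
data = b"sensor_reading,42.0\n" * 10_000

fast = zlib.compress(data, level=1)  # cheaper CPU, typically larger output
best = zlib.compress(data, level=9)  # more CPU, typically smaller output

assert zlib.decompress(best) == data  # compression must be lossless
print(len(data), len(fast), len(best))
```

A low level (or hardware offload) suits real-time inline compression, while the highest levels are better applied asynchronously, after the write has been acknowledged.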
Consider storage-system options
Some big data solutions for analytics tools use a cluster or grid configuration, with internal or private storage and industry-standard x86 or IA64 servers running the application software. Big data applications can also leverage existing storage systems that are optimized for different usage scenarios. Some storage systems built for traditional high-performance computing may suit bandwidth-intensive concurrent or parallel-access applications that use block or file access.
Protect and serve big data
Protecting big data requires basic reliability, availability, and serviceability. Users must also ensure data integrity and durability, and run background data checks that catch problems such as checksum or protection errors and bit corruption. These background checks must be transparent to normal operation and must correct errors before they grow into larger problems.
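A minimal sketch of such a background check: each object carries a checksum computed at write time, and a periodic scrub recomputes it to detect silent bit corruption. The in-memory store and CRC32 choice are illustrative assumptions:

```python
import zlib

def write(store, key, data):
    """Store data together with a checksum computed at write time."""
    store[key] = (data, zlib.crc32(data))

def scrub(store):
    """Return keys whose data no longer matches its stored checksum."""
    return [k for k, (data, crc) in store.items() if zlib.crc32(data) != crc]

store = {}
write(store, "obj1", b"payload")
write(store, "obj2", b"other payload")
# Simulate silent bit corruption: data changes, stored checksum does not.
store["obj1"] = (b"payloaX", store["obj1"][1])
print(scrub(store))  # ['obj1']
```

A production system would run the scrub continuously at low priority (keeping it transparent to normal I/O) and repair flagged objects from a redundant copy or parity.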
Users should also re-examine RAID (redundant array of independent disks) levels to optimize their big data storage solutions. Factors to consider include how many drives are in a RAID pool or group, the block or I/O size, and the size and type of devices being used, all of which can be tuned to match the workload.
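One concrete way to weigh the number of drives per RAID group is to compute usable capacity per level. The formulas below are the standard ones; the drive size and group widths are illustrative:

```python
def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity of a RAID group, by level."""
    if level == "raid10":
        return drives * drive_tb / 2       # mirrored pairs: half is usable
    if level == "raid5":
        return (drives - 1) * drive_tb     # one drive's worth of parity
    if level == "raid6":
        return (drives - 2) * drive_tb     # two drives' worth of parity
    raise ValueError(f"unsupported level: {level}")

for n in (6, 12):
    print(n, usable_tb("raid6", n, 8.0))   # wider groups lose less to parity
```

Wider groups reduce parity overhead but lengthen rebuilds, which is one reason group width, device size, and I/O size need to be weighed together.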
(Responsible editor: Fumingli)