Big data storage platforms must be resilient

Source: Internet
Author: User
Keywords: big data storage

At present there is a great deal of discussion about "big data", and the accepted view summarizes it by a series of characteristics: massive data scale (volume), fast and dynamic data flows (velocity), diverse data types (variety), and great data value (value). In practice, however, big data should first be considered in terms of the "big" itself: massive data scale.

The "big" in big data

"Big" is a relative concept. For an in-memory database such as SAP HANA, 2TB may already be large, while for a search engine like Google, only data at the EB scale really counts as big data.

"Big" is also a rapidly changing concept. The USP storage virtualization platform released by HDS in 2004 could manage up to 32PB of externally attached storage. At the time, most people thought USP's capacity was more than anyone would need. Today, however, most enterprises hold PB-scale data, and some search engine companies store data at the EB scale. Some cloud companies even promote file-sharing and home data backup services, since many households now keep terabytes of data.

Capacity must be "big"

From this perspective, the primary requirement for big data storage is expandable capacity: big data demands more capacity than users currently have. We are now in the PB era, and the EB era is coming. In the past, many enterprises planned their IT systems in five-year cycles, over which storage capacity might merely double. Enterprises now need growth plans that span orders of magnitude, such as from the PB level to the EB level, to keep the business growing without interruption.

This calls for storage virtualization, by far the most important and effective technology for improving storage efficiency. It equips existing storage systems with efficiency tools such as automatic tiering and thin provisioning. With virtualized storage, users can consolidate structured and unstructured data from internal and external storage systems onto a single storage platform. Once all storage assets become a single pool of storage resources, automatic tiering and thin provisioning can be applied across the entire storage infrastructure. Users can then easily maximize capacity reclamation and utilization and extend the life of existing storage systems, significantly improving the flexibility and efficiency of IT systems in the face of unstructured data growth.

Midsize enterprises can expand the capacity of HUS to nearly 3PB without impacting performance, and can configure the system quickly through its dynamic virtual controller. Large enterprises, through the virtualization capabilities of HDS VSP, can create storage pools of up to 0.25EB. With unstructured data growing so rapidly, how can file and content data be scaled in the future?
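The five-year planning cycle described above comes down to simple compound-growth arithmetic. The sketch below projects capacity over a planning horizon; the starting capacity and annual growth rate are illustrative assumptions, not figures from this article.

```python
# Sketch: projecting storage capacity over a multi-year planning cycle.
# The starting capacity and growth rate are illustrative assumptions.

def project_capacity(start_tb: float, annual_growth: float, years: int) -> float:
    """Return projected capacity in TB after compound annual growth."""
    return start_tb * (1 + annual_growth) ** years

# Example: 1 PB (1024 TB) growing 60% per year over a five-year cycle.
start = 1024.0  # 1 PB expressed in TB
projected = project_capacity(start, 0.60, 5)
print(f"Projected capacity after 5 years: {projected / 1024:.1f} PB")  # -> 10.5 PB
```

Even at this assumed growth rate, capacity grows by an order of magnitude in one planning cycle, which is why a plan that only doubles capacity falls short.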

Growing Big Data

Unlike structured data, much unstructured data needs to be accessed through Internet protocols and stored on a file or content platform, in the form of files or objects. The capacity of most file and content platforms used to be measured in terabytes; it now needs to scale to petabytes, and in the future to the EB level.

Traditional file systems based on UNIX and Linux typically store information about files, directories, and other file system objects in index nodes (inodes). An inode is not the data itself but metadata describing it: ownership, access mode, file size, timestamps, file pointers, file type, and so on. Because a traditional file system has a limited number of inodes, the number of files, directories, or objects it can hold is also limited. HNAS and HCP instead use object-based file systems, which lets their capacity scale to petabytes and lets them hold billions of files or objects.

HNAS and HCP gateways deployed on VSP or HUS not only take full advantage of the scalability of modular storage, but also benefit from the common management platform, Hitachi Command Suite. Together they provide an excellent framework for storing big data.

A big data storage platform must be able to expand continuously without interruption and span technologies across different eras, with data migration confined to a minimal scope and performed in the background. Big data is well protected if it is replicated once: the platform can then track changes through versioning, without backing up all of the data again every time the data changes. All HDS products support background data movement and tiering, can grow the capacity of VSP and HUS data pools, HNAS file systems, and HCP, and automatically adjust data layout. Traditional file systems and block storage devices do not support this kind of dynamic expansion.
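The inode limit described above can be observed directly on any UNIX-like system. This minimal sketch reads the inode counters via `os.statvfs`; the root mount point queried here is an assumption, and some file systems (e.g. btrfs) report zero because they allocate inodes dynamically.

```python
import os

# Sketch: inspect the inode (index node) limits of a mounted file system.
# On traditional UNIX/Linux file systems, f_files is the fixed total number
# of inodes, which caps how many files and directories the system can hold.
st = os.statvfs("/")        # assumed mount point; adjust as needed
total_inodes = st.f_files   # total inodes the file system was created with
free_inodes = st.f_ffree    # inodes still available
used_inodes = total_inodes - free_inodes
print(f"inodes: {used_inodes} used of {total_inodes}")
```

When `used_inodes` approaches `total_inodes`, no new files can be created even if free blocks remain, which is exactly the constraint object-based file systems are designed to remove.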
A big data storage platform must also be resilient, allowing no single point of failure that could force the big data to be rebuilt. HDS supports redundant configurations of VSP and HUS and provides the same resiliency for HNAS and HCP nodes. Finally, a big data storage platform needs to integrate file, block, and content data under the unified Hitachi Command Suite management platform to meet the needs of big data processing and applications.
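The version-tracking idea described above, recording a new version only when data actually changes rather than re-copying everything, can be sketched with a small content-addressed store. The class and method names below are illustrative only, not an HCP API.

```python
import hashlib

# Sketch of versioned object storage: a new version is recorded only when
# an object's content actually changes, so unchanged data is never re-copied.
class VersionedStore:
    def __init__(self):
        self._versions = {}  # object name -> list of content hashes

    def put(self, name: str, data: bytes) -> int:
        """Store an object; return its current version count."""
        digest = hashlib.sha256(data).hexdigest()
        history = self._versions.setdefault(name, [])
        if not history or history[-1] != digest:
            history.append(digest)  # content changed: record a new version
        return len(history)

store = VersionedStore()
store.put("report.csv", b"v1 data")
store.put("report.csv", b"v1 data")      # unchanged: no new version
n = store.put("report.csv", b"v2 data")  # changed: second version
print(n)  # -> 2
```

Because identical content hashes to the same digest, a re-upload of unchanged data costs nothing, while each real change adds only one incremental version instead of a full backup.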
