The industry defines cloud computing in many ways, but in every well-known cloud computing model storage is a fundamental supporting component that cannot be bypassed, and cloud storage, as one branch of cloud computing services, puts storage front and center. At the same time, few vendors solve the storage problem well at the level of the underlying architecture, and storage faces a number of bottlenecks on the road to cloud computing.
At an Intel media training camp, Chang, a server platform product manager at Intel (China) Limited, said: "Today, when we talk about cloud computing and virtualization, storage is a very difficult problem in the cloud architecture."
Addressing the challenges of unstructured data growth
In the cloud storage system envisioned by Intel, users' actual storage requirements fall into two types: structured data and unstructured data.
Structured data is generally stored in databases and is often simply called database data. Business-critical applications such as Oracle and SAP are typically built on this type of data, which can be expressed and implemented with the two-dimensional table structure of a database. Each read usually involves a small data block, generally 4 KB or 8 KB, but reads and writes are very frequent. Because every read or write incurs the latency of the hard disk head seeking to the right track, traditional storage systems are heavily optimized for large-scale concurrency and high volumes of reads and writes in order to meet the access requirements of structured data.
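The arithmetic behind this trade-off can be sketched in a few lines of Python. This is only an illustration: the 8 ms positioning latency and the 10,000 IOPS target are assumed values chosen for the example, not figures from Intel, and the function names are hypothetical.

```python
import math

# Assumed, illustrative figure (not from the article): average positioning
# latency (seek plus rotation) of a single hard disk, in milliseconds.
ASSUMED_ACCESS_LATENCY_MS = 8.0


def random_iops(access_latency_ms):
    """Upper bound on random I/Os per second for one disk when every
    request pays the full positioning latency."""
    return 1000.0 / access_latency_ms


def throughput_mb_per_s(iops, block_size_kb):
    """Data rate delivered at a given IOPS and block size."""
    return iops * block_size_kb / 1024.0


def disks_needed(target_iops, per_disk_iops):
    """Number of disks that must work concurrently to reach target_iops."""
    return math.ceil(target_iops / per_disk_iops)


per_disk = random_iops(ASSUMED_ACCESS_LATENCY_MS)
for block_kb in (4, 8):
    mb_s = throughput_mb_per_s(per_disk, block_kb)
    print(f"{block_kb} KB blocks: {per_disk:.0f} IOPS, {mb_s:.2f} MB/s per disk")

# Hypothetical target of 10,000 IOPS for a database workload.
print("disks needed:", disks_needed(10_000, per_disk))
```

Under the assumed latency, a single disk delivers less than 1 MB/s at these block sizes, which is why such systems rely on many disks working concurrently rather than on the raw transfer rate of any one disk.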
In addition, because this storage is the cornerstone of key business applications, data security must be guaranteed. Storage systems designed for structured data therefore also employ extensive data protection measures to safeguard the data behind an enterprise's key business operations.
IOPS, the metric that measures database read/write capability, has long been the ultimate pursuit of enterprise storage system design. However, as cloud computing becomes pervasive, social networking booms, and the mobile Internet and the Internet of Things thrive, users are suddenly discovering that the data they face is no longer mainly structured: unstructured and semi-structured big data now poses another challenge to traditional IT systems.
The figure above is a forecast of data growth trends between 2010 and 2014 published by IDC. The yellow block at the bottom represents structured data generated by traditional enterprise databases, with an annual growth rate of only 23.6%. The red block above the yellow represents backup data from enterprise systems and data warehouses; as the figure shows, its growth is also modest, at 24.2% per year. The gray block above the red represents unstructured data such as archives, which grows noticeably faster, at 54.8% per year. The green block at the top grows fastest, at 75.6% per year; it comes from the content warehouse, which holds a variety of file data from applications such as the Web, e-mail, social networking, and document sharing.
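To see how far apart these rates pull the four categories, the following Python sketch compounds each quoted rate over the forecast period. It assumes the rates compound once per year for four years (2010 to 2014); that reading is an assumption of this example, and the function name is hypothetical.

```python
# Annual growth rates quoted from the IDC forecast described above.
ANNUAL_GROWTH = {
    "structured (database)": 0.236,
    "backup and data warehouse": 0.242,
    "unstructured (archive)": 0.548,
    "content warehouse": 0.756,
}


def growth_multiple(annual_rate, years):
    """Factor by which a data volume grows after compounding annually."""
    return (1.0 + annual_rate) ** years


YEARS = 4  # 2010 to 2014, assuming the rate compounds once per year

for category, rate in ANNUAL_GROWTH.items():
    multiple = growth_multiple(rate, YEARS)
    print(f"{category}: x{multiple:.2f} after {YEARS} years")
```

Under that assumption, structured data slightly more than doubles over the period, while content warehouse data grows more than ninefold.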
Three major cloud storage solutions
With the explosive growth of the data universe, traditional storage systems designed for structured data can no longer cope with the enormous storage requirements of cloud platforms. Against this background, cluster storage has reached the peak of its development.
Cluster storage uses a parallel distributed file system and its algorithms to spread the workload across the nodes of the cluster. The storage nodes coordinate and work as one, achieving a 1+1>2 effect, while the cluster presents a single access interface and a single management interface, so that users can easily use and manage all of their data. In cluster storage, the individual data node is the hardware foundation for the distributed file system and the management software, and its performance and reliability directly affect the overall performance of the storage platform.
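How a cluster can spread files across nodes while still presenting one entry point can be sketched as follows. This is a deliberately simple hash-based placement scheme written for illustration; it is not Intel's design or the algorithm of any particular distributed file system, and the node names, the `place` function, and the replica count are all hypothetical.

```python
import hashlib
from collections import Counter


def place(object_name, nodes, replicas=2):
    """Pick `replicas` distinct nodes for an object by hashing its name.

    The caller only supplies a name; which nodes hold the data is decided
    here, which is what lets the cluster expose a single interface.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


# Hypothetical four-node cluster.
nodes = ["node-a", "node-b", "node-c", "node-d"]

# Place 10,000 hypothetical files and count how many copies each node holds.
load = Counter()
for i in range(10_000):
    for node in place(f"file-{i}.dat", nodes):
        load[node] += 1

print(dict(load))
```

Because placement depends only on the object name and the node list, any client computes the same answer, and the copies spread roughly evenly over the nodes. Real systems add rebalancing and failure handling, which this sketch omits.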
For unstructured data applications in different cloud storage environments, Intel proposes three solutions, each optimized for its application.
I. Large-scale object storage usage pattern
Object storage is typically used for the content warehouse, where it holds large amounts of file data from the Web, e-mail, social networks, and document sharing systems. This type of storage does not place stringent demands on system performance, but it still requires reasonable response times. In addition, because such systems are built at large scale, overall energy consumption and cost must be kept in balance.
For node hardware designed to meet object storage needs, Intel recommends the Xeon E5 processor product family. The Xeon E5 is a key Intel innovation for the two-socket server market, built on the new Sandy Bridge microarchitecture and supporting up to 8 cores. At the Intel fall IDF conference, which recently closed in San Francisco, Intel demonstrated engineering samples of Sandy Bridge-based Xeon E5 processors as well as Xeon E5 server systems. The diagram above shows the recommended configuration for this type of system node.
II. Backup and archive storage usage pattern
Compared with the object storage usage pattern, backup and archive systems have more relaxed requirements for data response latency; users care more about data reliability, energy consumption, and cost per unit of storage space. For this usage pattern, Intel recommends node optimization schemes based on the Xeon E3 processor and on the Intel Celeron/Core i3 processor series.
III. Large-scale analysis (Hadoop) usage pattern
Hadoop is often used to analyze and process massive numbers of files, work that typically demands fast response times and strong processing power. The diagram above shows the node optimization architecture that Intel recommends, based on the E5 processor family.
(Responsible editor: The good of the Legacy)