Driven by Intel, the communications bandwidth and computing power of IT systems have followed Moore's Law to record highs, doubling roughly every 12 to 18 months. At the same time, IDC's latest "Digital Universe" study predicts that data will grow even faster than Moore's Law, reaching 1.8 ZB in 2011, and that over the next 10 years enterprises will manage 50 times the data they do today, while file volumes will increase 75-fold. Against this backdrop of a rapidly expanding digital universe, the concept of "big data" was born.
Defining Big Data
In fact, big data and cloud computing are two closely related concepts. Although the industry has not yet settled on an official definition of big data, vendors have in practice reached a consensus on what it means.
Pat Gelsinger, president and chief operating officer of EMC Information Infrastructure Products, believes big data has three elements. First, big data involves large datasets, generally around 10 TB in scale; when several datasets are brought together, the total can reach petabytes. Second, these datasets often come from different applications and data sources, so the system must integrate semi-structured, unstructured and structured data well. Finally, big data is real-time and iterative.
Benjamin Woo, vice president of IDC's worldwide storage and big data research, proposed that big data has four basic elements: Volume, Variety, Velocity, and Value. First, the data is massive in volume: big data is a huge dataset contributed by a large number of people and spanning a wide variety of formats and features. The value of this data is very high, both for companies and for individual users around the world. In addition, systems are expected to deliver the data very quickly. These four Vs summarize the characteristics of big data.
In addition, EMC offers a deeper interpretation of the relationship between big data and the cloud: big data and the cloud are two different concepts, but there are many intersections between them. The underlying principles supporting big data and cloud computing are the same: scale, automation, resource allocation and self-healing, so there is in fact a great deal of synergy between big data and the cloud.
"When we build our cloud facilities, we think about what kind of apps we should run on the cloud, and the big data is a very typical application that runs on the cloud," he said. For example, although e-mail is one of the applications on the cloud, it can also be detached from the cloud architecture, but large data applications must be architected on the cloud infrastructure. This is the relationship between the two-big data is inseparable from the cloud. "said Pat Gelsinger.
Traditional Storage Bottlenecks
Today, the concept of big data has become increasingly clear, but how to store big data remains a problem for every user. The whole IT field has developed rapidly: many technologies and architectures that were new 20 years ago are now obsolete or have disappeared entirely, many of today's new technologies will meet the same fate 20 years from now, and this churn has been more pronounced in storage than in almost any other area.
SAN and NAS, the key architectures in storage, have now been developing for nearly 20 years and replaced DAS as the mainstream enterprise storage architecture about 10 years ago. However, SAN and NAS platforms are essentially improvements on DAS and do not break through the bottlenecks of traditional storage technology. Traditional storage architectures still have fundamental flaws:
First, the traditional storage architecture is static, with scalability limits built into its design: expansion usually means only adding disks, while the backplane, memory and processor resources cannot be extended. If an enterprise wants to keep up with growing capacity and performance requirements, it has to spend heavily, and the data risk keeps rising. The end result is that users must manage ever larger and more complex storage, while the organization and staffing this requires cannot grow at the same pace.
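To make that scale-up limitation concrete, the sketch below uses purely hypothetical numbers (per-disk throughput and a fixed controller ceiling are assumptions, not measurements of any particular array) to show how adding disks behind a non-expandable controller eventually stops adding usable performance:

```python
# Illustrative sketch: a scale-up array with a fixed controller/backplane.
# All numbers are hypothetical assumptions chosen only to show the trend.

DISK_THROUGHPUT_MBPS = 150        # assumed sequential throughput per disk
CONTROLLER_LIMIT_MBPS = 2_000     # assumed fixed controller/backplane ceiling

def usable_throughput(num_disks: int) -> float:
    """Aggregate throughput is capped by the non-expandable controller."""
    raw = num_disks * DISK_THROUGHPUT_MBPS
    return float(min(raw, CONTROLLER_LIMIT_MBPS))

for disks in (4, 8, 16, 32, 64):
    print(f"{disks:3d} disks -> {usable_throughput(disks):6.0f} MB/s usable")

# Beyond roughly 14 disks the controller is the bottleneck: capacity keeps
# growing, but performance per disk keeps falling -- the static-architecture
# problem described above.
```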
Volumes are the most basic building block of all storage technologies, providing data services to users' front-end applications, and nothing shows the need for a new storage model more clearly than how storage volumes are used. In an ideal "cloud" system environment, volumes should be flexible and free; it is hard to find a good reason to tie data to a specific location. Given sufficient security and reliability, individuals and applications should be able to access files and folders easily from any geographic location, just as if the data were local, and the corresponding storage volumes should grow seamlessly as the application scales.
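As a purely conceptual illustration of that contrast (the class names and doubling growth policy below are invented for this sketch and do not represent any vendor's API), a fixed-size volume fails once it fills up, while an elastic volume grows transparently with the application's data:

```python
# Conceptual sketch only: FixedVolume and ElasticVolume are hypothetical,
# meant to contrast fixed volumes with the elastic volumes described above.

class FixedVolume:
    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.used_gb = 0

    def write(self, size_gb: int) -> None:
        if self.used_gb + size_gb > self.capacity_gb:
            raise IOError("volume full: manual migration/expansion required")
        self.used_gb += size_gb


class ElasticVolume(FixedVolume):
    def write(self, size_gb: int) -> None:
        # Grow transparently as the application's data set grows.
        while self.used_gb + size_gb > self.capacity_gb:
            self.capacity_gb *= 2
        self.used_gb += size_gb


app_data = ElasticVolume(capacity_gb=10)
app_data.write(25)   # succeeds: the volume grew to 40 GB behind the scenes
print(app_data.capacity_gb, "GB provisioned,", app_data.used_gb, "GB used")
```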
The reality, however, is that storage volumes cannot migrate freely between devices, and growing or shrinking them is far less flexible than we would like. When storage volumes are constrained by reliability issues, technical limitations or performance, the result for users is inefficiency: these fixed pools of resources never realize their full potential.
In addition, another major problem in traditional storage environments is waste: many storage vendors estimate that up to 50% of the resources in user environments are under-utilized. This may benefit the storage vendor, but for users it means wasted power, cooling and management.
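A back-of-the-envelope calculation shows why that stranded capacity matters. Only the roughly 50% utilization figure comes from the text above; the deployed capacity, power draw, PUE and electricity price below are assumptions chosen for illustration:

```python
# Rough illustration of the cost of under-utilized storage.
# Only the ~50% utilization estimate comes from the article; the deployed
# capacity, power per TB, PUE and electricity price are assumed values.

deployed_tb = 500                 # assumed raw capacity deployed
utilization = 0.50                # vendors' estimate cited above
watts_per_tb = 10                 # assumed spinning-disk power per TB
pue = 1.8                         # assumed data-centre power usage effectiveness
price_per_kwh = 0.12              # assumed electricity price (USD)

stranded_tb = deployed_tb * (1 - utilization)
stranded_kw = stranded_tb * watts_per_tb * pue / 1000
annual_cost = stranded_kw * 24 * 365 * price_per_kwh

print(f"{stranded_tb:.0f} TB idle, {stranded_kw:.2f} kW of power and cooling,")
print(f"~${annual_cost:,.0f} per year spent on capacity doing no work")
```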
These inherent bottlenecks leave traditional storage stretched even thinner in the face of the big data challenge.