Behind Big Data Solutions: Open Architecture Is the Future

Source: Internet
Author: User
Keywords: big data, servers, scalability, solutions

How fast is the tide of big data rising? IDC estimated that the world produced 0.18 ZB of data in 2006 (1 ZB = 10^21 bytes, or one billion TB), and this year the figure has grown by an order of magnitude to 1.8 ZB, equivalent to more than 100 GB of hard drive space for almost everyone in the world. The growth is still accelerating, and the total is expected to reach nearly 8 ZB by 2015. For now, big data processing faces three bottlenecks: large capacity, multiple formats, and speed. The corresponding solutions are scalability, openness, and next-generation storage technology.
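The per-person figure above is simple arithmetic, and a short sketch makes it easy to check. The world population used here (7 billion) is an assumption that does not come from the article, and the function names are purely illustrative.

```python
ZB = 10**21  # bytes in one zettabyte (decimal SI prefix)
GB = 10**9   # bytes in one gigabyte

# Assumption: a world population of roughly 7 billion (not stated in the article).
ASSUMED_WORLD_POPULATION = 7_000_000_000


def per_capita_gb(total_zb, population=ASSUMED_WORLD_POPULATION):
    """Average amount of data per person, in GB, for a worldwide total in ZB."""
    return total_zb * ZB / population / GB


def growth_factor(old_zb, new_zb):
    """How many times larger the new total is than the old one."""
    return new_zb / old_zb


if __name__ == "__main__":
    print(f"{per_capita_gb(1.8):.0f} GB per person")
    print(f"{growth_factor(0.18, 1.8):.1f}x growth since 2006")
```

Under that population assumption, 1.8 ZB works out to about 257 GB per person, which is consistent with the "more than 100 GB" claim.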

Capacity: high scalability

Data volumes are growing from the TB level to the PB and even EB level. More and more business data generated by people and machines poses growing challenges for IT systems, and storing and securing that data, as well as accessing and using it in the future, have become difficult.

So what should the future system architecture look like? Traditional system architectures, whether the earlier monolithic design or the current modular one, are based on scale-up design. Under this model the storage system will inevitably hit a performance bottleneck and reach a performance inflection point. In addition, user data keeps growing rapidly in today's information environment, so users' demands for functionality and scalability are increasingly strong. The physical components and logical constraints of a traditional storage architecture all have limits (such as the number of disks, number of servers, cache size, and number of controllers), which means the scale-up architecture is severely limited.

Therefore, in the face of big data, a highly scalable scale-out architecture is a necessity. More and more enterprises are beginning to adopt open architectures, combining scale-out storage with virtual machines on x86 to consolidate servers.
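The difference between the two models can be illustrated with a toy capacity calculation. This is only a sketch: the slot limit, disk size, and node size below are hypothetical numbers chosen for illustration, not figures for any real product.

```python
def scale_up_capacity_tb(disks_wanted, disk_tb, max_disks):
    """Capacity of a single scale-up array.

    The array cannot grow past its fixed disk-slot limit, so any disks
    requested beyond max_disks add nothing.
    """
    return min(disks_wanted, max_disks) * disk_tb


def scale_out_nodes_needed(target_tb, disks_per_node, disk_tb):
    """Number of identical nodes needed to reach target_tb in a scale-out cluster."""
    node_tb = disks_per_node * disk_tb
    return -(-target_tb // node_tb)  # ceiling division on integers


if __name__ == "__main__":
    # Hypothetical array: 2 TB disks, at most 2400 slots.
    print(scale_up_capacity_tb(disks_wanted=5000, disk_tb=2, max_disks=2400))
    # Hypothetical cluster: nodes with 24 disks of 2 TB each, 10 PB target.
    print(scale_out_nodes_needed(target_tb=10_000, disks_per_node=24, disk_tb=2))
```

In this model the scale-up array stops at its slot limit no matter how many disks are requested, while the scale-out cluster reaches any target by adding nodes. Real clusters also have limits (network, metadata, management), which this sketch ignores.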

EMC has already migrated the core software of its traditional high-end Symmetrix DMX series, which had been in operation for years, to an open hardware platform, switching CPUs from PowerPC to Intel x86 and launching a new generation of scale-out high-end storage system, the Symmetrix V-Max. HDS has likewise migrated its traditional high-end storage system, the USP V, to an open hardware platform, turning it into the VSP storage system. This also shows that the scale-out architecture will play an increasingly important role in future storage systems.

Multiple formats: openness

Big data includes data in more and more formats, and these formats require different processing methods. They range from simple emails, data logs, and credit card records to instrument-collected scientific research data, medical data, financial data, and rich media data (including photos, music, and videos).

For the system architecture, different kinds of data require different software to process them, and if the system is locked in to a single vendor, later expansion becomes very difficult.
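One way to picture this kind of openness is a processing layer where a handler for each format can be plugged in independently. The sketch below is an illustration only: the registry, the `register` decorator, and the `process` function are hypothetical names, and only the Python standard library `json` and `csv` modules are used.

```python
import csv
import io
import json

# Maps a format name to the function that knows how to parse it.
HANDLERS = {}


def register(fmt):
    """Decorator that plugs a handler for one data format into the registry."""
    def wrap(fn):
        HANDLERS[fmt] = fn
        return fn
    return wrap


@register("json")
def parse_json(text):
    return json.loads(text)


@register("csv")
def parse_csv(text):
    # Each row becomes a dict keyed by the header line.
    return list(csv.DictReader(io.StringIO(text)))


def process(fmt, text):
    """Dispatch the raw text to whichever handler is registered for fmt."""
    try:
        handler = HANDLERS[fmt]
    except KeyError:
        raise ValueError(f"no handler registered for format {fmt!r}") from None
    return handler(text)


if __name__ == "__main__":
    print(process("json", '{"sensor": "t1", "value": 21.5}'))
    print(process("csv", "id,amount\n1,9.5\n"))
```

Supporting a new format means registering one more handler, without touching the dispatch code or depending on a single supplier for every parser.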

In fact, in both servers and storage, products built on traditional RISC architectures do not scale well, whereas x86 servers, clustered NAS, and clustered storage products are highly scalable and can meet a private cloud's need for elastically expandable capacity. For example, the Vblock products jointly launched by EMC, VMware, and Cisco can help enterprise users meet their elasticity requirements. Letting users add or remove IT resources on demand is therefore an important hallmark of a flexibly architected private cloud environment.

The advantages of open architecture become even clearer when it comes to building an ecosystem. Architects have reached a consensus that the future big data processing architecture will combine the open x86 architecture with a variety of open source software components. Thanks to the openness of the x86 platform and its huge, mature software ecosystem, Intel-based x86 servers have more platform advantages and potential than any previous platform. This is why open source software such as Hadoop, MongoDB, Redis, and Xen is so popular with system architects.

Speed: next-generation storage technology

Speed here mainly refers to how fast data travels from the endpoint to the processor and to storage. As enterprises use more and more virtualization in their big data architectures, computing density rises significantly and the system's I/O burden becomes heavier. SSDs are a new way to solve this problem.

In fact, servers equipped with SSDs are no surprise. The server I/O acceleration technology being developed by Intel, EMC, NetApp, and others pushes the tiered storage architecture further toward the server side: the storage device's cache is placed in the server and made a manageable part of the storage device. This brings the storage device's cache closer to the processor's computing cores and improves overall energy efficiency.
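The idea of a cache tier sitting in the server, in front of a slower storage device, can be sketched as a small read-through cache with least-recently-used eviction. This is a generic illustration, not the design used by Intel, EMC, or NetApp; the class name, the dict standing in for the storage array, and the LRU policy are all assumptions.

```python
from collections import OrderedDict


class ServerSideCache:
    """Read-through LRU cache in front of a slower backing store."""

    def __init__(self, backend, capacity):
        self.backend = backend      # dict-like slow tier (stands in for the array)
        self.capacity = capacity    # number of blocks the fast tier can hold
        self.cache = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            # Hit: serve from the fast tier and mark the block as recently used.
            self.cache.move_to_end(block_id)
            self.hits += 1
            return self.cache[block_id]
        # Miss: fetch from the slow tier and keep a copy near the CPU.
        self.misses += 1
        data = self.backend[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data


if __name__ == "__main__":
    array = {i: f"block-{i}" for i in range(4)}
    cache = ServerSideCache(array, capacity=2)
    for block in (0, 1, 0, 2, 1):
        cache.read(block)
    print(cache.hits, cache.misses)
```

Every hit avoids a round trip to the storage device, which is the effect the tiered architecture is after. A production cache would also need write handling and coherence with the array, which this sketch leaves out.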

Of course, the limited number of erase cycles has long been the weak point of SSDs. However, a number of technical implementations now address this problem. The Intel® HET series, for example, combines improvements to the NAND flash itself with unique solid-state drive NAND management techniques to extend the write endurance of MLC-based solid-state drives. Intel-developed firmware, controllers, and high-cycle NAND are designed to cope with heavy data processing and write loads in 24/7 data centers or in scientific, financial, and other high-intensity usage patterns. Intel's firmware enhancements include optimized error avoidance techniques, algorithms that reduce write amplification, and system-level error management that goes beyond the error checking and correction (ECC) standards common in the industry.
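Why reducing write amplification matters for endurance can be shown with a commonly used back-of-the-envelope estimate. The formula is a simplification (it assumes perfectly even wear leveling), and every number in the example is hypothetical rather than a specification of any Intel drive.

```python
def write_amplification(nand_bytes_written, host_bytes_written):
    """Ratio of data physically written to NAND to data the host asked to write."""
    return nand_bytes_written / host_bytes_written


def lifetime_years(capacity_gb, pe_cycles, waf, host_gb_per_day):
    """Rough drive lifetime, assuming perfectly even wear across all cells.

    capacity_gb * pe_cycles is the total NAND write budget; dividing by the
    write amplification factor (waf) gives the host writes it can absorb.
    """
    total_host_gb = capacity_gb * pe_cycles / waf
    return total_host_gb / host_gb_per_day / 365


if __name__ == "__main__":
    # Hypothetical drive: 200 GB, 3000 program/erase cycles, 500 GB written per day.
    for waf in (3.0, 1.5):
        years = lifetime_years(200, 3000, waf, 500)
        print(f"WAF {waf}: about {years:.2f} years")
```

In this simplified model, halving the write amplification factor doubles the estimated lifetime, which is why firmware that reduces write amplification directly extends write endurance.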

(Responsible editor: Lu Guang)
