Storage is the root of big data

Source: Internet
Author: User
Keywords: big data, Intel

With the widespread adoption of the Internet of Things, social media, and BYOD, data is growing explosively. This challenges not only storage performance and capacity: capturing critical, valuable information in the moment requires fast data retrieval and analysis, and active data archiving requires simpler, more cost-effective storage solutions. "In the foreseeable future, storage is one of the biggest infrastructure expenditures in the big data and analytics sectors," said US market research firm IDC.

Data is the central concern for Hadoop. Broadly, there are three ways to handle processing and storage. The first is real-time analysis tools, which work on operational data and answer the question of what exactly you have. The second is data manipulation: processing many different kinds of data to produce a result, which is where Hadoop overtook other tools in its early days. The third is processing data so that it can be referenced later, for example during training or to build visualizations that help put the data to use.

Storage and networking are also important guarantees of Hadoop cluster performance. In a Hadoop cluster, the move from Gigabit Ethernet to 10 Gigabit Ethernet (10GbE) is key to importing and replicating large datasets across multiple servers. The Intel® Ethernet 10 Gigabit Converged Network Adapter provides high-throughput connectivity, while Intel SATA solid-state drives provide a high-performance, high-throughput option for raw storage. For efficiency, storage often also needs to support advanced capabilities such as compression, encryption, automatic data tiering, data deduplication, erasure coding, and thin provisioning, capabilities that existing Intel Xeon processors already support.
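To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch of how long a replicated import might take on a 1 Gb/s link versus a 10 Gb/s link. The 10 TB dataset size and the function name `import_seconds` are assumptions chosen for illustration. The replication factor of 3 matches the HDFS default. The model is deliberately idealized: it assumes every replica's bytes cross a single bottleneck link running at full line rate, which real clusters will not match exactly.

```python
def import_seconds(dataset_tb: float, link_gbps: float, replication: int = 3) -> float:
    """Idealized time to move a dataset plus its replicas over one link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits/s) and
    no protocol overhead, disk limits, or parallel network paths.
    """
    total_bits = dataset_tb * 1e12 * 8 * replication
    return total_bits / (link_gbps * 1e9)


# Hypothetical 10 TB import, compared on Gigabit Ethernet and 10GbE.
for name, gbps in (("1GbE", 1.0), ("10GbE", 10.0)):
    hours = import_seconds(10, gbps) / 3600
    print(f"{name}: {hours:.1f} h")
```

Under these assumptions the tenfold increase in link speed cuts the import time by the same factor. In practice the gain depends on whether disks, CPUs, or the switch fabric become the next bottleneck.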

As a large number of IT vendors have joined in, commercial versions of Hadoop are multiplying. Many vendors are rolling out their own Hadoop distributions and assembling basic stacks from other Hadoop projects that integrate with data warehouses, databases, and other data management products.

The goal is to make Hadoop the cornerstone of the next generation of data analysis platforms. The Intel® Hadoop distribution Free Edition v2.2 provides a powerful, easy-to-use, entry-level big data platform for end users and application providers. The free and enterprise versions share the same core code, and the free version contains all the core enhancements, but it limits the number of nodes and the storage capacity of the system.

Intel's big data Hadoop distribution has four distinguishing features. First, it is optimized for stability and ease of use. Second, it is specially optimized for Intel's platform, which gives it a performance and efficiency advantage there. Third, its algorithms and structure have been adjusted for real-time operation, so that it can process data in real time. Fourth, Intel works with Chinese users to make specific adjustments and optimizations for industry applications.

The most important of these is security optimization for running a variety of different workloads. In an application environment, Hadoop is often deployed as a separate cluster. Such a cluster may not be easy to manage and may not be very efficient, but it runs independently. Bringing clusters together instead presents the data as a single shared pool. We have seen a lot of data placed in the cloud, where, for example, workloads may share resources on the same infrastructure. Shared storage can offer some advantages for big data, but it is not exactly what every problem needs. Both resources and storage can be shared: you can share storage through SAN or NAS, you can review the work the cluster is doing, and sharing can also help you keep improving a virtualized architecture.
