How big data will affect the fate of the data center

Big data requires a large amount of computational resources to store, organize, and report on data. This emerging field has changed the way data center servers and other infrastructure are chosen and deployed.

To compete and succeed in today's business environment, companies have to base business decisions on multidimensional analysis of their existing data. Analyzing this ballooning volume of data has become an increasingly important trend and opportunity.

The SearchDataCenter Advisory Board discusses the business models affected by big data and how it changes the operation of enterprise data centers, and offers insight into the opportunities it creates for new data centers.

Just add a SAN

Sander van Vugt, independent trainer and consultant

Big data is not really a serious problem. I mean, data centers don't suddenly change the way they handle massive amounts of data just because big data has arrived.

My view is quite simple: just add another storage area network (SAN), since SANs are more scalable now than they used to be. This means companies can learn to handle data on two tiers of storage network: one for the critical data they actively use, and one for data that still needs to be kept but is less important.

Business applications bring more and more big data opportunities

Clive Longbottom, founder and service director of IT research and analysis company Quocirca

We are still at the starting line for real enterprise-class big data, and the road ahead is long.

Data centers now use storage virtualization to organize federated data sources. Business intelligence (BI) vendors such as Pentaho, Logi, QlikTech and Birst provide more advanced big data processing solutions. The Java-based programming framework Hadoop is used by more advanced enterprises as a non-persistent filter to handle multiple data types. NoSQL databases, such as MongoDB and Couchbase, are effective tools for dealing with unstructured data. Management tools such as Splunk can help with tasks like managing data files across servers.

Each of these tools needs its own supporting infrastructure, which must be carefully designed to achieve the desired results. Analytics service providers are emerging that offer BI and cloud computing capabilities, and many organizations will eventually move in this direction to avoid the complexity of mixed environments. IBM, Teradata, EMC and other vendors offer hybrid appliances to meet business requirements, allowing users to keep all of their data online and absorb additional information from external sources. This hybrid appliance architecture handles unstructured data in a more engineered way than current big data setups, but it is also costly.

Select servers, storage, and architecture

Stephen J. Bigelow, senior technology editor

Choose data analysis tools, such as Hadoop and MapReduce software, that can distribute tasks to thousands of nodes (processors) for computation and then collect the results.
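The distribute-then-collect pattern described above can be sketched in a few lines of Python. This is only an illustration, not Hadoop's API: the function names (`map_count`, `merge_counts`, `word_count`) are made up for the example, and worker threads stand in for cluster nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(chunk):
    """Map step: count the words in one chunk of input."""
    return Counter(chunk.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(chunks, workers=4):
    # Each worker thread stands in for one node of a cluster (an assumption
    # made to keep the sketch runnable on a single machine).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    # Collect the partial results into a single answer.
    return reduce(merge_counts, partials, Counter())


chunks = [
    "big data needs big servers",
    "servers need memory",
    "data centers store data",
]
totals = word_count(chunks)
print(totals["data"], totals["servers"], totals["big"])
```

Because each chunk is counted independently, adding more workers (or, in a real cluster, more nodes) does not change the result, only how the work is divided.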

The highly parallel tasks this software runs are fundamentally different from traditional single-threaded execution, which means big data servers need the most computing power available. It follows that these servers should also have the most processor cores, as with Intel's Xeon E7-8800 v2 processors, which have up to 15 cores and support Hyper-Threading. Data centers can address big data processing needs by buying these servers.

Reduced instruction set computing (RISC) processors are another option for big data servers: they can provide a large number of processor cores while producing much less heat than traditional x86 processors. Dell, for example, has developed a server based on Calxeda ARM chips to support enterprise applications.

Big data is mainly a computing task, but more processors need additional memory to process and store results, so a server's total memory can be very large, reaching hundreds of gigabytes or more. For example, HP's ConvergedSystem Vertica analytics platform has 128 GB of RAM, and IBM's System x Reference Architecture for Hadoop calls for servers with 384 GB of memory.

Big data servers can also integrate graphics processing units (GPUs), such as Nvidia's Tesla K40, because GPUs are designed to handle complex mathematical computation. The K40 can reach 1.4 TFLOPS in double-precision floating-point computation (one teraflops equals one trillion, or 10^12, floating-point operations per second). A large volume of mathematical computation can be offloaded from multiple processors onto a single GPU without adding system memory.
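To give the teraflops figure some scale, the sketch below estimates how long a dense matrix multiplication would take at the quoted peak rate. It assumes the standard count of 2n^3 floating-point operations for an n x n multiply and treats 1.4 TFLOPS as a theoretical ceiling; real workloads usually achieve less. The function names are illustrative only.

```python
TERA = 10**12
# Peak double-precision rate quoted in the text for the Tesla K40.
K40_PEAK_FLOPS = 1.4 * TERA


def matmul_operations(n):
    """Floating-point operations in a dense n x n matrix multiply (2 * n^3)."""
    return 2 * n**3


def seconds_at_peak(operations, peak_flops=K40_PEAK_FLOPS):
    """Lower bound on run time if the device sustained its peak rate."""
    return operations / peak_flops


ops = matmul_operations(10_000)
print(f"{ops:.1e} operations, at least {seconds_at_peak(ops):.2f} s at peak")
```

The same two functions can be reused with a different `peak_flops` value to compare a GPU against a CPU whose peak rate is known.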

Any evaluation of a big data platform must also consider the surrounding infrastructure, such as network and storage. Multi-port network adapters can help distribute workloads between servers. Upgrading from Gigabit Ethernet to 10 Gigabit Ethernet enables higher utilization in big data environments. You must also have enough switch ports (Gigabit or 10 Gigabit Ethernet) to meet the connectivity requirements of all server ports. In addition, IT architects can consider connecting the ports on each server to different switches to build a more resilient and available environment. Data centers may need to budget for newer models of network switches.
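The port arithmetic above can be checked with a small planning helper. This is a rough sketch under simple assumptions: every server has the same number of ports, each server spreads its ports as evenly as possible across the switches, and the figures in the example (40 servers, 48-port switches) are invented for illustration.

```python
import math


def plan_ports(servers, ports_per_server, switches, ports_per_switch):
    """Estimate switch port demand when server ports are split across switches."""
    total = servers * ports_per_server
    # Worst case: every server places its extra port on the same switch.
    busiest = servers * math.ceil(ports_per_server / switches)
    return {
        "total_server_ports": total,
        "ports_needed_on_busiest_switch": busiest,
        "fits": busiest <= ports_per_switch,
        # A server stays reachable after one switch fails only if its
        # ports land on at least two different switches.
        "survives_one_switch_failure": switches >= 2 and ports_per_server >= 2,
    }


result = plan_ports(servers=40, ports_per_server=2, switches=2, ports_per_switch=48)
print(result)
```

Uplink ports, management ports and spare capacity are left out of this estimate and would need to be added in a real design.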

Hadoop and other big data applications typically improve performance by using local storage and independent processors rather than shared storage. You can minimize disk latency by spreading disk tasks across many disks so they run separately. You can also consider replacing traditional mechanical hard drives with solid-state drives, or with even faster PCIe-based solid-state accelerator cards, to improve performance.
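As a simplified picture of why spreading work across disks reduces waiting, the sketch below assigns data blocks to disks in round-robin order and compares the longest per-disk queue with the single-disk case. It is a toy model, not how Hadoop places blocks; `spread_blocks` is a hypothetical helper.

```python
def spread_blocks(block_ids, disk_count):
    """Assign blocks to disks in round-robin order."""
    disks = [[] for _ in range(disk_count)]
    for position, block in enumerate(block_ids):
        disks[position % disk_count].append(block)
    return disks


blocks = list(range(10))
layout = spread_blocks(blocks, 4)
# With one disk, all 10 requests queue behind each other; with four disks
# the longest queue is much shorter because the disks work independently.
longest_queue = max(len(disk) for disk in layout)
print(layout, longest_queue)
```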
