Hadoop cluster hardware standard configuration

When selecting hardware, we often need to weigh application performance against cost, and find a workable balance between meeting actual needs and staying economically feasible. The following uses a Hadoop cluster as an example to describe the factors to consider.

 

1. I/O-bound workloads

 

Hadoop involves the following I/O-heavy operations (a back-of-envelope estimate follows the list):

  • Indexing
  • Grouping
  • Data import and export
  • Data movement and transformation

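To see why such operations are I/O-bound, it helps to estimate how long a full scan of a dataset takes: the limit is aggregate disk bandwidth, not the CPU. Below is a minimal sketch, assuming roughly 100 MB/s of sequential read throughput per SATA disk (a typical figure, not one given in this article):

```python
# Back-of-envelope estimate of how long an I/O-bound full scan takes.
# Assumed figure: ~100 MB/s sequential read per SATA disk (typical,
# not taken from this article).

DISK_THROUGHPUT_MB_S = 100  # assumed per-disk sequential read speed

def scan_time_seconds(data_tb: float, disks_per_node: int, nodes: int) -> float:
    """Time for the cluster to read data_tb terabytes once, end to end."""
    total_mb = data_tb * 1024 * 1024
    aggregate_mb_s = DISK_THROUGHPUT_MB_S * disks_per_node * nodes
    return total_mb / aggregate_mb_s

# Example: 100 TB scanned by 20 nodes with 12 data disks each.
secs = scan_time_seconds(data_tb=100, disks_per_node=12, nodes=20)
print(f"Full scan: ~{secs / 60:.0f} minutes")
```

Under these assumptions, adding disks or nodes shortens the scan almost linearly, which is why the DataNode specification later in this article emphasizes disk count.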
 

2. CPU-bound workloads

 

In Hadoop, some jobs keep the CPU continuously busy rather than waiting on I/O.

Several examples are listed below (with a rough estimate after the list):

 

  • Clustering/classification
  • Complex text mining
  • Natural language processing
  • Feature extraction

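For CPU-bound jobs like these, running time scales with the per-record processing cost and the number of available cores rather than with disk bandwidth. Here is a rough sketch under assumed values; the ~5 ms-per-document NLP cost is purely illustrative:

```python
# Rough estimate of a CPU-bound job's running time: total CPU work
# divided by the cores available. The per-record cost below is an
# illustrative assumption, not a measured value.

def cpu_bound_time_seconds(records: int, ms_per_record: float,
                           cores_per_node: int, nodes: int) -> float:
    total_cpu_seconds = records * ms_per_record / 1000.0
    return total_cpu_seconds / (cores_per_node * nodes)

# Example: 1 billion documents, ~5 ms of NLP work each,
# 20 nodes with 2 octo-core CPUs (16 cores) per node.
secs = cpu_bound_time_seconds(records=1_000_000_000, ms_per_record=5,
                              cores_per_node=16, nodes=20)
print(f"CPU-bound job: ~{secs / 3600:.1f} hours")
```

This is why the specifications below call for two multi-core CPUs per node: for these workloads, core count and clock speed set the job's running time.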
 

Now, let's look at the standard Hadoop configurations themselves. Enterprise applications and private cloud deployments alike can have such configuration requirements. The following lists the standard specification for DataNode/TaskTracker (MR1) nodes; a usable-capacity estimate follows the list:

 

  • 12-24 hard disks of 1-4 TB each, in a JBOD (just a bunch of disks) configuration
  • 2 quad-/hex-/octo-core CPUs, running at least 2-2.5 GHz
  • 64-512 GB of RAM
  • Bonded Gigabit Ethernet (the greater the storage density, the higher the required network throughput)
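To relate this specification to usable HDFS space, a common back-of-envelope calculation divides raw capacity by the replication factor and reserves room for intermediate data. The 3x replication and ~25% reserve below are conventional assumptions, not figures from this article:

```python
# Usable HDFS capacity per DataNode, given the spec above.
# Assumptions: HDFS default replication factor of 3, and ~25% of raw
# space reserved for MapReduce intermediate output and OS overhead.

REPLICATION = 3
NON_HDFS_RESERVE = 0.25

def usable_tb_per_node(disks: int, tb_per_disk: float) -> float:
    raw = disks * tb_per_disk
    return raw * (1 - NON_HDFS_RESERVE) / REPLICATION

# Low and high ends of the DataNode spec: 12 x 1 TB vs. 24 x 4 TB.
print(f"Low end : ~{usable_tb_per_node(12, 1):.1f} TB usable per node")
print(f"High end: ~{usable_tb_per_node(24, 4):.1f} TB usable per node")
```

Under these assumptions, each DataNode contributes roughly 3-24 TB of usable space, far less than its raw capacity, which is why the disk counts above look so generous.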

 

Standard configuration specification for the NameNode/JobTracker (MR1) in a Hadoop cluster (a heap-sizing sketch follows the list):

 

  • 3-6 hard disks of 1 TB each, in a JBOD (just a bunch of disks) configuration
  • 2 quad-/hex-/octo-core CPUs, running at least 2-2.5 GHz
  • 64-512 GB of RAM
  • Bonded Gigabit Ethernet (the greater the storage density, the higher the required network throughput)
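The large RAM figure for the NameNode follows from the fact that it holds the entire namespace (every file and block) in memory. A commonly cited rule of thumb, treated here as an assumption rather than an exact requirement, is on the order of 1 GB of heap per million namespace objects:

```python
# Rule-of-thumb NameNode heap estimate. The NameNode keeps every file
# and block in memory; a commonly cited guideline is roughly 1 GB of
# heap per million namespace objects (files + blocks). Treat this as
# a sizing assumption, not an exact requirement.

GB_PER_MILLION_OBJECTS = 1.0

def namenode_heap_gb(files_millions: float, blocks_millions: float) -> float:
    return (files_millions + blocks_millions) * GB_PER_MILLION_OBJECTS

# Example: 50 million files averaging ~1.5 blocks each.
print(f"Suggested heap: ~{namenode_heap_gb(50, 75):.0f} GB")
```

An estimate like this lands within the 64-512 GB RAM range listed above for clusters holding tens of millions of files.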

 

In practice, hardware configuration is tailored to the business workload, so the parameters vary slightly from case to case. What the configurations above make clear is that data storage requires relatively large disk capacity. Actual business needs should be the deciding factor: a smaller workload can make do with less capacity and a somewhat weaker configuration.

The preceding parameters are for reference only.
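As one way to turn business requirements into concrete numbers, the following sketch estimates how many DataNodes of a given specification are needed to hold a target dataset, reusing the same assumed 3x replication and ~25% reserve as in the earlier estimate; the 12 x 2 TB node profile is illustrative:

```python
# Turning a business requirement into a node count: how many DataNodes
# of a given spec are needed to store a target dataset. Uses the same
# assumed replication factor and non-HDFS reserve as before.

import math

REPLICATION = 3
NON_HDFS_RESERVE = 0.25

def nodes_needed(data_tb: float, disks: int, tb_per_disk: float) -> int:
    usable_per_node = disks * tb_per_disk * (1 - NON_HDFS_RESERVE) / REPLICATION
    return math.ceil(data_tb / usable_per_node)

# Example: 200 TB of data on nodes with 12 x 2 TB disks.
print(f"DataNodes needed: {nodes_needed(200, 12, 2)}")
```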

