CDH4 Installation and Deployment Series 3: Server Planning

Source: Internet
Author: User

1 NameNode planning description:

The NameNode is critical: if its data is lost or it stops working, the entire cluster becomes unusable. For that reason each NameNode is deployed on its own dedicated server. ZKFC monitors NameNode status, so ZKFC must be installed on every NameNode host.

 

2 JournalNode planning description:

Because a JournalNode consumes few system resources, it can share a host with other services. At least three JournalNodes are required, and you can run more. An even number works, but an odd number is better: edits must be written to a majority of the JournalNodes, so a group of N nodes tolerates at most (N - 1) / 2 failures, rounded down, and HDFS becomes unavailable once that limit is exceeded. A fourth node therefore adds no fault tolerance over three. Typical sizes are 3, 5, 7, and 9; the larger the group, the more failures it can survive.
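The majority rule is easy to check with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original article; the function name tolerated_failures is made up for illustration.

```python
def tolerated_failures(nodes: int) -> int:
    """Return how many nodes may fail while a strict majority survives."""
    if nodes < 1:
        raise ValueError("a quorum group needs at least one node")
    return (nodes - 1) // 2


# Compare odd and even group sizes.
for n in range(3, 8):
    print(f"{n} nodes tolerate {tolerated_failures(n)} failure(s)")
```

Sizes 3 and 4 both tolerate one failure, and sizes 5 and 6 both tolerate two, which is why the extra node in an even-sized group buys nothing.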

 

3 DataNode planning description:

Apart from the NameNode and JournalNode hosts, a DataNode is installed on every other node in the cluster. Because DataNodes are responsible for data storage and for serving reads and writes, adding DataNodes increases the cluster's storage capacity and aggregate throughput.

 

4 YARN (ResourceManager + NodeManager + MapReduce) planning description:

ResourceManager:

Each cluster has one ResourceManager, which is responsible for job and resource scheduling. It receives jobs submitted by clients, starts the scheduling process based on the job's context information and the status information collected from the NodeManagers, and allocates a container to run the job's ApplicationMaster.

The ResourceManager therefore carries a heavy workload and needs a large amount of system resources. We recommend deploying it on a dedicated server.

 

NodeManager and MapReduce:

A NodeManager maintains container status and sends heartbeats to the ResourceManager. Each slave node runs one NodeManager, which monitors and manages resource usage on that node. As in MRv1, each slave node runs map and/or reduce tasks while a job executes. For each job (application), an ApplicationMaster, itself running on one of the slave nodes, manages the application lifecycle, requests resources from the ResourceManager, and monitors task status (for example, restarting failed tasks).

Therefore, every DataNode host also runs a NodeManager and the MapReduce tasks.
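The placement rules from sections 1 to 4 can be written down as a small checker. This is only a sketch under assumptions: the host names (nn1, util1, worker1, and so on), the role labels, and the function check_layout are hypothetical, and placing ZooKeeper on the JournalNode hosts is just one possible choice.

```python
from typing import Dict, List, Set


def check_layout(layout: Dict[str, Set[str]]) -> List[str]:
    """Return a list of rule violations; an empty list means the plan passes."""

    def hosts_with(role: str) -> Set[str]:
        return {host for host, roles in layout.items() if role in roles}

    problems: List[str] = []
    namenodes = hosts_with("NameNode")
    journalnodes = hosts_with("JournalNode")
    datanodes = hosts_with("DataNode")

    # ZKFC must run next to every NameNode.
    for host in sorted(namenodes - hosts_with("ZKFC")):
        problems.append(f"{host}: NameNode without ZKFC")

    # JournalNode and ZooKeeper each need at least three nodes.
    for role in ("JournalNode", "ZooKeeper"):
        if len(hosts_with(role)) < 3:
            problems.append(f"{role}: fewer than 3 nodes")

    # One ResourceManager per cluster.
    if len(hosts_with("ResourceManager")) != 1:
        problems.append("ResourceManager: expected exactly 1 node")

    # DataNodes stay off NameNode and JournalNode hosts ...
    for host in sorted(datanodes & (namenodes | journalnodes)):
        problems.append(f"{host}: DataNode shares a host with NameNode or JournalNode")

    # ... and every DataNode host also runs a NodeManager.
    for host in sorted(datanodes - hosts_with("NodeManager")):
        problems.append(f"{host}: DataNode without NodeManager")

    return problems


# Hypothetical eight-host plan.
layout = {
    "nn1": {"NameNode", "ZKFC"},
    "nn2": {"NameNode", "ZKFC"},
    "rm1": {"ResourceManager"},
    "util1": {"JournalNode", "ZooKeeper"},
    "util2": {"JournalNode", "ZooKeeper"},
    "util3": {"JournalNode", "ZooKeeper"},
    "worker1": {"DataNode", "NodeManager"},
    "worker2": {"DataNode", "NodeManager"},
}
print(check_layout(layout))  # prints []
```

Removing ZKFC from nn2 in this plan makes the checker report "nn2: NameNode without ZKFC".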

 

5 ZooKeeper planning description:

Because a ZooKeeper cluster needs few resources, we generally recommend deploying ZooKeeper nodes on the same machines as other services. At least three ZooKeeper nodes are required, and you can run more. An even number works, but an odd number is better: the ZooKeeper cluster stays available only while a majority of its nodes are up, so an even-sized cluster tolerates no more failures than the odd-sized cluster one node smaller. Typical sizes are 3, 5, 7, and 9; the larger the cluster, the more failures it can survive.
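Each ZooKeeper node lists all members of the cluster in its zoo.cfg as server.N=host:peerPort:electionPort entries, and each node's myid file holds its own N. The sketch below generates those entries; the host names and the function zk_server_lines are hypothetical, and ports 2888 and 3888 are the conventional values, assumed here rather than taken from the original article.

```python
from typing import List


def zk_server_lines(hosts: List[str], peer_port: int = 2888,
                    election_port: int = 3888) -> List[str]:
    """Build the server.N entries of zoo.cfg, numbering hosts from 1."""
    return [
        f"server.{index}={host}:{peer_port}:{election_port}"
        for index, host in enumerate(hosts, start=1)
    ]


# Hypothetical three-node cluster.
for line in zk_server_lines(["util1", "util2", "util3"]):
    print(line)
```

The same list of lines goes into zoo.cfg on every node; only the myid file differs per host.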


This article is from the "Swallow swallow test column" blog; please keep this source link: http://bobbleyan.blog.51cto.com/9111528/1553520
