Hadoop Tutorial (12): Adding and Removing HDFS Nodes, and Balancing the Cluster


Adding and removing HDFS nodes, and running the HDFS balancer

Method 1: Add a DataNode statically (with the NameNode stopped)

1. Stop the NameNode.

2. Add the new node's hostname or IP to the slaves file, and copy the updated file to every node.

3. Start the NameNode.

4. Run the Hadoop balancer. (This balances the cluster; it is not required if you are only adding a node.)
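The static procedure can be sketched as a shell session. This is a sketch, not the article's exact commands: the `$HADOOP_HOME` paths, the `stop-dfs.sh`/`start-dfs.sh` scripts, and the hostname `new-datanode-host` assume a classic Hadoop 1.x layout and are illustrative.

```shell
# Method 1: static addition (classic Hadoop 1.x layout assumed)

# 1. Stop HDFS (stops the NameNode and all DataNodes)
sh $HADOOP_HOME/bin/stop-dfs.sh

# 2. Record the new node in the slaves file and push the file to every node
echo "new-datanode-host" >> $HADOOP_HOME/conf/slaves
for host in $(cat $HADOOP_HOME/conf/slaves); do
    scp $HADOOP_HOME/conf/slaves "$host":$HADOOP_HOME/conf/slaves
done

# 3. Start HDFS again; the new DataNode is started along with the cluster
sh $HADOOP_HOME/bin/start-dfs.sh

# 4. Optionally rebalance the cluster
sh $HADOOP_HOME/bin/start-balancer.sh
```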


Method 2: Add a DataNode dynamically (with the NameNode running)

1. Add the new node's hostname or IP to the slaves file, and copy the updated file to every node.

2. On the new node, start the DataNode daemon. Command: sh hadoop-daemon.sh start datanode

3. Check that the node was added through the web interface, or with the command: sh hadoop dfsadmin -report

4. Run the Hadoop balancer. (This balances the cluster; it is not required if you are only adding a node.)
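The dynamic procedure differs only in step 2: the DataNode daemon is started on the new node itself, without touching the NameNode. Again a sketch; paths and the hostname are illustrative assumptions.

```shell
# Method 2: dynamic addition, NameNode keeps running

# 1. Record the new node in the slaves file (and push it to every node)
echo "new-datanode-host" >> $HADOOP_HOME/conf/slaves

# 2. On the new node itself, start only the DataNode daemon
sh $HADOOP_HOME/bin/hadoop-daemon.sh start datanode

# 3. Confirm the node registered with the NameNode
sh $HADOOP_HOME/bin/hadoop dfsadmin -report

# 4. Optionally rebalance the cluster
sh $HADOOP_HOME/bin/start-balancer.sh
```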


For step 4, start-balancer.sh accepts a -threshold parameter.

The -threshold parameter specifies the threshold at which the cluster is considered balanced.

The default -threshold is 10; the value it is compared against is each DataNode's actual HDFS usage divided by the cluster's total HDFS capacity.


For example, suppose a DataNode uses 1.2 GB of HDFS storage,

and the cluster's total HDFS capacity is 10 TB, i.e. 10,000 GB.

The t value is then 1.2 / 10000 = 0.00012.

If the balancer is run with a -t parameter smaller than 0.00012, the cluster is treated as balanced.

The command is: start-balancer.sh -threshold 0.0001
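The arithmetic in this example can be checked directly in the shell (the numbers are the ones from the text):

```shell
# node usage 1.2 GB, cluster capacity 10 TB = 10000 GB
awk 'BEGIN { printf "%.5f\n", 1.2 / 10000 }'
# prints 0.00012
```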


1. The balancer can be started on either the NameNode or a DataNode.

The balancer can be stopped at any time.

The default balancer bandwidth is 1 MB/s.
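The 1 MB/s default can be raised for a faster rebalance. A sketch, assuming a Hadoop version whose dfsadmin supports the -setBalancerBandwidth subcommand (on older versions the equivalent is the dfs.balance.bandwidthPerSec property in hdfs-site.xml):

```shell
# Raise the per-DataNode balancer bandwidth to 10 MB/s.
# The argument is in bytes per second: 10 * 1024 * 1024 = 10485760.
sh hadoop dfsadmin -setBalancerBandwidth 10485760
```

The new value takes effect without restarting the DataNodes, but it is not persisted; the configured property applies again after a restart.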

2. The slaves file is only used at restart: starting and stopping the whole cluster reads the slaves file.

When a DataNode starts, it registers itself with the NameNode, as long as the NameNode's location is configured in hdfs-site.xml.

The new node then shows up in the NameNode's HTTP management interface.



Deleting an HDFS Node

Method 1: Mark the node dead (run on the NameNode):

1. sh hadoop dfsadmin -refreshServiceAcl

Note: the dead method modifies neither the slaves file nor hdfs-site.xml.

So when the cluster restarts, the node will rejoin the NameNode's management.

This is executed on the NameNode; the result can be verified from the other nodes. The command resets the node's state to dead.


Method 2: Decommission the node:

a) Modify hdfs-site.xml, adding the node to be removed to the exclude list.

b) Run sh hadoop dfsadmin -refreshNodes to force a refresh.

c) Check the node's state; it will show as decommissioned.

Note: the decommission method modifies hdfs-site.xml but does not modify the slaves file.

So when the cluster restarts, the node is still started as a DataNode, but because it is in the exclude list, the NameNode puts it into the decommissioned state again.

At that point the NameNode sends no HDFS traffic to the node; the exclude list effectively acts as a firewall.
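Steps a)–c) can be sketched as follows. The excludes file path and hostname are illustrative; the sketch assumes hdfs-site.xml already points dfs.hosts.exclude at that file:

```shell
# Decommission sketch. Assumes hdfs-site.xml contains something like:
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/usr/local/hadoop/conf/excludes</value>
#   </property>

# a) On the NameNode, add the node to the excludes file
echo "old-datanode-host" >> /usr/local/hadoop/conf/excludes

# b) Force the NameNode to re-read its host lists
sh hadoop dfsadmin -refreshNodes

# c) Watch the node's state move to "Decommission in progress",
#    then "Decommissioned", before shutting the machine down
sh hadoop dfsadmin -report
```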


1. If you stop the DataNode daemon on a single node, that node's information will still appear in the NameNode's statistics.

At that point the machine can be removed with either the dead or the decommission (retirement) method.
