Dynamically Adding and Removing Hadoop Nodes (DataNode and TaskTracker)


In general, the correct approach is to edit the configuration files first, and then start or stop the corresponding process on the specific machine.

Some guides on the web additionally recommend using host names rather than IP addresses in the configuration files.

The procedures for adding/removing DataNodes and TaskTrackers are very similar; they differ only in the configuration items and commands involved.


1. DataNode

1.0 Configuration files

Edit the configuration file conf/hdfs-site.xml on the master/NameNode. The key parameters are dfs.hosts and dfs.hosts.exclude.
Note: the layout of the configuration files is not consistent across Hadoop versions!

See the Cluster Setup section of the official Hadoop documentation for the details of your version: go to http://hadoop.apache.org/docs/ and click on the same or a similar version number.

The statements above apply to Hadoop 1.x, and the examples below follow that version. In Hadoop 0.x the same configuration lived in the file conf/hadoop-site.xml. Hadoop 2.x changed things considerably: the file is conf/hdfs-site.xml and the parameters are dfs.namenode.hosts and dfs.namenode.hosts.exclude.
Roles: dfs.hosts is the list of hosts allowed to connect as DataNodes; if it is not configured, or the file it points to is empty, every host is allowed to become a DataNode. dfs.hosts.exclude is the list of hosts denied as DataNodes. A host that appears in both lists is rejected.

Their essential role is to deny DataNode connections from certain nodes, not to orchestrate starting and stopping the DataNode process on those nodes.

Usage example: edit conf/hdfs-site.xml and add:

<property>
  <name>dfs.hosts</name>
  <value>/opt/hadoop/conf/datanode-allow.list</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/opt/hadoop/conf/datanode-deny.list</value>
</property>
If you do not need the allow list (or the deny list), simply do not create the corresponding property.

Then create the file(s) named by each value, writing one host name per line.
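As a minimal sketch of that last step, the following creates the two list files referenced by dfs.hosts and dfs.hosts.exclude (the path /tmp/hadoop-conf and the hostnames slave1..slave3 are illustrative stand-ins, not values from the article):

```shell
CONF=/tmp/hadoop-conf                      # stand-in for /opt/hadoop/conf
mkdir -p "$CONF"
# one host name per line in the allow list
printf '%s\n' slave1 slave2 slave3 > "$CONF/datanode-allow.list"
# start with an empty deny list
: > "$CONF/datanode-deny.list"
cat "$CONF/datanode-allow.list"
```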



1.1 Join

1. Configure Hadoop on the new slave.

2. Add the new slave to the slaves list on the master (not required, but convenient when restarting the whole cluster later).

3. If an allow list is used, add the new slave to datanode-allow.list.

4. Start the DataNode process on the slave:

hadoop-daemon.sh start datanode

PS: You can use the jps command to view the PID and name of each Java process on the machine.
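Steps 2-4 above can be sketched as follows (the hostname slave4 and the /tmp/hadoop-conf path are illustrative; the cluster commands themselves are shown as comments because they must run on the actual machines):

```shell
CONF=/tmp/hadoop-conf                      # stand-in for /opt/hadoop/conf
NEW_SLAVE=slave4                           # hypothetical new host
mkdir -p "$CONF"
echo "$NEW_SLAVE" >> "$CONF/slaves"                 # step 2: slaves list on master
echo "$NEW_SLAVE" >> "$CONF/datanode-allow.list"    # step 3: allow list, if used
# step 4, on the new slave:
#   hadoop-daemon.sh start datanode
# then verify with jps, which prints the PID and name of each Java process
grep "$NEW_SLAVE" "$CONF/slaves"
```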


1.2 Delete

It is strongly discouraged to shut down a DataNode directly on the slave with the hadoop-daemon.sh stop datanode command: doing so causes missing blocks to appear in HDFS.



1. Edit datanode-deny.list on the master and add the corresponding host.
2. Refresh the node configuration on the master: hadoop dfsadmin -refreshNodes
At this point the web UI immediately shows the node in the decommissioning state; after a while it becomes dead. The same information is available from the hadoop dfsadmin -report command.
3. Stop the DataNode process on the slave (not required): hadoop-daemon.sh stop datanode
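The decommission steps can be sketched as follows (the host slave4 and the /tmp path are illustrative; the commands that need the live cluster appear as comments):

```shell
DENY=/tmp/hadoop-conf/datanode-deny.list   # stand-in for /opt/hadoop/conf/...
mkdir -p "$(dirname "$DENY")"
echo slave4 >> "$DENY"                     # step 1: deny list on the master
# step 2, on the master:
#   hadoop dfsadmin -refreshNodes
# watch the node go decommissioning -> dead in the web UI, or via:
#   hadoop dfsadmin -report
# step 3 (optional), on the slave:
#   hadoop-daemon.sh stop datanode
grep slave4 "$DENY"
```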

1.2.1 Re-adding a deleted node

1. Remove the corresponding host from datanode-deny.list on the master.
2. Refresh the node configuration on the master: hadoop dfsadmin -refreshNodes
3. Restart the DataNode process on the slave: hadoop-daemon.sh start datanode

PS: If the DataNode process on the slave was never stopped, you need to stop it and start it again.
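A sketch of re-adding a node: drop it from the deny list, refresh, restart (the hostnames, the path, and the starting state of the file are all fabricated for illustration):

```shell
DENY=/tmp/hadoop-conf/datanode-deny.list
mkdir -p "$(dirname "$DENY")"
printf 'slave4\nslave5\n' > "$DENY"        # example starting state of the deny list
# step 1: remove the host being re-added
grep -v '^slave4$' "$DENY" > "$DENY.tmp" && mv "$DENY.tmp" "$DENY"
# step 2, on the master:   hadoop dfsadmin -refreshNodes
# step 3, on the slave:    hadoop-daemon.sh start datanode
#   (stop it first if the process was never shut down during decommissioning)
cat "$DENY"
```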
2. TaskTracker

2.0 Configuration files

In Hadoop 1.x, edit the configuration file conf/mapred-site.xml on the master/NameNode.

The key parameters are mapred.hosts and mapred.hosts.exclude.


In Hadoop 0.x the file to change is conf/hadoop-site.xml; the situation in Hadoop 2.x is unclear and is not covered here.

Roles: the same as the corresponding DataNode parameters.

Usage example: edit conf/mapred-site.xml and add:

<property>
  <name>mapred.hosts</name>
  <value>/opt/hadoop/conf/tasktracker-allow.list</value>
</property>
<property>
  <name>mapred.hosts.exclude</name>
  <value>/opt/hadoop/conf/tasktracker-deny.list</value>
</property>
If you do not need the allow list, do not create the corresponding property.

Then create the file(s) named by each value, writing one host name per line.
2.1 Join

1. Configure Hadoop on the new slave.

2. Add the new slave to the slaves list on the master (not required, but convenient when restarting the whole cluster later).

3. If an allow list is used, add the new slave to tasktracker-allow.list.

4. Start the TaskTracker process on the slave:

hadoop-daemon.sh start tasktracker

PS: You can use the jps command to view the PID and name of each Java process on the machine.


2.2 Delete

It is not recommended to shut down a TaskTracker directly on the slave with the hadoop-daemon.sh stop tasktracker command: the JobTracker will treat those machines as only temporarily lost, and within the timeout window (by default 10 min + 30 s) it will still schedule tasks to them as if they were healthy.

1. Edit tasktracker-deny.list on the master and add the corresponding host.
2. Refresh the node configuration on the master: hadoop mradmin -refreshNodes
At this point the web UI immediately shows the node count reduced and the excluded-node count increased; you can click through for details.

3. Stop the TaskTracker process on the slave (not required): hadoop-daemon.sh stop tasktracker
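The same pattern applies as for DataNodes, but with the list named by mapred.hosts.exclude and mradmin instead of dfsadmin (host and path are illustrative; cluster commands appear as comments):

```shell
DENY=/tmp/hadoop-conf/tasktracker-deny.list   # stand-in for /opt/hadoop/conf/...
mkdir -p "$(dirname "$DENY")"
echo slave4 >> "$DENY"                     # step 1: deny list on the master
# step 2, on the master:
#   hadoop mradmin -refreshNodes
# step 3 (optional), on the slave:
#   hadoop-daemon.sh stop tasktracker
grep slave4 "$DENY"
```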

2.2.1 Re-adding a deleted node

1. Remove the corresponding host from tasktracker-deny.list on the master.
2. Refresh the node configuration on the master: hadoop mradmin -refreshNodes
3. Restart the TaskTracker process on the slave: hadoop-daemon.sh start tasktracker

PS: If the TaskTracker process on the slave was never stopped, you need to stop it and start it again.


Originally published at http://blog.csdn.net/yanxiangtianji

Please credit the source when reprinting. Copyright notice: this is an original article by the blog author and may not be reproduced without consent.
