Dynamically adding and deleting DataNodes in Hadoop, and restoring a deleted node


1. Configure the system environment

Hostname, SSH mutual trust, environment variables, etc.

This article omits the JDK installation; just make sure the JDK installation path on the new datanode matches JAVA_HOME in etc/hadoop/hadoop-env.sh. The Hadoop version used here is 2.7.5.
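
A minimal way to confirm the two match on the new node (the hadoop-env.sh path below assumes the install directory used later in this article):

readlink -f $(which java)
grep JAVA_HOME /usr/hadoop/hadoop-2.7.5/etc/hadoop/hadoop-env.sh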

Modify /etc/sysconfig/network, then run:

hostname <new hostname>

You can log out of the system at this point and log back in for the new name to take effect.

[root@localhost ~]# hostname
localhost.localdomain
[root@localhost ~]# hostname -i
::1 127.0.0.1
[root@localhost ~]#
[root@localhost ~]# cat /etc/sysconfig/network
# Created by anaconda
NETWORKING=yes
HOSTNAME=slave2
GATEWAY=192.168.48.2
# oracle-rdbms-server-11gR2-preinstall : Add NOZEROCONF = yes
NOZEROCONF=yes
[root@localhost ~]# hostname slave2
[root@localhost ~]# hostname
slave2
[root@localhost ~]# su - hadoop
Last login: Sat Feb 24 14:25:48 CST 2018 on pts/1
[hadoop@slave2 ~]$ su - root

Create the datanode directory and change its owner.

(For the specific path values, see /usr/hadoop/hadoop-2.7.5/etc/hadoop/hdfs-site.xml and core-site.xml on the namenode: dfs.name.dir, dfs.data.dir, dfs.tmp.dir.)
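
For example, the data path can be looked up on the namenode with a simple grep (a sketch; -A 1 also prints the <value> line that follows each <name>):

grep -A 1 dfs.data.dir /usr/hadoop/hadoop-2.7.5/etc/hadoop/hdfs-site.xml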

su - root

# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data

# chmod -R 777 /usr/local/hadoop-2.7.5/tmp

# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5

[root@slave2 ~]# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data
[root@slave2 ~]# chmod -R 777 /usr/local/hadoop-2.7.5/tmp
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5
[root@slave2 ~]# pwd
/root
[root@slave2 ~]# cd /usr/local/
[root@slave2 local]# ll
total 0
drwxr-xr-x. 2 root 46 Mar 21 2017 bin
drwxr-xr-x. 2 root 6 Jun 10 2014 etc
drwxr-xr-x. 2 root 6 Jun 10 2014 games
drwxr-xr-x 3 hadoop 16 Feb 24 hadoop-2.7.5
drwxr-xr-x. 2 root 6 Jun 10 2014 include
drwxr-xr-x. 2 root 6 Jun 10 2014 lib
drwxr-xr-x. 2 root 6 Jun 10 2014 lib64
drwxr-xr-x. 2 root 6 Jun 10 2014 libexec
drwxr-xr-x. 2 root 6 Jun 10 2014 sbin
drwxr-xr-x. 5 root 46 Dec 17 2015 share
drwxr-xr-x. 2 root 6 Jun 10 2014 src
[root@slave2 local]#

SSH mutual trust: allow the master to log in to slave2 without a password.

Master:

[root@hadoop-master ~]# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost.localdomain localhost6 localhost6.localdomain6
192.168.48.129 hadoop-master
192.168.48.132 slave1
192.168.48.131 slave2

[hadoop@hadoop-master ~]$ scp /usr/hadoop/.ssh/authorized_keys hadoop@slave2:/usr/hadoop/.ssh
The authenticity of host 'slave2 (192.168.48.131)' can't be established.
ECDSA key fingerprint is 1e:cd:d1:3d:b0:5b:62:45:a3:63:df:c7:7a:0f:b8:7c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.48.131' (ECDSA) to the list of known hosts.
hadoop@slave2's password:
authorized_keys
[hadoop@hadoop-master ~]$ ssh hadoop@slave2
Last login: Sat Feb 24 18:27:33 2018
[hadoop@slave2 ~]$
[hadoop@slave2 ~]$ exit
logout
Connection to slave2 closed.
[hadoop@hadoop-master ~]$
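
If the hadoop user on the master does not yet have a key pair and an authorized_keys file to copy, they can be created first (a sketch assuming OpenSSH defaults and an empty passphrase):

[hadoop@hadoop-master ~]$ ssh-keygen -t rsa
[hadoop@hadoop-master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop-master ~]$ chmod 600 ~/.ssh/authorized_keys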

2. Modify the slaves file on the namenode node and add the new node.

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ vi slaves
slave1
slave2

3. From the namenode node, copy the hadoop-2.7.5 directory to the new node, then delete the files in the data and logs directories on the new node.

Master

[hadoop@hadoop-master ~]$ scp -r hadoop-2.7.5 hadoop@slave2:/usr/hadoop

Slave2

[hadoop@slave2 hadoop-2.7.5]$ ll
total 124
drwxr-xr-x 2 hadoop 4096 Feb 24 bin
drwxr-xr-x 3 hadoop 19 Feb 24 14:30 etc
drwxr-xr-x 2 hadoop 101 Feb 24 include
drwxr-xr-x 3 hadoop 19 Feb 24 14:29 lib
drwxr-xr-x 2 hadoop 4096 Feb 24 libexec
-rw-r--r-- 1 hadoop 86424 Feb 24 LICENSE.txt
drwxrwxr-x 2 hadoop 4096 Feb 24 logs
-rw-r--r-- 1 hadoop 14978 Feb 24 NOTICE.txt
-rw-r--r-- 1 hadoop 1366 Feb 24 README.txt
drwxr-xr-x 2 hadoop 4096 Feb 24 sbin
drwxr-xr-x 4 hadoop 29 Feb 24 share
[hadoop@slave2 hadoop-2.7.5]$ pwd
/usr/hadoop/hadoop-2.7.5
[hadoop@slave2 hadoop-2.7.5]$ rm -r logs/*
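
This step also calls for clearing the data directory on the new node; assuming the dfs.data.dir path created earlier on slave2, that would look like:

[hadoop@slave2 hadoop-2.7.5]$ rm -rf /usr/local/hadoop-2.7.5/tmp/dfs/data/*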

4. Start the datanode and nodemanager processes on the new node.

First, make sure that the host being added does not appear in the etc/hadoop/excludes file on the namenode or on the existing datanodes.
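
A quick check on the namenode (a sketch assuming the excludes path configured later in this article; the command should print nothing, or complain that the file does not exist yet):

[hadoop@hadoop-master hadoop]$ grep slave2 /usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes

Then start the daemons on slave2: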

[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop/hadoop-2.7.5/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /usr/hadoop/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-slave2.out
[hadoop@slave2 hadoop-2.7.5]$
[hadoop@slave2 hadoop-2.7.5]$ jps
3897 DataNode
6772 NodeManager
8189 Jps
[hadoop@slave2 ~]$

5. Refresh the nodes on the NameNode

[hadoop@hadoop-master ~]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@hadoop-master ~]$ sbin/start-balancer.sh

6. View the current cluster status on the namenode and confirm that the new node has been added.

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 58663657472 (54.63 GB)
Present Capacity: 15487176704 (14.42 GB)
DFS Remaining: 15486873600 (14.42 GB)
DFS Used: 303104 (296 KB)
DFS Used %: 0.00%
Under replicated blocks: 5
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.48.131:50010 (slave2)
Hostname: 183.221.250.11
Decommission Status: Normal
Configured Capacity: 38588669952 (35.94 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 36887191552 (34.35 GB)
DFS Remaining: 1701470208 (1.58 GB)
DFS Used %: 0.00%
DFS Remaining %: 4.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used %: 100.00%
Cache Remaining %: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:33 PST 2018

Name: 192.168.48.132:50010 (slave1)
Hostname: slave1
Decommission Status: Normal
Configured Capacity: 20074987520 (18.70 GB)
DFS Used: 294912 (288 KB)
Non DFS Used: 6289289216 (5.86 GB)
DFS Remaining: 13785403392 (12.84 GB)
DFS Used %: 0.00%
DFS Remaining %: 68.67%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used %: 100.00%
Cache Remaining %: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:35 PST 2018

[hadoop@hadoop-master hadoop]$

7. Dynamically delete a datanode

7.1 Configure hdfs-site.xml on the NameNode: reduce dfs.replication appropriately and add the dfs.hosts.exclude property.

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ cat hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/usr/local/hadoop-2.7.5/tmp/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop-2.7.5/tmp/dfs/data</value>
</property>
<property>
<name>dfs.hosts.exclude</name>
<value>/usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes</value>
</property>
</configuration>

7.2 Create an excludes file in the corresponding path on the namenode (etc/hadoop) and write in the IP address or hostname of the DataNode to be deleted.

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ vi excludes
####slave2
192.168.48.131
[hadoop@hadoop-master hadoop]$

7.3 Refresh all DataNodes from the NameNode

hdfs dfsadmin -refreshNodes
sbin/start-balancer.sh

7.4 Check the current cluster status on the namenode and confirm that the node has been removed; slave2 should no longer appear in the result.

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report

Alternatively, you can watch the DataNode gradually become Dead on the web UI (ip:50070):

http://192.168.48.129:50070/

In the datanode list, once the Admin State has changed from "In Service" to "Decommissioned", the decommission is complete.
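
The same information is visible from the command line; a sketch that filters the report output for slave2's Decommission Status:

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report | grep -A 2 slave2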

7.5 Stop the processes on the removed node

[hadoop@slave2 hadoop-2.7.5]$ jps
9530 Jps
3897 DataNode
6772 NodeManager
[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh stop datanode
stopping datanode
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
[hadoop@slave2 hadoop-2.7.5]$ jps
9657 Jps
[hadoop@slave2 hadoop-2.7.5]$

8. Restore a deleted node

Remove the node's entry from the excludes file created in step 7.2, then repeat steps 4, 5, and 6; a sketch of the sequence follows.

