Now that namenode and datanode1 are up and running, the next task is to add the node datanode2.
Step 1: Modify the hostname of the node to be added
hadoop@datanode1:~$ vim /etc/hostname
datanode2
Step 2: Modify the hosts file
hadoop@datanode1:~$ vim /etc/hosts
127.0.0.1 localhost
127.0.1.1 ubuntu
192.168.8.2 namenode
192.168.8.3 datanode1
192.168.8.4 datanode2 (added)
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
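After editing /etc/hosts, it is worth confirming that each cluster hostname actually resolves before going further. A small check like the following can be used (a sketch: the hostname list is this cluster's, and `getent`, standard on glibc-based Linux, is assumed available):

```shell
# Return success if the given hostname resolves via /etc/hosts or DNS.
resolves() {
  getent hosts "$1" > /dev/null
}

# Check every node name this cluster's /etc/hosts should cover.
for host in namenode datanode1 datanode2; do
  if resolves "$host"; then
    echo "$host resolves"
  else
    echo "$host does NOT resolve"
  fi
done
```

A name that prints "does NOT resolve" here would later make ssh and the Hadoop start scripts fail with confusing errors, so this catches typos early.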
Step 3: Modify the IP address
Step 4: Restart the machine
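Steps 3 and 4 are not shown in detail above. On the Ubuntu releases contemporary with Hadoop 1.2.1, a static address was typically configured in /etc/network/interfaces; a sketch is below, in which the interface name eth0, the netmask, and the gateway address 192.168.8.1 are all assumptions to adapt to your network:

```
# /etc/network/interfaces on the new node (interface name assumed to be eth0)
auto eth0
iface eth0 inet static
    address 192.168.8.4
    netmask 255.255.255.0
    gateway 192.168.8.1
```

After saving, reboot the machine (or restart networking) so that both the new hostname and the new address take effect.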
Step 5: Configure passwordless ssh
1. Generate a key
hadoop@datanode2:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
34:45:84:85:6e:f3:9e:7a:c0:f1:a4:ef:bf:30:a6:74 hadoop@datanode2
The key's randomart image is:
+--[ RSA 2048]----+
|        * =      |
|         o .     |
|        . o      |
|       . =  ...  |
|        o S B    |
|         + o     |
|        . + E .  |
|       . + = o   |
|      o + . o .  |
+-----------------+
2. Copy the public key to namenode
hadoop@datanode2:~$ cd ~/.ssh
hadoop@datanode2:~/.ssh$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
hadoop@datanode2:~/.ssh$ scp ./id_rsa.pub hadoop@namenode:/home/hadoop
hadoop@namenode's password:
id_rsa.pub 100% 398 0.4KB/s
3. Append the public key to authorized_keys
hadoop@namenode:~/.ssh$ cat ../id_rsa.pub >> authorized_keys
hadoop@namenode:~/.ssh$ cat authorized_keys
ssh-rsa ...(key elided)... hadoop@ubuntu3
ssh-rsa ...(key elided)... hadoop@ubuntu2
ssh-rsa ...(key elided)... hadoop@datanode2
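Note the append redirection here: `cat ../id_rsa.pub >> authorized_keys` preserves the keys already collected from the other nodes, whereas a single `>` would overwrite them and break the existing passwordless logins. A minimal local illustration of the difference, using throwaway files rather than real keys:

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
echo "key-from-namenode"  >  "$tmp/authorized_keys"   # existing content
echo "key-from-datanode2" >> "$tmp/authorized_keys"   # append: both keys kept
wc -l < "$tmp/authorized_keys"
echo "key-from-datanode2" >  "$tmp/authorized_keys"   # overwrite: previous key lost
wc -l < "$tmp/authorized_keys"
rm -r "$tmp"
```

The first count shows two lines (both keys), the second only one, which is exactly the failure mode to avoid when collecting keys from several nodes.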
4. Distribute authorized_keys to each node
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode1:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode2:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s
5. A possible error
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
Permissions 0644 for '/home/jiangqixiang/.ssh/id_dsa' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /home/youraccount/.ssh/id_dsa
Solution: restrict the permissions of the key file named in the message, for example:
chmod 700 id_rsa
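The fix works because ssh refuses to use any private key that is readable by group or others; removing those permission bits (700 as above, or the more conventional 600) satisfies the check. The before/after can be seen with `stat` (a sketch: the GNU coreutils `stat -c` form is assumed, and a temporary file stands in for a real key):

```shell
# Create a stand-in "private key" with the bad 0644 permissions from the error.
tmp=$(mktemp -d)
keyfile="$tmp/id_rsa"
touch "$keyfile"
chmod 644 "$keyfile"
stat -c '%a' "$keyfile"   # 644: group/other can read, so ssh rejects the key

# Restrict it to the owner, as in the solution above.
chmod 600 "$keyfile"
stat -c '%a' "$keyfile"   # 600: owner read/write only
rm -r "$tmp"
```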
Step 6: Modify the namenode configuration file
hadoop@namenode:~$ cd hadoop-1.2.1/conf
hadoop@namenode:~/hadoop-1.2.1/conf$ vim slaves
datanode1
datanode2
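start-all.sh simply iterates over this slaves file, one hostname per line, and ssh-es to each entry, so a typo here silently leaves a node out of the cluster. A quick sanity pass over the file (skipping blank lines and comments, much as Hadoop's bin/slaves.sh does) can catch mistakes before startup; sketched here against a temporary copy rather than the real conf/slaves:

```shell
# Build a stand-in slaves file like the one edited above.
tmp=$(mktemp)
printf 'datanode1\ndatanode2\n' > "$tmp"

# Print each slave entry, ignoring blank lines and '#' comment lines.
grep -Ev '^[[:space:]]*(#|$)' "$tmp" | while read -r slave; do
  echo "slave: $slave"
done
rm "$tmp"
```

Each printed name can then be cross-checked against /etc/hosts and tested with `ssh <name> true`.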
Step 7: Rebalance the cluster
hadoop@namenode:~/hadoop-1.2.1/conf$ start-balancer.sh
Warning: $HADOOP_HOME is deprecated.
Starting balancer, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-balancer-namenode.out
Notes from other blogs:
1) If you do not run the balancer, the cluster tends to store new data on the new (emptier) node, which skews block placement and reduces MapReduce efficiency.
2) threshold is the balance threshold; the default is 10%. The lower the value, the more evenly balanced the nodes, but the longer balancing takes.
/app/hadoop/bin/start-balancer.sh -threshold 0.1
3) The balancing bandwidth can be raised in the namenode's hdfs-site.xml (the default is 1 MB/s):
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
  <description>Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose, in terms of the number of bytes per second.</description>
</property>
Step 8: Test validity
1. Start Hadoop
hadoop@namenode:~/hadoop-1.2.1$ start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-namenode.out
datanode2: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode2.out
datanode1: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode1.out
namenode: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-namenode.out
starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-namenode.out
datanode2: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode2.out
datanode1: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode1.out
hadoop@namenode:~/hadoop-1.2.1$
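A convenient way to confirm the startup actually succeeded is to run `jps` on each machine and compare the listed daemons against what is expected there (NameNode, SecondaryNameNode and JobTracker on the master; DataNode and TaskTracker on each slave). A small checker, sketched here against a captured sample rather than a live `jps` call (the PIDs are illustrative):

```shell
# Succeed only if every expected daemon name appears in the jps listing.
check_daemons() {
  listing=$1; shift
  for daemon in "$@"; do
    echo "$listing" | grep -q "$daemon" || { echo "missing: $daemon"; return 1; }
  done
  echo "all expected daemons running"
}

# Sample output as jps would print it on a datanode.
sample='2101 DataNode
2345 TaskTracker
2580 Jps'

check_daemons "$sample" DataNode TaskTracker
```

In practice the sample string would be replaced with `$(ssh <node> jps)` for each node in the slaves file.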
2. An error
The wordcount example program fails:
hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out
Warning: $HADOOP_HOME is deprecated.
14/09/12 08:40:39 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode
at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)
at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode
at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)
at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)
at org.apache.hadoop.mapred.JobClient.getStagingAreaDir(JobClient.java:1309)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:102)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Solution:
hadoop@namenode:~/hadoop-1.2.1$ hadoop dfsadmin -safemode leave
Warning: $HADOOP_HOME is deprecated.
Safe mode is OFF
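Rather than forcing safe mode off immediately, one can first check the state with `hadoop dfsadmin -safemode get`, or block until HDFS exits it on its own with `hadoop dfsadmin -safemode wait`; safe mode normally ends by itself once enough datanodes have reported their blocks. When scripting around this, the status is just a line of text, so a tiny helper suffices (shown on canned strings, since it only inspects the command's output):

```shell
# Succeed when the dfsadmin status line reports that safe mode is off.
safemode_off() {
  echo "$1" | grep -q 'Safe mode is OFF'
}

safemode_off "Safe mode is OFF" && echo "ready to submit jobs"
safemode_off "Safe mode is ON"  || echo "still in safe mode"
```

In real use the argument would be `$(hadoop dfsadmin -safemode get)`, polled in a loop before submitting jobs.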
3. Test again
hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out
Warning: $HADOOP_HOME is deprecated.
14/09/12 08:48:26 INFO input.FileInputFormat: Total input paths to process : 2
14/09/12 08:48:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/09/12 08:48:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/09/12 08:48:28 INFO mapred.JobClient: Running job: job_201409120827_0003
14/09/12 08:48:29 INFO mapred.JobClient:  map 0% reduce 0%
14/09/12 08:48:47 INFO mapred.JobClient:  map 50% reduce 0%
14/09/12 08:48:48 INFO mapred.JobClient:  map 100% reduce 0%
14/09/12 08:48:57 INFO mapred.JobClient:  map 100% reduce 33%
14/09/12 08:48:59 INFO mapred.JobClient:  map 100% reduce 100%
14/09/12 08:49:02 INFO mapred.JobClient: Job complete: job_201409120827_0003
14/09/12 08:49:02 INFO mapred.JobClient: Counters: 30
14/09/12 08:49:02 INFO mapred.JobClient:   Job Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Launched reduce tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=27285
14/09/12 08:49:02 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/09/12 08:49:02 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/09/12 08:49:02 INFO mapred.JobClient:     Rack-local map tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     Launched map tasks=2
14/09/12 08:49:02 INFO mapred.JobClient:     Data-local map tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=12080
14/09/12 08:49:02 INFO mapred.JobClient:   File Output Format Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Bytes Written=48
14/09/12 08:49:02 INFO mapred.JobClient:   FileSystemCounters
14/09/12 08:49:02 INFO mapred.JobClient:     FILE_BYTES_READ=104
14/09/12 08:49:02 INFO mapred.JobClient:     HDFS_BYTES_READ=265
14/09/12 08:49:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=177680
14/09/12 08:49:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=48
14/09/12 08:49:02 INFO mapred.JobClient:   File Input Format Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Bytes Read=45
14/09/12 08:49:02 INFO mapred.JobClient:   Map-Reduce Framework
14/09/12 08:49:02 INFO mapred.JobClient:     Map output materialized bytes=110
14/09/12 08:49:02 INFO mapred.JobClient:     Map input records=2
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce shuffle bytes=110
14/09/12 08:49:02 INFO mapred.JobClient:     Spilled Records=18
14/09/12 08:49:02 INFO mapred.JobClient:     Map output bytes=80
14/09/12 08:49:02 INFO mapred.JobClient:     Total committed heap usage (bytes)=248127488
14/09/12 08:49:02 INFO mapred.JobClient:     CPU time spent (ms)=8560
14/09/12 08:49:02 INFO mapred.JobClient:     Combine input records=9
14/09/12 08:49:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=220
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce input records=9
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce input groups=7
14/09/12 08:49:02 INFO mapred.JobClient:     Combine output records=9
14/09/12 08:49:02 INFO mapred.JobClient:     Physical memory (bytes) snapshot=322252800
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce output records=7
14/09/12 08:49:02 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1042149376
14/09/12 08:49:02 INFO mapred.JobClient:     Map output records=9
hadoop@namenode:~/hadoop-1.2.1$ hadoop fs -cat out/*
Warning: $HADOOP_HOME is deprecated.
heheh 1
hello 2
it's 1
ll 1
the 2
think 1
why 1
cat: File does not exist: /user/hadoop/out/_logs
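The final `cat` error is harmless: `out/*` expands to every child of the output directory, including the `_logs` directory that Hadoop 1.x writes alongside the results, and `fs -cat` cannot cat a directory. Restricting the glob to the actual result files avoids the message; the pattern is illustrated here on a local stand-in for the job output directory:

```shell
# Recreate the shape of a wordcount output directory locally.
tmp=$(mktemp -d)
mkdir -p "$tmp/out/_logs"
printf 'hello 2\nwhy 1\n' > "$tmp/out/part-r-00000"

# part-* matches only the reducer output files, never _logs,
# so the equivalent `hadoop fs -cat out/part-*` raises no error.
cat "$tmp"/out/part-*
rm -r "$tmp"
```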