Now that namenode and datanode1 are up and running, the next task is to add the node datanode2.
Step 1: Modify the hostname of the node to be added
hadoop@datanode1:~$ vim /etc/hostname
datanode2
Step 2: Modify the hosts file
hadoop@datanode1:~$ vim /etc/hosts
127.0.0.1 localhost
127.0.1.1 ubuntu
192.168.8.2 namenode
192.168.8.3 datanode1
192.168.8.4 datanode2 (added)
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
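After editing /etc/hosts, it is worth confirming that each cluster hostname actually resolves before going further. A small check like the following can be used (a sketch: the hostname list is this cluster's, and `getent`, standard on glibc-based Linux, is assumed available):

```shell
# Return success if the given hostname resolves via /etc/hosts or DNS.
resolves() {
  getent hosts "$1" > /dev/null
}

# Check every node name this cluster's /etc/hosts should cover.
for host in namenode datanode1 datanode2; do
  if resolves "$host"; then
    echo "$host resolves"
  else
    echo "$host does NOT resolve"
  fi
done
```

A name that prints "does NOT resolve" here would later make ssh and the Hadoop start scripts fail with confusing errors, so this catches typos early.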
Step 3: Modify the IP address
Step 4: Restart the machine
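Steps 3 and 4 are not shown in detail above. On the Ubuntu releases contemporary with Hadoop 1.2.1, a static address was typically configured in /etc/network/interfaces; a sketch is below, in which the interface name eth0, the netmask, and the gateway address 192.168.8.1 are all assumptions to adapt to your network:

```
# /etc/network/interfaces on the new node (interface name assumed to be eth0)
auto eth0
iface eth0 inet static
    address 192.168.8.4
    netmask 255.255.255.0
    gateway 192.168.8.1
```

After saving, reboot the machine (or restart networking) so that both the new hostname and the new address take effect.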
Step 5: Configure passwordless ssh
1. Generate a key
hadoop@datanode2:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
34:45:84:85:6e:f3:9e:7a:c0:f1:a4:ef:bf:30:a6:74 hadoop@datanode2
The key's randomart image is:
+--[ RSA 2048]----+
|        * =      |
|         o .     |
|        . o      |
|       . =  ...  |
|        o S B    |
|         + o     |
|        . + E .  |
|       . + = o   |
|      o + . o .  |
+-----------------+
2. Copy the public key to namenode
hadoop@datanode2:~$ cd ~/.ssh
hadoop@datanode2:~/.ssh$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
hadoop@datanode2:~/.ssh$ scp ./id_rsa.pub hadoop@namenode:/home/hadoop
hadoop@namenode's password:
id_rsa.pub 100% 398 0.4KB/s
3. Append the public key to authorized_keys
hadoop@namenode:~/.ssh$ cat ../id_rsa.pub >> authorized_keys
hadoop@namenode:~/.ssh$ cat authorized_keys
ssh-rsa ...(key elided)... hadoop@ubuntu3
ssh-rsa ...(key elided)... hadoop@ubuntu2
ssh-rsa ...(key elided)... hadoop@datanode2
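Note the append redirection here: `cat ../id_rsa.pub >> authorized_keys` preserves the keys already collected from the other nodes, whereas a single `>` would overwrite them and break the existing passwordless logins. A minimal local illustration of the difference, using throwaway files rather than real keys:

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
echo "key-from-namenode"  >  "$tmp/authorized_keys"   # existing content
echo "key-from-datanode2" >> "$tmp/authorized_keys"   # append: both keys kept
wc -l < "$tmp/authorized_keys"
echo "key-from-datanode2" >  "$tmp/authorized_keys"   # overwrite: previous key lost
wc -l < "$tmp/authorized_keys"
rm -r "$tmp"
```

The first count shows two lines (both keys), the second only one, which is exactly the failure mode to avoid when collecting keys from several nodes.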
4. Distribute authorized_keys to each node
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode1:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode2:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s
5. A possible error
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
Permissions 0644 for '/home/jiangqixiang/.ssh/id_dsa' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /home/youraccount/.ssh/id_dsa
Solution: restrict the permissions of the key file named in the message, for example:
chmod 700 id_rsa
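The fix works because ssh refuses to use any private key that is readable by group or others; removing those permission bits (700 as above, or the more conventional 600) satisfies the check. The before/after can be seen with `stat` (a sketch: the GNU coreutils `stat -c` form is assumed, and a temporary file stands in for a real key):

```shell
# Create a stand-in "private key" with the bad 0644 permissions from the error.
tmp=$(mktemp -d)
keyfile="$tmp/id_rsa"
touch "$keyfile"
chmod 644 "$keyfile"
stat -c '%a' "$keyfile"   # 644: group/other can read, so ssh rejects the key

# Restrict it to the owner, as in the solution above.
chmod 600 "$keyfile"
stat -c '%a' "$keyfile"   # 600: owner read/write only
rm -r "$tmp"
```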
Step 6: Modify the namenode configuration file
hadoop@namenode:~$ cd hadoop-1.2.1/conf
hadoop@namenode:~/hadoop-1.2.1/conf$ vim slaves
datanode1
datanode2
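start-all.sh simply iterates over this slaves file, one hostname per line, and ssh-es to each entry, so a typo here silently leaves a node out of the cluster. A quick sanity pass over the file (skipping blank lines and comments, much as Hadoop's bin/slaves.sh does) can catch mistakes before startup; sketched here against a temporary copy rather than the real conf/slaves:

```shell
# Build a stand-in slaves file like the one edited above.
tmp=$(mktemp)
printf 'datanode1\ndatanode2\n' > "$tmp"

# Print each slave entry, ignoring blank lines and '#' comment lines.
grep -Ev '^[[:space:]]*(#|$)' "$tmp" | while read -r slave; do
  echo "slave: $slave"
done
rm "$tmp"
```

Each printed name can then be cross-checked against /etc/hosts and tested with `ssh <name> true`.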
Step 7: Rebalance the cluster
hadoop@namenode:~/hadoop-1.2.1/conf$ start-balancer.sh
Warning: $HADOOP_HOME is deprecated.
Starting balancer, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-balancer-namenode.out
Notes from other blogs:
1) If you do not run the balancer, the cluster tends to store new data on the new (emptier) node, which skews block placement and reduces MapReduce efficiency.
2) threshold is the balance threshold; the default is 10%. The lower the value, the more evenly balanced the nodes, but the longer balancing takes.
/app/hadoop/bin/start-balancer.sh -threshold 0.1
3) The balancing bandwidth can be raised in the namenode's hdfs-site.xml (the default is 1 MB/s):
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
  <description>Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose, in terms of the number of bytes per second.</description>
</property>
Step 8: Test validity
1. Start Hadoop
hadoop@namenode:~/hadoop-1.2.1$ start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-namenode.out
datanode2: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode2.out
datanode1: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode1.out
namenode: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-namenode.out
starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-namenode.out
datanode2: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode2.out
datanode1: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode1.out
hadoop@namenode:~/hadoop-1.2.1$
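A convenient way to confirm the startup actually succeeded is to run `jps` on each machine and compare the listed daemons against what is expected there (NameNode, SecondaryNameNode and JobTracker on the master; DataNode and TaskTracker on each slave). A small checker, sketched here against a captured sample rather than a live `jps` call (the PIDs are illustrative):

```shell
# Succeed only if every expected daemon name appears in the jps listing.
check_daemons() {
  listing=$1; shift
  for daemon in "$@"; do
    echo "$listing" | grep -q "$daemon" || { echo "missing: $daemon"; return 1; }
  done
  echo "all expected daemons running"
}

# Sample output as jps would print it on a datanode.
sample='2101 DataNode
2345 TaskTracker
2580 Jps'

check_daemons "$sample" DataNode TaskTracker
```

In practice the sample string would be replaced with `$(ssh <node> jps)` for each node in the slaves file.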
2. An error
The wordcount example program fails:
hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out
Warning: $HADOOP_HOME is deprecated.
14/09/12 08:40:39 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode
at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)
at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode
at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)
at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)
at org.apache.hadoop.mapred.JobClient.getStagingAreaDir(JobClient.java:1309)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:102)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Solution:
hadoop@namenode:~/hadoop-1.2.1$ hadoop dfsadmin -safemode leave
Warning: $HADOOP_HOME is deprecated.
Safe mode is OFF
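Rather than forcing safe mode off immediately, one can first check the state with `hadoop dfsadmin -safemode get`, or block until HDFS exits it on its own with `hadoop dfsadmin -safemode wait`; safe mode normally ends by itself once enough datanodes have reported their blocks. When scripting around this, the status is just a line of text, so a tiny helper suffices (shown on canned strings, since it only inspects the command's output):

```shell
# Succeed when the dfsadmin status line reports that safe mode is off.
safemode_off() {
  echo "$1" | grep -q 'Safe mode is OFF'
}

safemode_off "Safe mode is OFF" && echo "ready to submit jobs"
safemode_off "Safe mode is ON"  || echo "still in safe mode"
```

In real use the argument would be `$(hadoop dfsadmin -safemode get)`, polled in a loop before submitting jobs.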
3. Test again
hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out
Warning: $HADOOP_HOME is deprecated.
14/09/12 08:48:26 INFO input.FileInputFormat: Total input paths to process : 2
14/09/12 08:48:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/09/12 08:48:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/09/12 08:48:28 INFO mapred.JobClient: Running job: job_201409120827_0003
14/09/12 08:48:29 INFO mapred.JobClient:  map 0% reduce 0%
14/09/12 08:48:47 INFO mapred.JobClient:  map 50% reduce 0%
14/09/12 08:48:48 INFO mapred.JobClient:  map 100% reduce 0%
14/09/12 08:48:57 INFO mapred.JobClient:  map 100% reduce 33%
14/09/12 08:48:59 INFO mapred.JobClient:  map 100% reduce 100%
14/09/12 08:49:02 INFO mapred.JobClient: Job complete: job_201409120827_0003
14/09/12 08:49:02 INFO mapred.JobClient: Counters: 30
14/09/12 08:49:02 INFO mapred.JobClient:   Job Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Launched reduce tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=27285
14/09/12 08:49:02 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/09/12 08:49:02 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/09/12 08:49:02 INFO mapred.JobClient:     Rack-local map tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     Launched map tasks=2
14/09/12 08:49:02 INFO mapred.JobClient:     Data-local map tasks=1
14/09/12 08:49:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=12080
14/09/12 08:49:02 INFO mapred.JobClient:   File Output Format Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Bytes Written=48
14/09/12 08:49:02 INFO mapred.JobClient:   FileSystemCounters
14/09/12 08:49:02 INFO mapred.JobClient:     FILE_BYTES_READ=104
14/09/12 08:49:02 INFO mapred.JobClient:     HDFS_BYTES_READ=265
14/09/12 08:49:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=177680
14/09/12 08:49:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=48
14/09/12 08:49:02 INFO mapred.JobClient:   File Input Format Counters
14/09/12 08:49:02 INFO mapred.JobClient:     Bytes Read=45
14/09/12 08:49:02 INFO mapred.JobClient:   Map-Reduce Framework
14/09/12 08:49:02 INFO mapred.JobClient:     Map output materialized bytes=110
14/09/12 08:49:02 INFO mapred.JobClient:     Map input records=2
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce shuffle bytes=110
14/09/12 08:49:02 INFO mapred.JobClient:     Spilled Records=18
14/09/12 08:49:02 INFO mapred.JobClient:     Map output bytes=80
14/09/12 08:49:02 INFO mapred.JobClient:     Total committed heap usage (bytes)=248127488
14/09/12 08:49:02 INFO mapred.JobClient:     CPU time spent (ms)=8560
14/09/12 08:49:02 INFO mapred.JobClient:     Combine input records=9
14/09/12 08:49:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=220
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce input records=9
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce input groups=7
14/09/12 08:49:02 INFO mapred.JobClient:     Combine output records=9
14/09/12 08:49:02 INFO mapred.JobClient:     Physical memory (bytes) snapshot=322252800
14/09/12 08:49:02 INFO mapred.JobClient:     Reduce output records=7
14/09/12 08:49:02 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1042149376
14/09/12 08:49:02 INFO mapred.JobClient:     Map output records=9
hadoop@namenode:~/hadoop-1.2.1$ hadoop fs -cat out/*
Warning: $HADOOP_HOME is deprecated.
heheh 1
hello 2
it's 1
ll 1
the 2
think 1
why 1
cat: File does not exist: /user/hadoop/out/_logs
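The final `cat` error is harmless: `out/*` expands to every child of the output directory, including the `_logs` directory that Hadoop 1.x writes alongside the results, and `fs -cat` cannot cat a directory. Restricting the glob to the actual result files avoids the message; the pattern is illustrated here on a local stand-in for the job output directory:

```shell
# Recreate the shape of a wordcount output directory locally.
tmp=$(mktemp -d)
mkdir -p "$tmp/out/_logs"
printf 'hello 2\nwhy 1\n' > "$tmp/out/part-r-00000"

# part-* matches only the reducer output files, never _logs,
# so the equivalent `hadoop fs -cat out/part-*` raises no error.
cat "$tmp"/out/part-*
rm -r "$tmp"
```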