This guide covers a fully distributed Hadoop deployment, as opposed to pseudo-distributed mode.
I. System and Configuration
Two machines were prepared to build the Hadoop cluster, based on Ubuntu 14.04, JDK 1.6.0_45, and Hadoop 1.0.3; the virtual machines run under VMware 10.0.
192.168.1.10 namenode master (Master)
192.168.1.20 datanode slave1 (Slave)
The user name on both machines is hadoop.
Next, install some common software: vim and ssh.
sudo apt-get update
sudo apt-get install vim
sudo apt-get install ssh
First, set the machine's own IP address, then edit the hosts file with the following command:
sudo vim /etc/hosts
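For the two machines above, the hosts entries could look like this (the hostnames master and slave1 are the ones used in the rest of this guide; adjust them to your own naming):

```
127.0.0.1    localhost
192.168.1.10 master
192.168.1.20 slave1
```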
Next, set up password-less SSH login. Generate a key pair with the following command:
ssh-keygen -t rsa -P ""
Press Enter through the prompts, and two files, id_rsa and id_rsa.pub, will be created in the ~/.ssh directory: these are the SSH private key and public key, respectively.
Then create the authorized_keys file. Running cat id_rsa.pub >> authorized_keys enables password-less SSH login to localhost.
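The steps above can be sketched as the script below. This is a minimal sketch, assuming the user is hadoop and the slave hostname is slave1 as defined in /etc/hosts; the existing-key check is an added safeguard, not part of the original steps.

```shell
# Ensure the .ssh directory exists with correct permissions
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
# Generate an RSA key pair with an empty passphrase (-P ""),
# but only if one does not already exist
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P "" -f "$HOME/.ssh/id_rsa"
# Authorize the public key for password-less login to localhost
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
# The same public key must also be authorized on the slave, e.g. (run once,
# interactively, entering the hadoop user's password):
#   ssh-copy-id hadoop@slave1
```

After this, `ssh localhost` should log in without asking for a password.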
II. Installing Hadoop and the JDK
Unpack the Hadoop tarball. My installation directory is /home/hadoop/hadoop/ (the first hadoop is the user name, the second is a folder).
tar -zxvf hadoop-1.0.3.tar.gz
Unpack the JDK tarball with a similar command; only the file name differs, so it is not listed here. The next step is to modify the environment variables:
sudo vim /etc/profile
Add the following lines (note: no spaces around the "="):
export JAVA_HOME=/home/hadoop/hadoop/jdk1.6.0_45
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/home/hadoop/hadoop/hadoop-1.0.3
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
Don't forget to run source /etc/profile so the new PATH takes effect immediately.
Finally, modify the configuration files in Hadoop's conf folder:
Modify hadoop-env.sh
Modify core-site.xml
Modify hdfs-site.xml
Modify mapred-site.xml
Modify the masters and slaves files: in masters, write only master (the 192.168.1.10 machine mentioned above); in slaves, list both master and slave1.
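The fragments below are a minimal sketch of these files for the two-node layout above. The hostnames master and slave1 come from the hosts file; the port numbers 9000 and 9001 are common defaults for Hadoop 1.x but are assumptions here and should match your own setup.

```
# conf/hadoop-env.sh — point Hadoop at the JDK
export JAVA_HOME=/home/hadoop/hadoop/jdk1.6.0_45

<!-- conf/core-site.xml — address of the HDFS NameNode -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml — replicate blocks to both datanodes -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml — address of the JobTracker -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>

# conf/masters
master

# conf/slaves
master
slave1
```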
Then format the NameNode by running the following command from the hadoop-1.0.3 directory:
bin/hadoop namenode -format
The slave's configuration has not been covered yet because it is especially simple: shut down the current virtual machine, copy its files, rename the copy, boot it, and change the user name and IP. My Ubuntu host name stayed the same; as long as the copies do not share the same disk file, this works.
Finally, on master (the main Ubuntu node), again from the hadoop-1.0.3 directory, run:
bin/start-all.sh
Then run jps to view the Java processes. The deployment succeeded if the following five processes are present (in addition to Jps itself): NameNode, SecondaryNameNode, JobTracker, DataNode, and TaskTracker. The last two appear on master because master is also listed in the slaves file.
You can also check the web interfaces: by default, the NameNode UI is at http://master:50070 and the JobTracker UI at http://master:50030.
The NameNode page shows two live nodes, and the fully distributed Hadoop deployment is complete.
- This article is from: Linux Tutorial Network