Build a Fully Distributed Hadoop 2.4.1 Environment
I. The configuration steps are as follows:
1. Build the host environment: five virtual machines running Ubuntu 13 are used to host the Hadoop cluster.
2. Create a hadoop user group and a hadoop user, and grant the hadoop user the required permissions.
3. Configure password-free login for the five hosts.
4. Install hadoop and the jdk, and configure the environment variables.
5. Configure the Hadoop distributed environment parameters.
II. Host environment construction
Create five virtual hosts in the virtual machine software and install the system on each. Then complete the following operations on every host.
1. Set the static IP address. The virtual machines here use the NAT Internet access mode. The configuration commands are as follows:
A) Configure the static address
sudo gedit /etc/network/interfaces
auto eth0
iface eth0 inet static
address 192.168.182.132
netmask 255.255.255.0
gateway 192.168.182.1
B) Configure the DNS nameserver
sudo gedit /etc/resolv.conf
nameserver 192.168.182.1
C) Restart the network.
sudo /etc/init.d/networking restart
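You can verify the new settings with ifconfig and by pinging the NAT gateway configured above:
ifconfig eth0                 # the address should now be 192.168.182.132
ping -c 3 192.168.182.1       # the gateway should reply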
2. Modify the host name: open the hostname file and set the host name on the five machines to master, slave1, slave2, slave3, and slave4 respectively.
sudo gedit /etc/hostname
3. Modify the host configuration file and open the hosts file. My host configuration is as follows:
sudo gedit /etc/hosts
192.168.182.132 master
192.168.182.134 slave1
192.168.182.135 slave2
192.168.182.136 slave3
192.168.182.20 slave4
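Once the hosts file is saved, hostname resolution can be checked from any of the machines, for example:
ping -c 1 master    # should resolve to 192.168.182.132
ping -c 1 slave1    # should resolve to 192.168.182.134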
III. Add a user
1. Create a hadoop user group
sudo addgroup hadoop
2. Create a hadoop user
sudo adduser --ingroup hadoop hadoop
3. Add permissions to the hadoop user by opening the /etc/sudoers file
sudo gedit /etc/sudoers
Press Enter and the /etc/sudoers file opens; grant the hadoop user the same permissions as the root user.
Add hadoop ALL=(ALL:ALL) ALL under root ALL=(ALL:ALL) ALL.
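For reference, the relevant section of /etc/sudoers ends up looking like this:
# User privilege specification
root    ALL=(ALL:ALL) ALL
hadoop  ALL=(ALL:ALL) ALL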
IV. Configure password-free login from master to the slaves
1. Install the ssh service
sudo apt-get install ssh openssh-server
2. Configure password-free login from the master node to the slave nodes (slave1 is used as an example here; the other slaves are configured the same way)
On master, generate the key pair and add the public key to authorized_keys:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cd ~/.ssh/
cat id_dsa.pub >> authorized_keys
On slave1, copy the master's public key over and append it to the local authorized_keys:
scp hadoop@master:~/.ssh/id_dsa.pub ./master_dsa.pub
cat master_dsa.pub >> authorized_keys
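You can then check from the master that the login really is password-free (the very first connection may still ask you to accept the host key):
ssh slave1 hostname    # should print slave1 without prompting for a password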
V. Install hadoop and jdk and configure environment variables
The hadoop installation package version is 2.4.1 and the jdk version is 1.7.0_65; download both from the official websites.
Extract hadoop and the jdk to the /home/hadoop/hadoop-2.4.1 and /home/hadoop/jdk1.7.0_65 directories, then configure the environment variables as follows:
sudo gedit /etc/profile
HADOOP_HOME=/home/hadoop/hadoop-2.4.1
JAVA_HOME=/home/hadoop/jdk1.7.0_65
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib:$CLASSPATH
export HADOOP_HOME
export JAVA_HOME
export PATH
export CLASSPATH
source /etc/profile
Note: the environment variables can be configured as the last step, but they must be configured on each node separately.
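After sourcing the profile, a quick check confirms that the variables took effect:
java -version      # should report 1.7.0_65
hadoop version     # should report Hadoop 2.4.1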
VI. Configure the hadoop environment (the following configuration files are in /home/hadoop/hadoop-2.4.1/etc/hadoop/)
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-2.4.1/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hadoop-2.4.1/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hadoop-2.4.1/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.182.132:9001</value>
  </property>
</configuration>
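Besides the files above, the start-all.sh script used later reads the list of worker hosts from the slaves file in the same configuration directory; for the cluster described here it should contain one slave hostname per line (a sketch, assuming the configuration directory /home/hadoop/hadoop-2.4.1/etc/hadoop/):
slaves
slave1
slave2
slave3
slave4
Note that mapred.job.tracker is a Hadoop 1.x property; if MapReduce jobs are to run on YARN under 2.4.1, mapreduce.framework.name would normally also be set to yarn in mapred-site.xml and yarn.nodemanager.aux-services to mapreduce_shuffle in yarn-site.xml.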
Note: Steps V and VI only need to be configured on the master node (the /etc/profile changes, as noted above, must still be made on every node). After the master is configured, copy the /home/hadoop folder to each slave:
scp -r ./hadoop slave1:/home
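The same copy is needed for the other slaves; a small loop run from /home on the master handles all of them (a sketch, assuming the hadoop user can log in to every slave and write to /home there, just as the single scp above assumes):
# Copy the hadoop home directory (hadoop-2.4.1, the jdk and the configuration) to every slave
for node in slave1 slave2 slave3 slave4; do
  scp -r ./hadoop ${node}:/home
done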
VII. Start Hadoop
1. Format namenode
Run the following command on the master node:
hadoop namenode -format
2. Start the service
Go to the /home/hadoop/hadoop-2.4.1/sbin directory on the master node and execute the following command:
./start-all.sh
The following command stops the Hadoop service:
./stop-all.sh
3. The master has the following processes:
hadoop@master:/home/hadoop/hadoop-2.4.1/sbin$ jps
21211 Jps
7421 SecondaryNameNode
7154 NameNode
7968 ResourceManager
4. slave1 has the following processes:
hadoop@slave1:/home/hadoop/hadoop-2.4.1/sbin$ jps
3612 NodeManager
3723 Jps
3367 DataNode
5. View the cluster status
http://master:8088/
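Port 8088 is the YARN ResourceManager web UI. The HDFS side can be checked through the NameNode web UI at http://master:50070/ or from the command line:
hdfs dfsadmin -report    # lists the live DataNodes and their capacity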