Hadoop-2.4.1 Fully Distributed Environment Setup
First, the configuration steps are as follows:
- Host environment: five virtual machines running the Ubuntu 13 system are used to build the Hadoop environment.
- Create a hadoop user group and a hadoop user, and assign permissions to the hadoop user.
- Configure password-free login among the 5 hosts.
- Install Hadoop and the JDK, and configure environment variables.
- Configure the Hadoop distributed environment parameters.
Second, building the host environment:
Create 5 virtual hosts on the virtual machine platform, install the Ubuntu system on each, and then perform the following steps on each host.
- Set a static IP address. The virtual machines use NAT networking mode; the configuration commands are as follows (a verification sketch follows at the end of this section):
a) Configure the static address
sudo gedit /etc/network/interfaces
auto eth0
iface eth0 inet static
address 192.168.182.132
netmask 255.255.255.0
gateway 192.168.182.1
b) Configure the DNS server
sudo gedit /etc/resolv.conf
nameserver 192.168.182.1
c) Restart the network
sudo /etc/init.d/networking restart
- Modify the host name: open the hostname file and change the hostname to master, slave1, slave2, slave3, and slave4, respectively
sudo gedit /etc/hostname
- Modify the hosts configuration file: open the hosts file; my host configuration is as follows:
sudo gedit /etc/hosts
192.168.182.132 master
192.168.182.134 slave1
192.168.182.135 slave2
192.168.182.136 slave3
192.168.182.137 slave4
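Once all five hosts are configured, a quick check such as the following can verify the network setup on each machine (the addresses and hostnames are the ones from the hosts file above):
ifconfig eth0             # the static address should be assigned
ping -c 3 192.168.182.1   # the gateway should be reachable
ping -c 3 slave1          # hostnames should resolve via /etc/hosts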
Third, add users
- Create a hadoop user group
sudo addgroup hadoop
- Create a hadoop user
sudo adduser --ingroup hadoop hadoop
- Add permissions for the hadoop user by opening the /etc/sudoers file
sudo gedit /etc/sudoers
This opens the /etc/sudoers file (sudo visudo is the safer way to edit it, since it validates the syntax before saving). Give the hadoop user the same privileges as the root user by adding
hadoop ALL=(ALL:ALL) ALL
under the line
root ALL=(ALL:ALL) ALL
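A quick sanity check that the permissions took effect (assuming the hadoop user was created as above):
su - hadoop
sudo whoami    # should print: root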
Fourth, configure master-to-slave password-free login
- Install the SSH service
sudo apt-get install ssh openssh-server
- Configure password-free login from master to the slaves (slave1 is used as the example; the operation is the same for the other slaves)
On master, build authorized_keys:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cd ~/.ssh/
cat id_dsa.pub >> authorized_keys
On slave1, add master's public key to the local authorized_keys:
scp hadoop@master:~/.ssh/id_dsa.pub ./master_dsa.pub
cat master_dsa.pub >> authorized_keys
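If the keys are in place, logging in from master to slave1 should no longer prompt for a password; a quick check from the master node:
ssh slave1    # should log in without asking for a password
exit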
Fifth, install Hadoop and the JDK, and configure environment variables
The Hadoop installation package version is 2.4.1 and the JDK version is 1.7.0_65; download each from its official website.
Extract Hadoop and the JDK to the /home/hadoop/hadoop-2.4.1 and /home/hadoop/jdk1.7.0_65 directories respectively, then configure the environment variables as follows:
sudo gedit /etc/profile
HADOOP_HOME=/home/hadoop/hadoop-2.4.1
JAVA_HOME=/home/hadoop/jdk1.7.0_65
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib:$CLASSPATH
export HADOOP_HOME
export JAVA_HOME
export PATH
export CLASSPATH
source /etc/profile
Note: the source command in the final step makes the variables take effect, and each node needs to be configured separately.
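A quick way to confirm the variables are in effect (assuming the versions and paths above):
java -version     # should report 1.7.0_65
hadoop version    # should report Hadoop 2.4.1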
Sixth, configure the Hadoop environment
The following files are located in the /home/hadoop/hadoop-2.4.1/etc/hadoop/ directory.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.4.1/tmp</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-2.4.1/name</value>
<final>true</final>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-2.4.1/data</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
<final>true</final>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.182.132:9001</value>
</property>
</configuration>
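One related file worth checking (not listed in the original steps, so treat this as an assumption about the stock 2.4.1 layout): the slaves file in the same etc/hadoop directory tells start-all.sh which hosts should run the DataNode processes. For this cluster it would contain:
slave1
slave2
slave3
slave4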
Note: steps five and six are performed only on the master node; once the master is configured, copy the /home/hadoop/ folder to each slave (executed from the /home directory on master):
scp -r ./hadoop slave1:/home
Seventh, start Hadoop
- Format the NameNode
Execute the following command on the master node:
hadoop namenode -format
- Start the service
Go to the /home/hadoop/hadoop-2.4.1/sbin directory on the master node and execute the following command:
./start-all.sh
The corresponding command to stop the Hadoop services is:
./stop-all.sh
- Running jps on master shows the following processes:
hadoop@master:/home/hadoop/hadoop-2.4.1/sbin$ jps
21211 Jps
7421 SecondaryNameNode
7154 NameNode
7968 ResourceManager
- Running jps on slave1 shows the following processes:
hadoop@slave1:/home/hadoop/hadoop-2.4.1/sbin$ jps
3612 NodeManager
3723 Jps
3367 DataNode
- View the cluster status through the ResourceManager web UI:
http://master:8088/
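Beyond the web UI, an HDFS health check from the command line can confirm that the DataNodes have registered (run on the master node; this assumes the environment variables from step five are in effect):
hdfs dfsadmin -report    # should list the four slaves as live datanodes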