The first step is to set up password-free SSH login. This makes it convenient to transfer files with scp, synchronize files and folders directly, and switch to any server over SSH at any time.
If SSH is not yet installed, install it first as follows.
1. Install and start SSH
1. Check whether OpenSSH is installed
Command: # rpm -qa | grep openssh
If it is installed, you will see the installed version number; otherwise nothing is printed.
2. Installation
# rpm -ivh openssh-3.5p1-6
# rpm -ivh openssh-server-3.5p1-6
# rpm -ivh openssh-askpass-gnome-3.5p1-6
# rpm -ivh openssh-clients-3.5p1-6
# rpm -ivh openssh-askpass-3.5p1-6
3. Start the service
Method 1: # service sshd start
Method 2: run the command using its absolute path:
# /etc/rc.d/init.d/sshd start
or # /etc/rc.d/sshd start
4. Automatic start
If you want the service to run automatically when the system boots, use the setup command and select the sshd daemon in the system services option, or run:
chkconfig sshd on
You can also select the sshd service through ntsysv, or enable it for a specific runlevel with chkconfig --level 3 sshd on.
5. Configuration
The SSH configuration file is /etc/ssh/sshd_config; the default port is 22 (Port 22).
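For key-based login in particular, only a few sshd_config directives matter. The values below are the usual OpenSSH defaults, shown here as a reference; they are not part of the original post, so check them against your distribution's shipped file:

```
# /etc/ssh/sshd_config — directives relevant to public-key login
Port 22
PubkeyAuthentication yes
AuthorizedKeysFile      .ssh/authorized_keys
```

If these are already the effective values, no edit is needed; after any change, restart sshd for it to take effect.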
After the installation, we start generating the SSH public and private keys.
I have four machines: 192.168.250.195, 192.168.250.197, 192.168.250.200, and 192.168.250.196; the last one (196) serves as the master.
So first log on to machine 196 through SSH and execute the following commands:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@192.168.250.195:~/.ssh/authorized_keys
(Repeat this for the other slaves, 197 and 200.)
Once this is configured, scp and ssh can be used from the master machine without entering a password.
Next we modify and synchronize the hosts files.
Edit /etc/sysconfig/network on each machine, setting HOSTNAME=master on the master and HOSTNAME=slave1, slave2, slave3 on the respective slaves.
Then edit /etc/hosts as follows:
192.168.250.196 master
192.168.250.195 slave1
192.168.250.197 slave2
192.168.250.200 slave3
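As a concrete example of the hostname change described above, the master's /etc/sysconfig/network would look like the following (each slave gets its own HOSTNAME line; the NETWORKING line is the standard CentOS default and is assumed here):

```
# /etc/sysconfig/network on the master
NETWORKING=yes
HOSTNAME=master
```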
Then synchronize these files to the other machines with scp.
Next we download Hadoop, modify the configuration files on the master, and synchronize them with scp to the slaves.
tar -zxf hadoop-2.5.0.tar.gz -C /usr/local/
cd /usr/local
ln -s hadoop-2.5.0 hadoop
Configure Environment Variables
vi /etc/profile
export HADOOP_PREFIX="/usr/local/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
Of course, the Java environment must be installed first.
Then go to the Hadoop directory (cd /usr/local/hadoop) and add the Java environment to both etc/hadoop/yarn-env.sh and etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/local/jdk8
Then modify etc/hadoop/core-site.xml:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <description>The name of the default file system.</description>
</property>
Next, modify hdfs-site.xml; remember that local storage paths use the file:/// scheme (three slashes).
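The post does not show the hdfs-site.xml contents; a minimal version consistent with this four-node cluster might look like the following, where the dfs/name and dfs/data paths are illustrative assumptions rather than the author's actual values:

```xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop/dfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop/dfs/data</value>
</property>
```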
Then modify yarn-site.xml, configuring the ResourceManager of YARN to run on the master.
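The post omits the yarn-site.xml listing; the standard Hadoop 2.x properties that put the ResourceManager on the master (and enable the shuffle service MapReduce needs) are:

```xml
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
```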
Then modify mapred-site.xml.
By default there is no mapred-site.xml file; just copy mapred-site.xml.template to mapred-site.xml.
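Concretely, copy the template with `cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml`, then add the property that switches MapReduce onto YARN (the post does not show this listing; the property below is the standard Hadoop 2.x one):

```xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```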
Then tell Hadoop which machines are the slave nodes, so that starting the cluster from the master automatically starts the DataNode, NodeManager, and so on on the other machines.
vi /usr/local/hadoop/etc/hadoop/slaves
Add the slave hostnames, one per line:
slave1
slave2
slave3
OK, all basic configurations have been completed.
Now synchronize the Hadoop folder to the slave hosts; thanks to the password-free SSH login set up earlier, no password is needed.
scp -r /usr/local/hadoop root@slave1:/usr/local/hadoop
scp -r /usr/local/hadoop root@slave2:/usr/local/hadoop
scp -r /usr/local/hadoop root@slave3:/usr/local/hadoop
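The three copies can also be generated with a small loop. This sketch only prints the commands so it is safe to try anywhere; it assumes the slave hostnames configured in /etc/hosts above, and you would pipe the output to `sh` (or drop the `echo`) to actually run the transfers:

```shell
# Print the sync command for every slave; pipe to `sh` to execute.
sync_cmds() {
  for host in slave1 slave2 slave3; do
    echo "scp -r /usr/local/hadoop root@${host}:/usr/local/hadoop"
  done
}

sync_cmds
```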
OK
After the synchronization, we format the NameNode from /usr/local/hadoop on the master node:
hdfs namenode -format
Then execute start-dfs.sh and start-yarn.sh in sequence, or simply run start-all.sh in one go.
Then the jps command can be used to view Hadoop's running status.
On the slave nodes, some users may find that the NodeManager has not started.
Don't worry; you only need to execute:
yarn-daemon.sh start nodemanager
OK. We can now check the cluster status with:
hadoop dfsadmin -report
Web interface: master:50070 for HDFS (note that in Hadoop 2.x the YARN web UI is at master:8088; port 50030 belonged to the old 1.x JobTracker).
OK.
You can run the word-count example to test the cluster:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/wordcount/in /user/wordcount/out
Also, if something goes wrong, this blog post is a useful reference: http://blog.csdn.net/jiedushi/article/details/7496327
(The original post ends with several screenshots, not reproduced here.)
Distributed installation and deployment of hadoop2.5.0 centos Series