Hadoop cluster installation and configuration
Cluster nodes: node4, node5, node6, node7, node8. The specific layout:
Operating system: CentOS release 5.5 (Final)
Installation steps
I. Create a Hadoop user group and user.
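Step I can be sketched with the usual CentOS commands (run as root on every node; a sketch, not taken from the source):

```shell
# Create a hadoop group and a hadoop user belonging to it (root required).
groupadd hadoop
useradd -g hadoop -m hadoop   # -m creates /home/hadoop
passwd hadoop                 # set the user's password interactively
```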
II. Install the JDK. Download and install the JDK; the installation directory is as follows:
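A minimal sketch of a JDK install; the source elides the actual archive name and directory, so both are placeholders:

```shell
# Unpack a JDK archive under /usr/java (root required; the archive name below
# is a placeholder, not from the source).
mkdir -p /usr/java
tar -xzf jdk-6u45-linux-x64.tar.gz -C /usr/java
# The resulting directory (e.g. /usr/java/jdk1.6.0_45) becomes JAVA_HOME later.
```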
III. Modify the machine names and the hosts file, as follows:
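For illustration, /etc/hosts on each node would map the five hostnames to their addresses. The IP addresses below are placeholders, since the source does not give the real ones; the sketch writes to a temp file to stay side-effect free:

```shell
# Example /etc/hosts entries for the cluster (placeholder IPs).
HOSTS=$(mktemp)
cat > "$HOSTS" <<'EOF'
192.168.1.104   node4
192.168.1.105   node5
192.168.1.106   node6
192.168.1.107   node7
192.168.1.108   node8
EOF
# On a real node these lines would be appended to /etc/hosts (root required).
```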
IV. Install the SSH service. Command: yum install openssh-server.
V. Set up passwordless SSH login.
(i) Switch to the hadoop user: su - hadoop
(ii) Create an SSH key with the ssh-keygen command, using RSA. Command: ssh-keygen -t rsa -f ~/.ssh/id_rsa. This generates the public key ~/.ssh/id_rsa.pub.
(iii) Append the public key to authorized_keys. Command:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(iv) Modify the authorized_keys file permissions:
(v) Edit the sshd configuration file /etc/ssh/sshd_config and remove the comment in front of the AuthorizedKeysFile .ssh/authorized_keys line.
(vi) Restart the sshd service.
(vii) Copy the authorized_keys file to the other nodes (node5-8), as follows:
(viii) Test the SSH connection. On the first connection you are prompted to confirm the host; press ENTER to add its key to known_hosts. Command:
ssh localhost
When the connection succeeds, remember to run exit to leave the remote machine.
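The key-generation steps above can be sketched as follows. The key pair is created in a temp directory here so the sketch is side-effect free; on a real node you would use ~/.ssh and then copy authorized_keys to node5-8 with scp:

```shell
# Generate an RSA key pair with an empty passphrase and authorize it.
KEYDIR=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$KEYDIR/id_rsa" >/dev/null
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"   # sshd rejects group/world-readable key files
# On the real cluster, then: scp ~/.ssh/authorized_keys hadoop@node5:~/.ssh/
```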
VI. Download the Hadoop release and upload it to node4. The version used is Hadoop 1.2.1.
VII. Install and configure Hadoop
(i) Log in to node4 and switch to the hadoop user. Create the installation directory and unpack Hadoop. Commands:
mkdir hadoop_program                           # create the Hadoop installation directory
cp hadoop/hadoop-1.2.1.tar.gz hadoop_program/  # copy the Hadoop archive into it
cd hadoop_program/                             # change into the directory
tar -xvf hadoop-1.2.1.tar.gz                   # unpack the Hadoop archive
mv hadoop-1.2.1 hadoop                         # rename the Hadoop directory
(ii) Create the Hadoop-related environment variables.
Modify conf/hadoop-env.sh: find the line #export JAVA_HOME=..., remove the # comment marker, and set it to the machine's JDK path (the path installed in step II), as follows:
Add the HADOOP_HOME environment variable. Command: vim ~/.bashrc. Add the following:
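For illustration, the additions could look like this (JAVA_HOME is an example path, not given by the source; the sketch writes to a temp file rather than ~/.bashrc to stay side-effect free):

```shell
# Example environment-variable lines for ~/.bashrc.
RC=$(mktemp)
cat >> "$RC" <<'EOF'
export JAVA_HOME=/usr/java/jdk1.6.0_45          # example JDK path, not from the source
export HADOOP_HOME=$HOME/hadoop_program/hadoop  # directory created in step VII
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
EOF
```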
(iii) Modify the Hadoop configuration files:
Modify the conf/core-site.xml file.
Modify the mapred-site.xml file.
Modify the hdfs-site.xml file.
Modify the masters file.
Modify the slaves file.
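Minimal sketches of the five files, written to a temp conf directory here to stay side-effect free. The hostnames come from the source; the ports (9000/9001) and replication factor are common Hadoop 1.x choices, not values given by the source:

```shell
CONF=$(mktemp -d)

# core-site.xml: HDFS namenode address (port 9000 is a common default).
cat > "$CONF/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://node4:9000</value>
  </property>
</configuration>
EOF

# mapred-site.xml: JobTracker address (port 9001 is a common default).
cat > "$CONF/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>node4:9001</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: block replication factor (3 is the Hadoop default).
cat > "$CONF/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
EOF

echo "node4" > "$CONF/masters"                          # secondary namenode host
printf 'node5\nnode6\nnode7\nnode8\n' > "$CONF/slaves"  # worker nodes
```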
VIII. Copy the configured Hadoop and JDK to the other nodes:
IX. Start Hadoop and verify that it was installed successfully.
Command: hadoop namenode -format (format the NameNode on first use).
Command: start-all.sh (start Hadoop).
Command: cd to the JDK's bin directory and run the jps command. Check:
node4:
Runs normally.
node5-8:
Runs normally.
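With node4 as master and node5-8 as slaves, the daemons that jps should list after start-all.sh are roughly the following (the exact set depends on the masters/slaves files):

```shell
# Expected Hadoop 1.x daemons, roughly:
#   node4:   NameNode, SecondaryNameNode, JobTracker, Jps
#   node5-8: DataNode, TaskTracker, Jps
jps
```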
X. Problems encountered.
1. During installation, node5-8 could not start the DataNode and TaskTracker. The cause turned out to be that Java programs were already running on those machines. Check with ps -ef | grep java, kill the related processes, and Hadoop then starts normally.
2. During use, the error Bad connect ack with firstBadLink appeared. Solution:
1) /etc/init.d/iptables stop --> stop the firewall
2) Set SELINUX=disabled in the /etc/selinux/config file --> disable SELinux
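The SELinux edit in 2) can be sketched with sed. It is done on a temp copy here so the sketch is side-effect free; on the real node you would edit /etc/selinux/config itself as root (a reboot is needed for the change to fully apply), and also stop the firewall with /etc/init.d/iptables stop:

```shell
# Flip SELINUX= to disabled (demonstrated on a sample temp copy of the file).
CFG=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$CFG"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$CFG"
```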
Source: http://blog.csdn.net/xia_yu_mao_fa/article/details/25144843