HBase Entry Notes (III): Fully Distributed Hadoop Cluster Installation and Configuration


This article was originally written at the end of last November but was delayed by a variety of complications. There are already plenty of configuration articles online, yet none of them is entirely clear, so here is a Step By Step tutorial you can follow.

The Hadoop deployment environment consists of four virtualized hosts running Ubuntu Server 10.04. (The list of XenServer 5.6-compatible operating systems does not include Ubuntu, and converting Ubuntu to PV is itself a tough process, which this article also touches on.) The Hadoop version is 0.20.2. Install the Java environment as described in the previous section.

The host name and its IP address correspond to the following:

Slave & TaskTracker: dm1, IP: 192.168.0.17; (datanode)

Slave & TaskTracker: dm2, IP: 192.168.0.18; (datanode)

Slave & TaskTracker: dm3, IP: 192.168.0.9; (datanode)

Master & JobTracker: dm4, IP: 192.168.0.10; (namenode)

The Master is the management node of the Hadoop cluster, and the important configuration work happens on it. For its roles and functions, see the Hadoop API documentation.

The configuration steps are as follows:

I. Modify the hostname of each node (dm1-dm4) with the following command:

vi /etc/hostname

For example, on dm1 the file should contain only the single line dm1.

II. Add the hostnames and IP addresses to /etc/hosts so the nodes can communicate. The Master must know all slaves; each slave only needs to know the Master and itself.

The command is as follows:

vi /etc/hosts

The hosts configuration of the Master (dm4) should therefore contain entries for all four nodes:

192.168.0.17 dm1
192.168.0.18 dm2
192.168.0.9  dm3
192.168.0.10 dm4

Each slave (dm3, ...) only needs the Master and itself; for example, on dm3:

192.168.0.9  dm3
192.168.0.10 dm4
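As a quick sanity check, the master's four hosts entries (the IP/hostname pairs from the table at the top of the article) can be written to a file and counted. This is only a sketch; it uses a /tmp path so the real /etc/hosts is untouched.

```shell
# Sanity check: write the master's four hosts entries to a temp file
# and count them. Uses /tmp so the real /etc/hosts is untouched.
cat > /tmp/hosts.dm4 <<'EOF'
192.168.0.17 dm1
192.168.0.18 dm2
192.168.0.9  dm3
192.168.0.10 dm4
EOF
wc -l < /tmp/hosts.dm4
```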

III. The Hadoop core requires configuring core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh in the conf folder. For more information about these configuration options, see the Hadoop help documentation.

1. First, edit the core-site.xml file on each machine node (both master and slaves); the command is as follows (the Hadoop folder is placed under /home):

vi /home/hadoop/conf/core-site.xml

The core-site.xml file should look like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://dm4:9000</value>
  </property>
</configuration>

2. Next, edit hdfs-site.xml on each machine node (both master and slaves):

vi /home/hadoop/conf/hdfs-site.xml

The hdfs-site.xml file should look like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/NameData</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

3. Then edit mapred-site.xml on each machine node (both master and slaves):

vi /home/hadoop/conf/mapred-site.xml

The mapred-site.xml file should look like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.0.10:9001</value>
  </property>
</configuration>

4. Finally, edit hadoop-env.sh on each machine node (both master and slaves):

vi /home/hadoop/conf/hadoop-env.sh

Add the following lines to the file:

export HADOOP_HOME=/home/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
export PATH=$PATH:$HADOOP_HOME/bin
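A quick way to confirm the three lines expand as intended is to source them in a throwaway shell. This sketch writes them to a temp file first, so the real hadoop-env.sh is untouched:

```shell
# Sketch: source the three export lines from a temp copy and confirm
# that HADOOP_CONF_DIR expands to the expected path.
cat > /tmp/hadoop-env-check.sh <<'EOF'
export HADOOP_HOME=/home/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
export PATH=$PATH:$HADOOP_HOME/bin
EOF
. /tmp/hadoop-env-check.sh
# prints: /home/hadoop/conf
echo "$HADOOP_CONF_DIR"
```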

IV. Configure the master-slave relationship of the cluster. On every machine node, Hadoop's conf folder contains two files: masters and slaves. Add the IP address or hostname of the Master (dm4) to the masters file, and the IPs or hostnames of the slaves (dm1-dm3) to the slaves file. All nodes must be modified.

 

The masters file contains:

dm4

The slaves file contains:

dm1
dm2
dm3

The overall installation and configuration of Hadoop is now complete. The Hadoop cluster is started from the Master (NameNode) machine, which communicates with the slaves (DataNodes) over SSH, so next we set up password-less SSH public key authentication for login.

V. For more information about the principles of SSH asymmetric keys, see the reference at the end of this article. First, every node needs to generate a key pair, as follows:

1. On all nodes, generate an RSA key pair with the following command:

ssh-keygen -t rsa

Press Enter to accept the default storage location (/root/.ssh/id_rsa); in the demonstration in this article the key pair is saved as /root/.ssh/viki, with the public key in /root/.ssh/viki.pub. When prompted for a passphrase, leave it empty.
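The same key pair can also be generated non-interactively, which is handy when scripting the step across all four nodes. This is only a sketch using standard OpenSSH flags (-N for the passphrase, -f for the output path), written to a throwaway path:

```shell
# Sketch: non-interactive key generation with an empty passphrase (-N '')
# and an explicit output path (-f), matching the interactive steps above.
ssh-keygen -t rsa -N '' -f /tmp/viki_demo -q
ls /tmp/viki_demo /tmp/viki_demo.pub
```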

2. On the Master (NameNode), copy the contents of the generated public key viki.pub into the local /root/.ssh/authorized_keys file. The command is as follows:

cp viki.pub authorized_keys

Then copy the authorized_keys file to the /root/.ssh/ folder of each Slave (DataNode) machine, for example:

scp /root/.ssh/authorized_keys dm3:/root/.ssh/

Finally, on all machines, set the file permissions with chmod:

chmod 644 authorized_keys
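To confirm the resulting mode, stat can print the permission bits in octal. A local sketch on a throwaway file (the -c '%a' format is GNU stat):

```shell
# Sketch: reproduce the permission setup on a throwaway file and print
# the resulting octal mode with GNU stat.
touch /tmp/authorized_keys_demo
chmod 644 /tmp/authorized_keys_demo
# prints: 644
stat -c '%a' /tmp/authorized_keys_demo
```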

After the above steps, the SSH configuration is complete. Verify it with the following commands:

ssh dm3
exit
ssh dm2
exit
ssh dm1
exit

The first connection to each host asks for confirmation and the machine password; enter "yes" and the password. Subsequent logins will not prompt for a password.

 

VI. Start and verify the Hadoop cluster as described above, then open http://192.168.0.10:50030/jobtracker.jsp in a browser.

The JobTracker page should show the newly created Hadoop cluster.

VII. References

1. Hadoop Quickstart: http://hadoop.apache.org/common/docs/r0.18.2/cn/quickstart.html

2. Common threads: OpenSSH key management: http://www.ibm.com/developerworks/cn/linux/security/openssh/part1/index.html

 

Original article: http://www.cnblogs.com/ventlam/archive/2011/01/21/hadoopcluster.html
