Hadoop Learning Note Two: Installation and Deployment


Hardware environment

There are three machines in total, all running FC5, with Java JDK 1.6.0. The IP configuration is as follows:

dbrg-1:202.197.18.72
dbrg-2:202.197.18.73
dbrg-3:202.197.18.74

One thing to emphasize here: it is important to ensure that each machine's hostname and IP address resolve correctly.

A very simple test is to ping the hostname; for example, run ping dbrg-2 on dbrg-1. If the ping succeeds, resolution is OK. If resolution fails, you can fix it by editing the /etc/hosts file. On the machine used as the NameNode, the hosts file needs entries for the IP addresses and corresponding hostnames of every machine in the cluster; on a machine used as a DataNode, the hosts file only needs its own IP address and the IP address of the NameNode machine.

For example, the /etc/hosts file on dbrg-1 should look like this:

127.0.0.1     localhost   localhost
202.197.18.72   dbrg-1    dbrg-1
202.197.18.73   dbrg-2    dbrg-2
202.197.18.74   dbrg-3    dbrg-3

The /etc/hosts file on dbrg-2 should look like this:

127.0.0.1     localhost  localhost
202.197.18.72   dbrg-1    dbrg-1
202.197.18.73   dbrg-2    dbrg-2
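The resolution setup above can be sanity-checked with a short script. This is a minimal sketch: check_hosts is a hypothetical helper of my own, not part of Hadoop, and it only greps the hosts file, while ping additionally exercises the network.

```shell
#!/bin/sh
# Minimal sketch: report whether each cluster hostname appears in a
# given hosts file. check_hosts is a hypothetical helper name.
check_hosts() {
    hosts_file=$1; shift
    for host in "$@"; do
        if grep -qw "$host" "$hosts_file"; then
            echo "$host: ok"
        else
            echo "$host: missing from $hosts_file"
        fi
    done
}

# On dbrg-1, check all three machines from this note:
check_hosts /etc/hosts dbrg-1 dbrg-2 dbrg-3
```

A missing hostname shows up as a "missing" line, telling you which entry still has to be added to /etc/hosts.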

As mentioned in the previous study note, Hadoop divides nodes in two ways: from HDFS's point of view, nodes are either the NameNode or DataNodes, where there is only one NameNode but there can be many DataNodes; from MapReduce's point of view, nodes are either the JobTracker or TaskTrackers, where there is only one JobTracker but there can be many TaskTrackers.

I deployed the NameNode and JobTracker on dbrg-1, with dbrg-2 and dbrg-3 as DataNodes and TaskTrackers. Of course, you can also deploy the NameNode, DataNode, JobTracker, and TaskTracker all on a single machine.

Directory structure

Hadoop requires that the deployment directory structure be the same on all machines, and that all machines have an account with the same user name.

On my three machines this is a dbrg account whose home directory is /home/dbrg.

The Hadoop deployment directory structure is as follows: /home/dbrg/HadoopInstall; all versions of Hadoop are placed in this directory.

Unpack the Hadoop 0.12.0 archive into HadoopInstall. To make later upgrades easier, it is recommended that you create a symbolic link named hadoop pointing to the version of Hadoop you want to use:

[dbrg@dbrg-1:HadoopInstall]$ ln -s hadoop-0.12.0 hadoop

As a result, all of the configuration files are in the hadoop/conf/ directory, and all of the executables are in the hadoop/bin/ directory.

However, because these configuration files sit inside Hadoop's installation directory, they would all be overwritten when Hadoop is upgraded later, so it is recommended to separate the configuration files from the installation directory. A better approach is to create a configuration directory, /home/dbrg/HadoopInstall/hadoop-config/, copy the hadoop-site.xml, slaves, and hadoop-env.sh files from hadoop/conf/ into it, and set the environment variable $HADOOP_CONF_DIR to point to that directory. (Strangely, the official Hadoop Getting Started guide says only those three files need to be copied, but when I actually configured it I found I also had to copy the masters file into hadoop-config/; otherwise Hadoop fails on startup with an error saying it cannot find the masters file.) The environment variable is set in /home/dbrg/.bashrc and /etc/profile.
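The steps above can be sketched as a small shell function. separate_conf is my own hypothetical helper name; the file list and paths come from this note.

```shell
#!/bin/sh
# Sketch of the configuration/installation split described above.
# separate_conf is a hypothetical helper; pass it the install root,
# e.g. /home/dbrg/HadoopInstall.
separate_conf() {
    install=$1
    mkdir -p "$install/hadoop-config" || return 1
    # hadoop-site.xml, slaves, hadoop-env.sh per the official guide,
    # plus masters, which in practice is also required (see above).
    for f in hadoop-site.xml slaves hadoop-env.sh masters; do
        cp "$install/hadoop/conf/$f" "$install/hadoop-config/" || return 1
    done
    export HADOOP_CONF_DIR="$install/hadoop-config"
}

# On dbrg-1 this would be (also add the resulting export line to
# /home/dbrg/.bashrc and /etc/profile so it survives a re-login):
#   separate_conf /home/dbrg/HadoopInstall
```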

To sum up: to make later upgrades easier, we separate the configuration files from the installation directory and create a link pointing to the version of Hadoop we want to use, which reduces the work of maintaining configuration files. In the following sections you will experience the benefits of this separation and linking.
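To see the benefit concretely, here is a sketch of what a later upgrade looks like under this layout. switch_version is my own hypothetical helper, and 0.13.0 is only an illustrative newer release: you unpack the new version next to the old one and repoint the hadoop link, leaving hadoop-config/ untouched.

```shell
#!/bin/sh
# Hypothetical helper: repoint the 'hadoop' symlink at another
# unpacked release. hadoop-config/ and $HADOOP_CONF_DIR are untouched.
switch_version() {
    install=$1
    version=$2
    rm -f "$install/hadoop"            # remove the old link only
    ln -s "$version" "$install/hadoop" # relative link, as in this note
}

# After unpacking an illustrative newer release, the upgrade is just:
#   tar xzf hadoop-0.13.0.tar.gz -C /home/dbrg/HadoopInstall
#   switch_version /home/dbrg/HadoopInstall hadoop-0.13.0
```

Rolling back is the same one-line relink in the other direction, which is exactly why the link plus separated configuration is worth setting up.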
