Install Hadoop fully distributed on Ubuntu 12.10


Hadoop installation is very simple. You can download the latest version from the official website; it is best to use a stable release. This example installs a three-machine cluster with the following Hadoop version:

Tools/Raw Materials
  • hadoop-0.20.2.tar.gz
  • Ubuntu 12.10
Installation steps:
  1. Install Ubuntu

    Ubuntu 12.10 with 4 GB of swap space (2 GB of memory). The installation process itself is not described in detail.

    Uniform user name: hadoop;

    group: hadoop;

    machine names: namenode (master node), datanode1 (slave node 1), datanode2 (slave node 2).

  2. Create a hadoop user group and user in Ubuntu (you can also configure this through the installer interface)

    1. Create a hadoop user group;

    $ sudo addgroup hadoop

    2. Create a hadoop user;

    $ sudo adduser --ingroup hadoop hadoop

    Write down the password you set for the hadoop user; it will be needed later when logging on as that user.

    3. Give the hadoop user sudo permissions by opening the /etc/sudoers file;

    $ sudo gedit /etc/sudoers

    4. Grant the hadoop user the same permissions as the root user.

    Add the following line under root ALL=(ALL:ALL) ALL:

    hadoop ALL=(ALL:ALL) ALL
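
    As a side note, a syntax error in /etc/sudoers can lock you out of sudo entirely; visudo, which validates the file before saving, is a safer way to make the same edit:

    $ sudo visudo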

  3. Install the JDK in Ubuntu

    Configure environment variables:

    $ sudo gedit /etc/profile

    Add the following content at the end of the file to save and close the file.

    # for java

    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0

    export JRE_HOME=${JAVA_HOME}/jre

    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

    export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH

     

    Make the settings take effect:

    $ source /etc/profile

    This completes the JDK configuration.

    To ensure correctness, perform a simple test

    $ java -version

    Output:

    java version "1.8.0"

    Java(TM) SE Runtime Environment (build 1.8.0)

    Java HotSpot(TM) Server VM

     

    Modify the default JDK:

    $ sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.8.0/bin/java 300

    $ sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.8.0/bin/javac 300

    $ sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/jdk1.8.0/bin/jar 300

    The following two commands are optional (but recommended):

    $ sudo update-alternatives --config java

    $ sudo update-alternatives --config javac

  4. Modify the machine name (this step can be omitted)

    After Ubuntu is installed, the default machine name is ubuntu. To make the servers in the cluster easy to distinguish, give each machine a different name. The machine name is determined by the /etc/hostname file.

    1. Open the/etc/hostname file;

    $ sudo gedit /etc/hostname

    2. Change ubuntu in the /etc/hostname file to the corresponding machine name, such as "namenode" for the master node. A restart is required for the change to take effect.

     

    3. Modify the/etc/hosts file

    $ Sudo gedit/etc/hosts

    Set each machine to a fixed IP address and map every host name to its IP, as in the example below:
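
    A minimal example of the /etc/hosts entries, assuming placeholder 192.168.1.x addresses (substitute the fixed IP addresses of your own machines):

    127.0.0.1      localhost
    192.168.1.100  namenode
    192.168.1.101  datanode1
    192.168.1.102  datanode2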

  5. Install the ssh service

    Ssh here does not refer to the three Java frameworks (Spring, Struts, and Hibernate) but to a remote login protocol.

    Generally, only the ssh-agent is installed on Ubuntu by default. You can check with the following command:

    $ sudo ps -ef | grep ssh

    If you only have ssh-agent, you need to install openssh-server.

    $ sudo apt-get install ssh openssh-server

  6. Create an ssh password-less login to the local machine

    First, switch to the hadoop user and execute the following command:

    $ sudo su - hadoop

     

    Ssh key generation methods include rsa and dsa. By default, rsa is used.

    1. Create an ssh-key. Here we use the rsa method;

    $ ssh-keygen -t rsa -P ""

    Note that ssh-keygen is a single hyphenated command and must not be split. One reader ran into the following problem:

    Following the steps above, typing the command with ssh-keygen separated (for example as "ssh-key gen") produced the error bad escape character 'ygen' on the terminal, precisely because the command had been split.

    (Note: after you press Enter, two files are generated under ~/.ssh/: id_rsa and id_rsa.pub)

     

    2. Create the authorization file authorized_keys

    Go to the ~/.ssh/ directory; it does not yet contain the authorized_keys file. You can create it in either of the following ways:

    (1) Append id_rsa.pub to the authorized_keys authorization file;

    $ cd ~/.ssh

    $ cat id_rsa.pub >> authorized_keys

    (2) Copy id_rsa.pub to authorized_keys.

    $ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys

    3. Log on to localhost;

    $ ssh localhost

    4. Execute the exit command (exit or Ctrl+D);
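
    The steps above only set up password-less login to the local machine. For a fully distributed cluster, the namenode must also be able to log on to each datanode without a password. A minimal sketch, assuming the host names used in this article and a hadoop user on every machine (ssh-copy-id appends the local public key to the remote authorized_keys):

    $ ssh-copy-id hadoop@datanode1
    $ ssh-copy-id hadoop@datanode2

    Afterwards, running ssh datanode1 from the namenode should log on without prompting for a password.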

  7. Install hadoop

    1. Place hadoop-0.20.2.tar.gz on the desktop and copy it to the installation directory /usr/local/;

    $ sudo cp hadoop-0.20.2.tar.gz /usr/local/

    2. Decompress hadoop-0.20.2.tar.gz;

    $ cd /usr/local

    $ sudo tar -zxvf hadoop-0.20.2.tar.gz

    3. Rename the decompressed folder to hadoop;

    $ sudo mv hadoop-0.20.2 hadoop

    4. Set the owner of the hadoop folder to the hadoop user;

    $ sudo chown -R hadoop:hadoop hadoop

    5. Open the hadoop/conf/hadoop-env.sh file;

    $ sudo gedit hadoop/conf/hadoop-env.sh

    6. Configure conf/hadoop-env.sh (find # export JAVA_HOME=..., remove the #, and set it to the local JDK path);

    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0

    7. Open the conf/core-site.xml file;

    $ sudo gedit hadoop/conf/core-site.xml

     

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://namenode:9000</value>
      </property>
    </configuration>

    8. Open the conf/mapred-site.xml file;

    $ sudo gedit hadoop/conf/mapred-site.xml

     

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>namenode:9001</value>
      </property>
    </configuration>

     

    9. Open the conf/hdfs-site.xml file;

    $ sudo gedit hadoop/conf/hdfs-site.xml

     

    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/hadoop/datalog1,/usr/local/hadoop/datalog2</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/usr/local/hadoop/data1,/usr/local/hadoop/data2</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
    </configuration>

    Note: The datalog1, datalog2, data1, and data2 folders should be created in advance.
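
    One way to create them, assuming the /usr/local/hadoop directory from the previous steps is already owned by the hadoop user:

    $ cd /usr/local/hadoop
    $ mkdir datalog1 datalog2 data1 data2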

    10. Open the conf/masters file and add the host name of the secondary namenode.

    11. Open the conf/slaves file and add the host name of each slave node, one per line.
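
    With the host names used in this article, the two files would plausibly look like this (running the secondary namenode on the master is an assumption):

    conf/masters:
    namenode

    conf/slaves:
    datanode1
    datanode2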

  8. Run hadoop

     

    1. Go to the hadoop directory and format the hdfs file system. This operation is only required the first time you run hadoop;

    $ cd /usr/local/hadoop/

    $ bin/hadoop namenode -format

    2. Start hadoop with bin/start-all.sh;

    Go to the bin directory and run:

    $ ./start-all.sh

    To shut down, run ./stop-all.sh in the same directory.

    3. Check whether hadoop started successfully;

    $ jps
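
    On the master node, jps would typically list NameNode, SecondaryNameNode, and JobTracker (plus Jps itself); on the slave nodes, DataNode and TaskTracker. The exact set depends on which daemons your layout assigns to each machine.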

    4. Web View

    http://localhost:50030/ - the JobTracker web interface, which makes it easy to view job-related information.

    Similar:

    http://localhost:50070/ - web UI of the NameNode daemon

    http://localhost:50030/ - web UI of the JobTracker daemon

    http://localhost:50060/ - web UI of the TaskTracker daemon

     

    END
Notes
  • The user name and group must be the same on all machines in the cluster. In this article, the user name is hadoop and the group is hadoop.
  • Each machine is given a fixed IP address, and the host name to IP mappings must be added to the /etc/hosts file on every machine.
  • The installation directories shown in the commands must match your actual directories.
