Hadoop installation is fairly simple. You can download the latest release from the official website; it is best to use a stable version. In this example, a three-machine cluster is set up, using hadoop-0.20.2.
- 1
Install Ubuntu
Ubuntu 12.10, with 4 GB of swap space (2 GB of memory). The installation process itself is not described in detail here.
Uniform user name: hadoop;
Group: hadoop;
Machine names: namenode (master node), datanode1 (slave node 1), datanode2 (slave node 2).
- 2
Create a hadoop user group and user in Ubuntu (you can also configure it through the interface when installing the system)
1. Create a hadoop user group;
$ sudo addgroup hadoop
2. Create a hadoop user and add it to the hadoop group;
$ sudo adduser --ingroup hadoop hadoop
Write down the password set for the hadoop user, which will be needed later when logging on as that user.
3. Grant permissions to the hadoop user by opening the /etc/sudoers file;
$ sudo gedit /etc/sudoers
4. Give the hadoop user the same permissions as the root user.
Add the following line below root ALL=(ALL:ALL) ALL:
hadoop ALL=(ALL:ALL) ALL
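To confirm that the entry took effect, sudo itself can list a user's privileges (a quick optional check):
$ sudo -l -U hadoop
Note that editing /etc/sudoers directly in gedit is risky; sudo visudo is the conventional, safer way, since it validates the syntax before saving.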
- 3
Install JDK in Ubuntu
Configure environment variables:
$ sudo gedit /etc/profile
Add the following content at the end of the file, then save and close it.
# for java
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH
Make the settings take effect:
$ source /etc/profile
The configuration of the JDK is now complete.
To ensure correctness, perform a simple test:
$ java -version
Output:
java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0)
Java HotSpot(TM) Server VM
Modify the default JDK:
$ sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.8.0/bin/java 300
$ sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.8.0/bin/javac 300
$ sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/jdk1.8.0/bin/jar 300
The following two commands are optional (but preferably executed):
$ sudo update-alternatives --config java
$ sudo update-alternatives --config javac
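To see which java the system currently points to, update-alternatives can also display the registered alternatives (the exact output varies by system):
$ update-alternatives --display java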
- 4
Modify the machine name (this step can be omitted)
After Ubuntu is installed, the default machine name is ubuntu. To make it easy to distinguish the servers in the cluster, however, each machine should be given a different name. The machine name is determined by the /etc/hostname file.
1. Open the /etc/hostname file;
$ sudo gedit /etc/hostname
2. Change "ubuntu" in the /etc/hostname file to the corresponding machine name; for the master node, for example, use "namenode". The change takes effect after a restart.
3. Modify the/etc/hosts file
$ sudo gedit /etc/hosts
As shown below, give each machine a fixed IP address and map every host name to its IP:
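A minimal sketch of the /etc/hosts entries, assuming the three machines sit on a 192.168.1.x subnet (substitute your own fixed IP addresses):
127.0.0.1    localhost
192.168.1.100    namenode
192.168.1.101    datanode1
192.168.1.102    datanode2
This file should be identical on all three machines.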
- 5
Install the ssh service
SSH here does not refer to the three Java frameworks (Spring, Struts, and Hibernate) but to the remote login protocol.
Generally, only the ssh-agent is installed on Ubuntu by default. You can check with the following command:
$ sudo ps -ef | grep ssh
If only the ssh-agent is present, you need to install openssh-server:
$ sudo apt-get install ssh openssh-server
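After the installation, repeating the check above should additionally show an sshd process; if it is missing, the service can usually be started by hand:
$ sudo service ssh start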
- 6
Create an ssh password-less login to the local machine
First, switch to the hadoop user and execute the following command:
$ sudo su - hadoop
SSH keys can be generated with either rsa or dsa; by default, rsa is used.
1. Create an ssh-key, here using the rsa method;
$ ssh-keygen -t rsa -P ""
(The -P "" sets an empty passphrase, which is what allows logging in without a password.)
Note that ssh-keygen is a single command joined by a hyphen and must not be split. A reader following these steps typed it with a space, as in ssh -keygen -t rsa -P "", and the terminal reported bad escape character "ygen": the space separates ssh-keygen, so ssh parses "-keygen" as its own options (-k, then -e with the leftover "ygen" as its escape-character argument).
(Note: after you press Enter, two files are generated under ~/.ssh/: id_rsa and id_rsa.pub)
2. Create the authorization file authorized_keys
Go to the ~/.ssh/ directory; it does not yet contain the authorized_keys file, which you can create in either of the following two ways:
(1) Append id_rsa.pub to the authorized_keys authorization file;
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
(2) Copy id_rsa.pub to authorized_keys.
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
3. Log on to localhost;
$ ssh localhost
(On the first connection, ssh asks you to confirm the host key fingerprint; answer yes. If the key setup succeeded, no password is requested.)
4. Execute the exit command (exit or Ctrl+D);
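If ssh still prompts for a password, the usual cause is overly loose file permissions, which make sshd ignore the key. The conventional fix is to tighten them:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys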
- 7
Install hadoop
1. With hadoop-0.20.2.tar.gz on the desktop, copy it to the installation directory /usr/local/;
$ sudo cp hadoop-0.20.2.tar.gz /usr/local/
2. Decompress hadoop-0.20.2.tar.gz;
$ cd /usr/local
$ sudo tar -zxvf hadoop-0.20.2.tar.gz
3. Rename the decompressed folder to hadoop;
$ sudo mv hadoop-0.20.2 hadoop
4. Set the owner of the hadoop folder to the hadoop user;
$ sudo chown -R hadoop:hadoop hadoop
5. Open the hadoop/conf/hadoop-env.sh file;
$ sudo gedit hadoop/conf/hadoop-env.sh
6. Configure conf/hadoop-env.sh (find the line # export JAVA_HOME=..., remove the #, and set it to the local JDK path);
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
7. Open the conf/core-site.xml file;
$ sudo gedit hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
8. Open the conf/mapred-site.xml file;
$ sudo gedit hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode:9001</value>
  </property>
</configuration>
9. Open the conf/hdfs-site.xml file;
$ sudo gedit hadoop/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/datalog1,/usr/local/hadoop/datalog2</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/data1,/usr/local/hadoop/data2</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
Note: the datalog1, datalog2, data1, and data2 folders should be created in advance, for example as follows.
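A minimal sketch, using the paths from the hdfs-site.xml above and making sure the hadoop user owns them:
$ sudo mkdir -p /usr/local/hadoop/datalog1 /usr/local/hadoop/datalog2 /usr/local/hadoop/data1 /usr/local/hadoop/data2
$ sudo chown -R hadoop:hadoop /usr/local/hadoop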
10. Open the conf/masters file and add the host name of the secondarynamenode.
11. Open the conf/slaves file and add the slave host names, one per line.
The author's settings, given the machine names from section 1, are as follows (assuming the secondary namenode runs on the master machine itself):
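conf/masters:
namenode
conf/slaves:
datanode1
datanode2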
- 8
Run hadoop
1. Go to the hadoop directory and format the hdfs file system. This operation is required when you first run hadoop;
$ cd /usr/local/hadoop/
$ bin/hadoop namenode -format
2. Start the cluster with bin/start-all.sh
Go to the bin directory and run $ ./start-all.sh; to shut down, run ./stop-all.sh in the same directory.
3. Check whether hadoop started successfully;
$ jps
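If the cluster started correctly, jps on the master node should list roughly the following (process IDs are illustrative):
1234 NameNode
1256 SecondaryNameNode
1278 JobTracker
1300 Jps
On datanode1 and datanode2, DataNode and TaskTracker should appear instead.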
4. Web view
http://localhost:50030/ - the JobTracker's web interface, which makes it convenient to view job information.
Similarly:
http://localhost:50070/ - web UI of the NameNode daemon
http://localhost:50030/ - web UI of the JobTracker daemon
http://localhost:50060/ - web UI of the TaskTracker daemon
END