Installing and Configuring Hadoop on Linux


I. Introduction

I referred to many tutorials on the web and eventually installed and configured Hadoop on Ubuntu 14.04 successfully. The detailed installation steps are described below. My environment: two Ubuntu 14.04 64-bit desktops, with Hadoop version 2.7.1.

II. Preparation

2.1 Create a user

First create a user and grant it root permissions; the following method has been personally verified:

sudo adduser hadoop
sudo vim /etc/sudoers
# Modify the contents as follows:
root   ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

Create a home directory for the hadoop user and add the user to the sudo group with the following commands:

sudo mkdir /home/hadoop
sudo chown hadoop /home/hadoop
# Add to the sudo user group
sudo adduser hadoop sudo

Finally, log out of the current user and log back in as the newly created hadoop user.
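Alternatively, you can switch to the new user in the current session without logging out (assuming the hadoop user was created as above):

su - hadoop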

2.2 Installing the SSH service

Ubuntu does not install an SSH server by default (only the SSH client), so run the following command first to install openssh-server. The installation is quick and painless ~

sudo apt-get install openssh-server
2.3 Configuring passwordless SSH login

Straight to the commands: after executing the following, you can log in without a password (run ssh localhost to verify):

cd ~/.ssh
ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
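If ssh localhost still prompts for a password after this, overly permissive key file permissions are a common culprit; tightening them (a general SSH fix, not from the original article) and retrying usually helps:

chmod 600 ~/.ssh/authorized_keys
ssh localhost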
III. Installation process

3.1 Download the Hadoop installation package

There are two ways to download:

1. Download directly from the mirror site:

http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz

2. Use the wget command to download:

wget http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz
3.2 Configuring Hadoop

1. Unzip the downloaded Hadoop package and modify the configuration files. My unzip directory is /home/hadoop/hadoop-2.7.1; that is, go to the /home/hadoop/ folder and run the following decompression command:

tar -zxvf hadoop-2.7.1.tar.gz

2. Modify the configuration files in the hadoop-2.7.1/etc/hadoop/ directory: hadoop-env.sh, core-site.xml, mapred-site.xml.template, and hdfs-site.xml.

(1) core-site.xml configuration: the path for hadoop.tmp.dir can be set according to your own preference.

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

(2) mapred-site.xml.template configuration:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
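One caveat worth noting: Hadoop 2.x only reads mapred-site.xml, not the .template file, so if these settings do not take effect, copy the template first and edit the copy (run from the hadoop-2.7.1 directory):

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml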

(3) hdfs-site.xml configuration: the paths for dfs.namenode.name.dir and dfs.datanode.data.dir can be set freely, preferably under the hadoop.tmp.dir directory.

Note: if Hadoop reports that it cannot find the JDK when you run it, you can set the JDK path directly in hadoop-env.sh, as follows:

export JAVA_HOME=/opt/java_file/jdk1.7.0_79

where the value is the path where Java is installed.

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/data</value>
    </property>
</configuration>

Run Hadoop after the configuration is complete.

IV. Running Hadoop

4.1 Initialize the HDFS system

Execute the following command in the hadoop-2.7.1 directory:

bin/hdfs namenode -format

If the output includes a line reporting that the storage directory has been successfully formatted, the initialization succeeded.

4.2 Start the NameNode and DataNode daemons

Execute the following command in the hadoop-2.7.1 directory:

sbin/start-dfs.sh

On success, the script logs that it is starting the namenode, datanode, and secondary namenode.

4.3 Use the jps command to view process information

If both DataNode and NameNode appear in the output, the daemons have started.
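For reference, a successful run typically looks roughly like this (the process IDs will differ on your machine; start-dfs.sh also starts a SecondaryNameNode):

jps
3567 NameNode
3691 DataNode
3845 SecondaryNameNode
3960 Jps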

4.4 Viewing the Web interface

Enter http://localhost:50070 in the browser to view relevant information about the cluster.
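On a machine without a desktop browser, a quick sanity check from the shell works too (assuming curl is installed):

curl -s http://localhost:50070 | head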

At this point, the Hadoop environment has been built.

V. Run the WordCount demo

1. Create a new file locally and fill in some content. For example, I created a new haha.txt file in the /home/hadoop directory with the content "Hello world!".
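If you prefer to create the file from the shell, the same example content can be written with:

echo "Hello world!" > /home/hadoop/haha.txt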

2. Then create a new test directory in the distributed file system (HDFS) to which we will upload our test file haha.txt. Run the commands in the hadoop-2.7.1 directory:

# Create a test directory in the HDFS root directory
bin/hdfs dfs -mkdir /test
# View the directory structure under the HDFS root directory
bin/hdfs dfs -ls /

The listing should show the newly created /test directory.

3. Upload the local haha.txt file to the test directory:

# Upload
bin/hdfs dfs -put /home/hadoop/haha.txt /test/
# View
bin/hdfs dfs -ls /test/

The listing should now include /test/haha.txt.

4. Run the WordCount demo:

# Run the example jar, saving the results in the /test/out directory
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /test/haha.txt /test/out
# View the files in the /test/out directory
bin/hdfs dfs -ls /test/out

If the job runs successfully, the output directory contains a _SUCCESS marker file, and the results are saved in part-r-00000.

5. View the results of the run:

# View the running results in part-r-00000
bin/hadoop fs -cat /test/out/part-r-00000

The results are as follows:
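A sketch of the expected output, given that haha.txt contained only "Hello world!" (WordCount splits on whitespace, so the punctuation stays attached to the second word):

Hello   1
world!  1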

At this point, the WordCount demo run is finished.
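When you are finished experimenting, the daemons started in section 4.2 can be stopped from the hadoop-2.7.1 directory:

sbin/stop-dfs.sh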

VI. Summary

I ran into a lot of problems during the configuration process, but they were all resolved in the end and I learned a great deal. I am sharing this configuration experience here to make things easier for friends who want to set up a Hadoop environment ~

Reference:

http://www.tuicool.com/articles/bmeUneM

