Installing and Configuring Hadoop on Linux


I. Introduction

I referred to many tutorials on the web and eventually installed and configured Hadoop on Ubuntu 14.04 successfully. The detailed installation steps are described below. My environment: two Ubuntu 14.04 64-bit desktops, with Hadoop version 2.7.1.

II. Preparation

2.1 Create a user

First create a user and grant it root permissions; the following method has been personally verified:

sudo adduser hadoop
sudo vim /etc/sudoers
# Modify the contents as follows:
root   ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

Create a home directory for the hadoop user and add the user to the sudo group with the following commands:

sudo mkdir /home/hadoop
sudo chown hadoop /home/hadoop
# Add to the sudo user group
sudo adduser hadoop sudo

Finally, log out of the current user and log back in as the newly created hadoop user.
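Alternatively, you can switch to the new user in the current session without logging out (assuming the hadoop user was created as above):

su - hadoop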

2.2 Installing the SSH service

Ubuntu does not install an SSH server by default (only the SSH client), so run the following command first to install openssh-server. The installation is quick and painless ~

sudo apt-get install openssh-server
2.3 Configuring passwordless SSH login

Straight to the commands: after executing the following, you can log in without a password (run ssh localhost to verify):

cd ~/.ssh
ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
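If ssh localhost still prompts for a password after this, overly permissive key file permissions are a common culprit; tightening them (a general SSH fix, not from the original article) and retrying usually helps:

chmod 600 ~/.ssh/authorized_keys
ssh localhost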
III. Installation process

3.1 Download the Hadoop installation package

There are two ways to download:

1. Download directly from the mirror site:

http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz

2. Use the wget command to download:

wget http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz
3.2 Configuring Hadoop

1. Unzip the downloaded Hadoop package and modify the configuration files. My unzip directory is /home/hadoop/hadoop-2.7.1; that is, go to the /home/hadoop/ folder and run the following decompression command:

tar -zxvf hadoop-2.7.1.tar.gz

2. Modify the configuration files in the hadoop-2.7.1/etc/hadoop/ directory: hadoop-env.sh, core-site.xml, mapred-site.xml.template, and hdfs-site.xml.

(1) core-site.xml configuration: the path for hadoop.tmp.dir can be set according to your own preference.

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

(2) mapred-site.xml.template configuration:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
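One caveat worth noting: Hadoop 2.x only reads mapred-site.xml, not the .template file, so if these settings do not take effect, copy the template first and edit the copy (run from the hadoop-2.7.1 directory):

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml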

(3) hdfs-site.xml configuration: the paths for dfs.namenode.name.dir and dfs.datanode.data.dir can be set freely, preferably under the hadoop.tmp.dir directory.

Note: if Hadoop reports that it cannot find the JDK when you run it, you can set the JDK path directly in hadoop-env.sh, as follows:

export JAVA_HOME=/opt/java_file/jdk1.7.0_79

where the value is the path where Java is installed.

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/data</value>
    </property>
</configuration>

Run Hadoop after the configuration is complete.

IV. Running Hadoop

4.1 Initialize the HDFS system

Execute the following command in the hadoop-2.7.1 directory:

bin/hdfs namenode -format

If the output includes a line reporting that the storage directory has been successfully formatted, the initialization succeeded.

4.2 Start the NameNode and DataNode daemons

Execute the following command in the hadoop-2.7.1 directory:

sbin/start-dfs.sh

On success, the script logs that it is starting the namenode, datanode, and secondary namenode.

4.3 Use the jps command to view process information

If both DataNode and NameNode appear in the output, the daemons have started.
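For reference, a successful run typically looks roughly like this (the process IDs will differ on your machine; start-dfs.sh also starts a SecondaryNameNode):

jps
3567 NameNode
3691 DataNode
3845 SecondaryNameNode
3960 Jps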

4.4 Viewing the Web interface

Enter http://localhost:50070 in the browser to view relevant information about the cluster.
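On a machine without a desktop browser, a quick sanity check from the shell works too (assuming curl is installed):

curl -s http://localhost:50070 | head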

At this point, the Hadoop environment has been built.

V. Run the WordCount demo

1. Create a new file locally and fill in some content. For example, I created a new haha.txt file in the /home/hadoop directory with the content "Hello world!".
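If you prefer to create the file from the shell, the same example content can be written with:

echo "Hello world!" > /home/hadoop/haha.txt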

2. Then create a new test directory in the distributed file system (HDFS) to which we will upload our test file haha.txt. Run the commands in the hadoop-2.7.1 directory:

# Create a test directory in the HDFS root directory
bin/hdfs dfs -mkdir /test
# View the directory structure under the HDFS root directory
bin/hdfs dfs -ls /

The listing should show the newly created /test directory.

3. Upload the local haha.txt file to the test directory:

# Upload
bin/hdfs dfs -put /home/hadoop/haha.txt /test/
# View
bin/hdfs dfs -ls /test/

The listing should now include /test/haha.txt.

4. Run the WordCount demo:

# Run the example jar, saving the results in the /test/out directory
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /test/haha.txt /test/out
# View the files in the /test/out directory
bin/hdfs dfs -ls /test/out

If the job runs successfully, the output directory contains a _SUCCESS marker file, and the results are saved in part-r-00000.

5. View the results of the run:

# View the running results in part-r-00000
bin/hadoop fs -cat /test/out/part-r-00000

The results are as follows:
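A sketch of the expected output, given that haha.txt contained only "Hello world!" (WordCount splits on whitespace, so the punctuation stays attached to the second word):

Hello   1
world!  1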

At this point, the WordCount demo run is finished.
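When you are finished experimenting, the daemons started in section 4.2 can be stopped from the hadoop-2.7.1 directory:

sbin/stop-dfs.sh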

VI. Summary

I ran into a lot of problems during the configuration process, but they were all resolved in the end and I learned a great deal. I am sharing this configuration experience here to make things easier for friends who want to set up a Hadoop environment ~

Reference:

http://www.tuicool.com/articles/bmeUneM

