Installation and configuration of Hadoop 2.7.3 under Ubuntu 16.04


First, set up the Java environment

(1) Download the JDK and unzip it (the operating system used here is Ubuntu 16.04; the JDK version is jdk-8u111-linux-x64.tar.gz).

Create a new /usr/java directory, switch to the directory where jdk-8u111-linux-x64.tar.gz is located, and unzip the file into the /usr/java directory.

sudo mkdir /usr/java
sudo tar -zxvf jdk-8u111-linux-x64.tar.gz -C /usr/java/
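To confirm the JDK was extracted where expected, list the directory; a jdk1.8.0_111 folder should appear:

ls /usr/java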
(2) Setting environment variables

Edit ~/.bashrc and append the following lines at the end.

sudo vim ~/.bashrc
export JAVA_HOME=/usr/java/jdk1.8.0_111
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

Run the following command to make the environment variables take effect.

source ~/.bashrc
Open the /etc/profile file and insert the Java environment configuration section.
sudo vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_111
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_LIB=$JAVA_HOME/lib
export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar
export PATH=$JAVA_HOME/bin:$PATH


Open the /etc/environment file and append the JDK directory and its lib directory to the PATH entry, as shown below.
sudo vim /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/java/jdk1.8.0_111/lib:/usr/java/jdk1.8.0_111"

Make the configuration take effect:

source /etc/environment

Verify that the Java environment is configured successfully:
java -version
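If the configuration took effect, the output should look roughly like the following (exact build numbers may differ):

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)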

Second, install openssh-server and set up passwordless login

(1) Install openssh-server

sudo apt-get install openssh-server
(2) Start the SSH service
sudo /etc/init.d/ssh start
(3) Check whether the SSH service has started; if SSH-related processes appear in the output, the service is running.
ps -ef | grep ssh


(4) Set up passwordless login

Run the following command and press Enter at each prompt until the RSA key pair has been generated.

ssh-keygen -t rsa
Append the public key to authorized_keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Test passwordless login to localhost:

ssh localhost
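If ssh localhost still prompts for a password, key-file permissions are the usual culprit; a common fix (assuming the default ~/.ssh layout) is to tighten them:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys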
Disable the firewall:
sudo ufw disable

Third, install Hadoop in standalone mode and pseudo-distributed mode.

(1) Download hadoop-2.7.3.tar.gz and unzip it to /usr/local (standalone mode setup).

sudo tar -zxvf hadoop-2.7.3.tar.gz -C /usr/local
Switch to /usr/local, rename hadoop-2.7.3 to hadoop, and set access permissions for /usr/local/hadoop.
cd /usr/local
sudo mv hadoop-2.7.3 hadoop
sudo chmod 777 /usr/local/hadoop
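chmod 777 is the bluntest option; a tighter alternative (a sketch, assuming you will run Hadoop as the current login user) is to take ownership of the directory instead:

sudo chown -R $USER:$USER /usr/local/hadoop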
(2) Configure the .bashrc file

sudo vim ~/.bashrc

(If vim is not installed, install it with sudo apt install vim.)

Append the following to the end of the file, and then save.

#HADOOP VARIABLES START
export JAVA_HOME=/usr/java/jdk1.8.0_111
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END

Execute the following command to make the added environment variables take effect:

source ~/.bashrc

(3) Hadoop configuration (pseudo-distributed mode setup)

Configure hadoop-env.sh

sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
# The Java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_111
export HADOOP=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin


Configure yarn-env.sh
sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/java/jdk1.8.0_111

Configure core-site.xml: create the /home/lyh/hadoop_tmp directory under the home directory, then add the following to core-site.xml.

sudo mkdir /home/lyh/hadoop_tmp
sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
        <!-- Specify the communication address of the HDFS master (the NameNode) -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
        </property>
        <!-- Specify the storage directory for files generated at Hadoop runtime -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/lyh/hadoop_tmp</value>
        </property>
</configuration>


Configure hdfs-site.xml
sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
        <!-- Specify the number of HDFS replicas -->
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>
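Optionally, you can also pin the NameNode and DataNode storage directories explicitly inside the same <configuration> element; by default both land under hadoop.tmp.dir. A sketch, assuming the /home/lyh/hadoop_tmp base directory from core-site.xml above:

        <!-- Where the NameNode stores its metadata (assumed path) -->
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/home/lyh/hadoop_tmp/dfs/name</value>
        </property>
        <!-- Where the DataNode stores its blocks (assumed path) -->
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/home/lyh/hadoop_tmp/dfs/data</value>
        </property>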


Configure yarn-site.xml
sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

<configuration>
        <!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>127.0.0.1:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>127.0.0.1:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>127.0.0.1:8031</value>
        </property>
</configuration>
 

(4) Restart the system.

Fourth, test whether Hadoop was installed and configured successfully.

(1) Verify that the Hadoop standalone installation is complete

hadoop version
If the Hadoop version number is displayed, standalone mode is configured correctly.
(2) Start HDFS to use pseudo-distributed mode.

Format the NameNode:

hdfs namenode -format
If output such as "... has been successfully formatted" appears, formatting succeeded. Note: each format generates a new NameNode cluster ID; if you format repeatedly and the DataNode still holds the old ID, uploading files to input for the WordCount run will fail.
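If you hit that ID mismatch, a common recovery (assuming hadoop.tmp.dir points at /home/lyh/hadoop_tmp as configured above) is to stop everything, clear the directory, and reformat. Note that this erases all data stored in HDFS:

stop-all.sh
rm -rf /home/lyh/hadoop_tmp/*
hdfs namenode -format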

Start HDFS

start-all.sh
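Note: in Hadoop 2.x, start-all.sh still works but prints a deprecation warning; the equivalent is to start HDFS and YARN separately:

start-dfs.sh
start-yarn.sh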


Show the running Java processes

jps
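With the pseudo-distributed setup above, jps typically lists processes like the following (the PIDs will differ); if one is missing, check the logs under /usr/local/hadoop/logs:

2481 NameNode
2615 DataNode
2798 SecondaryNameNode
2951 ResourceManager
3072 NodeManager
3190 Jps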


Enter http://localhost:50070/ in the browser; the HDFS (NameNode) status page should appear.

Enter http://localhost:8088/; the YARN (ResourceManager) cluster page should appear.

If both pages load, the pseudo-distributed installation and configuration succeeded.

Stop HDFS

stop-all.sh

Fifth, run WordCount

(1) Start HDFs.

start-all.sh
(2) View the directories and files under the HDFS root
hadoop dfs -ls /
If you are running HDFS for the first time, nothing will be listed.

(3) Create a directory input in HDFS and upload /usr/local/hadoop/README.txt to it.

hdfs dfs -mkdir /input
hadoop fs -put /usr/local/hadoop/README.txt /input
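To confirm the upload, list the input directory; README.txt should appear:

hadoop fs -ls /input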



(4) Execute the following command to run WordCount and write the results to /output.

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output

If the job completes without errors, WordCount ran successfully. Note: replace the jar path above with the path of your own hadoop-mapreduce-examples-2.7.3.jar file.

(5) After successful execution, the /output directory will contain two files: _SUCCESS, an empty success marker, and part-r-00000, which holds the results. View the results with the following command.

hadoop fs -cat /output/part-r-00000
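The output is one tab-separated word/count pair per line, sorted by word; for README.txt it will look something like this (the exact words and counts depend on the file's contents, so treat these as illustrative):

Hadoop	2
software	3
the	8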


Attached: common HDFS commands

hadoop fs -mkdir /tmp/input                        Create a new directory on HDFS
hadoop fs -put input1.txt /tmp/input               Upload the local file input1.txt to the HDFS /tmp/input directory
hadoop fs -get /tmp/input/input1.txt input1.txt    Pull a file from HDFS to the local filesystem
hadoop fs -ls /tmp/output                          List an HDFS directory
hadoop fs -cat /tmp/output/output1.txt             View a file on HDFS
hadoop fs -rmr /home/less/hadoop/tmp/output        Delete a directory on HDFS
hadoop dfsadmin -report                            View HDFS status, e.g. which DataNodes exist and the state of each
hadoop dfsadmin -safemode leave                    Leave safe mode
hadoop dfsadmin -safemode enter                    Enter safe mode
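Note: in Hadoop 2.x the hadoop dfs and hadoop dfsadmin forms shown above still work but print deprecation warnings; the current equivalents use the hdfs command, for example:

hdfs dfs -ls /tmp/output
hdfs dfsadmin -report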

