First, build the Java environment
(1) Download the JDK and unpack it (the OS here is Ubuntu 16.04; the JDK archive is jdk-8u111-linux-x64.tar.gz)
Create a new /usr/java directory, switch to the directory containing jdk-8u111-linux-x64.tar.gz, and unpack the file into /usr/java.
tar -zxvf jdk-8u111-linux-x64.tar.gz -C /usr/java/
(2) Setting environment variables
Modify ~/.bashrc, writing the following at the end of the file.
sudo vim ~/.bashrc
export JAVA_HOME=/usr/java/jdk1.8.0_111
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
Run the following command to make the environment variables take effect.
source ~/.bashrc
Open /etc/profile and insert the Java environment configuration section.
sudo vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_111
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_LIB=$JAVA_HOME/lib
export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
Open /etc/environment and append the JDK directory and its lib directory as shown below.
sudo vim /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/java/jdk1.8.0_111/lib:/usr/java/jdk1.8.0_111"
Make the configuration take effect:
source /etc/environment
Verify that the Java environment is configured successfully:
java -version
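The three files above all derive their entries from JAVA_HOME. As a minimal sketch (assuming the JDK path used throughout this guide), you can evaluate the expansions in one place and confirm they are what you expect before editing the real files:

```shell
#!/bin/sh
# Evaluate the variable expansions used in ~/.bashrc and /etc/profile above.
# JAVA_HOME is the install path used throughout this guide.
JAVA_HOME=/usr/java/jdk1.8.0_111
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
JAVA_PATH_ENTRY=$JAVA_HOME/bin
echo "CLASSPATH=$CLASSPATH"
echo "PATH entry=$JAVA_PATH_ENTRY"
```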
Second, install openssh-server and set up passwordless login
(1) Install openssh-server
sudo apt-get install openssh-server
(2) Start SSH
sudo /etc/init.d/ssh start
(3) Check whether the SSH service has started; if ssh-related processes are listed, it started successfully.
ps -ef | grep ssh
(4) Set up passwordless login
Run the following command and press Enter at each prompt until the RSA key pair is generated.
ssh-keygen -t rsa
Append the public key to authorized_keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test passwordless login to localhost:
ssh localhost
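The append to authorized_keys is unconditional, so re-running the setup duplicates the key entry. A sketch of an idempotent variant (using temporary files and a dummy key, since the real ~/.ssh paths are machine-specific):

```shell
#!/bin/sh
# Append a public key to authorized_keys only if it is not already there.
# Temp files and a dummy key stand in for ~/.ssh/id_rsa.pub and
# ~/.ssh/authorized_keys so the sketch can run anywhere.
pub=$(mktemp)
auth=$(mktemp)
echo "ssh-rsa AAAAB3...example user@host" > "$pub"

append_key() {
    # $1 = public key file, $2 = authorized_keys file
    grep -qxF "$(cat "$1")" "$2" 2>/dev/null || cat "$1" >> "$2"
}

append_key "$pub" "$auth"
append_key "$pub" "$auth"   # second call is a no-op
keys=$(wc -l < "$auth")
echo "$keys"
```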
Turn off the firewall:
sudo ufw disable
Third, install Hadoop in standalone mode and pseudo-distributed mode.
(1) Download hadoop-2.7.3.tar.gz and unpack it to /usr/local (standalone-mode setup).
sudo tar -zxvf hadoop-2.7.3.tar.gz -C /usr/local
Switch to /usr/local, rename hadoop-2.7.3 to hadoop, and set access permissions on /usr/local/hadoop.
cd /usr/local
sudo mv hadoop-2.7.3 hadoop
sudo chmod 777 /usr/local/hadoop
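The unpack-rename-chmod sequence can be rehearsed end to end in a throwaway directory; this sketch builds a dummy hadoop-2.7.3 archive first, so only the paths differ from the real commands:

```shell
#!/bin/sh
# Rehearse step (1) in a temp directory: build a dummy archive, then
# extract, rename, and chmod exactly as the real commands do.
set -e
work=$(mktemp -d)
mkdir "$work/hadoop-2.7.3"
echo demo > "$work/hadoop-2.7.3/README.txt"
tar -czf "$work/hadoop-2.7.3.tar.gz" -C "$work" hadoop-2.7.3

dest="$work/usr_local"                            # stands in for /usr/local
mkdir "$dest"
tar -zxf "$work/hadoop-2.7.3.tar.gz" -C "$dest"   # sudo tar -zxvf ... -C /usr/local
mv "$dest/hadoop-2.7.3" "$dest/hadoop"            # sudo mv hadoop-2.7.3 hadoop
chmod 777 "$dest/hadoop"                          # sudo chmod 777 /usr/local/hadoop
cat "$dest/hadoop/README.txt"
```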
(2) Configure the .bashrc file
sudo vim ~/.bashrc
(If vim is not installed, install it with sudo apt install vim.)
Append the following to the end of the file, then save.
#HADOOP VARIABLES START
export JAVA_HOME=/usr/java/jdk1.8.0_111
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
Execute the following command to make the added environment variables take effect:
source ~/.bashrc
(3) Hadoop configuration (pseudo-distributed setup)
Configure hadoop-env.sh
sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_111
export HADOOP=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin
Configure yarn-env.sh
sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
JAVA_HOME=/usr/java/jdk1.8.0_111
Configure core-site.xml: create the /home/lyh/hadoop_tmp directory under the home directory, then add the following in core-site.xml.
sudo mkdir /home/lyh/hadoop_tmp
sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
    <!-- Specify the communication address of the HDFS master (NameNode) -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <!-- Specify the storage directory for files generated at Hadoop runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/lyh/hadoop_tmp</value>
    </property>
</configuration>
Configure hdfs-site.xml
sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
    <!-- Specify the number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
Configure yarn-site.xml
sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>127.0.0.1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>127.0.0.1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>127.0.0.1:8031</value>
    </property>
</configuration>
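A typo in a property <name> fails silently at startup, so it is worth listing the names back out after editing. A sketch of doing that with grep and sed alone, run here against a sample file mirroring hdfs-site.xml above (the real files live under /usr/local/hadoop/etc/hadoop/):

```shell
#!/bin/sh
# List the property names defined in a *-site.xml file using grep/sed only.
f=$(mktemp)
cat > "$f" <<'EOF'
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
names=$(grep -o '<name>[^<]*</name>' "$f" | sed 's/<[^>]*>//g')
echo "$names"
```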
(4) Restart the system.
Fourth, test whether Hadoop is installed and configured successfully.
(1) Verify that the Hadoop standalone-mode installation is complete
hadoop version
If the Hadoop version number is displayed, standalone mode is configured.
(2) Start HDFS to use pseudo-distributed mode.
Format the NameNode:
hdfs namenode -format
If "... has been successfully formatted" appears, the format succeeded. Note: each format generates a new NameNode cluster ID; after formatting more than once, if the DataNode's stored ID no longer matches, uploading files to input for WordCount will fail.
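The ID mismatch the note warns about can be seen by comparing the clusterID fields of the two VERSION files under hadoop.tmp.dir (dfs/name/current and dfs/data/current). A sketch of the comparison, with the directory layout simulated so it runs without a cluster:

```shell
#!/bin/sh
# Compare NameNode and DataNode clusterID values; a mismatch means the
# DataNode data is stale from before a re-format. Layout is simulated here.
tmp=$(mktemp -d)
mkdir -p "$tmp/dfs/name/current" "$tmp/dfs/data/current"
echo "clusterID=CID-aaaa" > "$tmp/dfs/name/current/VERSION"
echo "clusterID=CID-bbbb" > "$tmp/dfs/data/current/VERSION"   # stale ID
n=$(grep clusterID "$tmp/dfs/name/current/VERSION")
d=$(grep clusterID "$tmp/dfs/data/current/VERSION")
if [ "$n" = "$d" ]; then
    result="IDs match"
else
    result="mismatch: clear hadoop.tmp.dir and re-format"
fi
echo "$result"
```

On a real mismatch, deleting the contents of hadoop.tmp.dir and re-formatting brings the two IDs back in sync (at the cost of all HDFS data).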
Start HDFS:
start-all.sh
Show the running processes:
jps
Open http://localhost:50070/ in the browser; the HDFS (NameNode) web page should appear.
Open http://localhost:8088/; the YARN (ResourceManager) web page should appear.
If both pages load, the pseudo-distributed installation and configuration succeeded.
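After start-all.sh, jps should list five Hadoop daemons besides Jps itself. A sketch of checking for all five (the jps output is simulated here so the check runs without a cluster; on a real machine use jps_output=$(jps) instead):

```shell
#!/bin/sh
# Check that the five pseudo-distributed daemons appear in jps output.
# Simulated output; on a real machine replace with: jps_output=$(jps)
jps_output="2101 NameNode
2245 DataNode
2430 SecondaryNameNode
2588 ResourceManager
2705 NodeManager
2890 Jps"
ok=1
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    echo "$jps_output" | grep -q "$d" || { echo "missing: $d"; ok=0; }
done
[ "$ok" -eq 1 ] && echo "all daemons running"
```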
Stop HDFS:
stop-all.sh
Fifth, run WordCount.
(1) Start HDFS.
start-all.sh
(2) View the directories and files under the HDFS root
hadoop fs -ls /
If this is the first run of HDFS, nothing will be listed.
(3) Create a directory input in HDFS and upload /usr/local/hadoop/README.txt to it.
hdfs dfs -mkdir /input
hadoop fs -put /usr/local/hadoop/README.txt /input
(4) Execute the following command to run WordCount, writing the results to output.
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output
If the job log ends without errors, WordCount ran successfully. Note: substitute the actual path to your hadoop-mapreduce-examples-2.7.3.jar file.
(5) After successful execution, the output directory contains two files: _SUCCESS, an empty marker of success, and part-r-00000, which holds the results. View the results with the following command.
hadoop fs -cat /output/part-r-00000
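What WordCount computes can be reproduced locally with standard shell tools, which also shows the word<TAB>count format found in part-r-00000 (the two-line input here is made up for illustration):

```shell
#!/bin/sh
# Local equivalent of WordCount: split on spaces, count, print word<TAB>count.
f=$(mktemp)
printf 'hello hadoop\nhello world\n' > "$f"
result=$(tr -s ' ' '\n' < "$f" | sort | uniq -c | awk '{printf "%s\t%s\n", $2, $1}')
echo "$result"
```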
Appendix: common HDFS commands
hadoop fs -mkdir /tmp/input                       create a new directory on HDFS
hadoop fs -put input1.txt /tmp/input              upload the local file input1.txt to the HDFS directory /tmp/input
hadoop fs -get /tmp/input/input1.txt input1.txt   download an HDFS file to the local machine
hadoop fs -ls /tmp/output                         list an HDFS directory
hadoop fs -cat /tmp/output/output1.txt            view a file on HDFS
hadoop fs -rmr /home/less/hadoop/tmp/output       delete a directory on HDFS
hadoop dfsadmin -report                           view HDFS status, e.g. which DataNodes exist and the state of each
hadoop dfsadmin -safemode leave                   leave safe mode
hadoop dfsadmin -safemode enter                   enter safe mode