First, the system environment
OS: Ubuntu 16.04 LTS 64-bit
Java: 1.8.0_161
Hadoop: 2.6.4
Second, installation steps
1. Install and configure SSH
1.1 Installing SSH
Enter the command $ sudo apt-get install openssh-server . After the installation completes, use the command $ ssh localhost to log in to this machine. The first login will show a confirmation prompt; enter yes, then enter the current user's login password.
1.2 Configuring passwordless SSH login
First use the command $ exit to leave the SSH session from the previous step, then use ssh-keygen to generate a key, and finally add the key to the authorized keys. The commands are as follows:
$ exit                                     # exit the ssh localhost session
$ cd ~/.ssh/                               # if this directory does not exist, run ssh localhost first
$ ssh-keygen -t rsa                        # press Enter at every prompt
$ cat ./id_rsa.pub >> ./authorized_keys    # authorize the key
After the configuration is complete, use ssh localhost to log in without a password.
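If key-based login still asks for a password, the usual cause is over-permissive permissions on ~/.ssh, which make sshd fall back to password authentication. A minimal sketch (the helper name is made up for illustration, not part of the original steps) that tightens them:

```shell
# Hypothetical helper: set the permissions sshd requires before it will
# accept key-based login.
fix_ssh_perms() {
  dir="$1"                              # normally ~/.ssh
  chmod 700 "$dir"                      # directory: owner-only access
  chmod 600 "$dir/authorized_keys"      # key file: owner read/write only
}
```

Usage would be fix_ssh_perms ~/.ssh after appending the key.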
2. Install and configure Java
2.1 Installing Java
Download jdk-8u161-linux-x64.tar.gz from the official website, then unpack and install it to the /usr/local/ directory with the following commands:
$ cd ~/Downloads
$ sudo tar -xzf jdk-8u161-linux-x64.tar.gz -C /usr/local
$ sudo mv /usr/local/jdk1.8.0_161 /usr/local/java
2.2 Configuring Environment variables
Use the command $ vim ~/.bashrc to edit the file ~/.bashrc, and add the following at the beginning of the file:
export JAVA_HOME=/usr/local/java
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
Finally, use the command $ source ~/.bashrc to make the environment variables take effect, and use java -version to verify that the configuration is correct.
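Before moving on to the Hadoop steps it can be worth confirming that the JAVA_HOME path really contains a JDK. A hedged sketch (the function name is hypothetical, chosen for illustration):

```shell
# Hypothetical check: verify that a JAVA_HOME candidate contains an
# executable bin/java before exporting it in ~/.bashrc.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1"
  else
    echo "no java under $1"
  fi
}
```

For this tutorial's layout the call would be check_java_home /usr/local/java.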
3. Installation and configuration of Hadoop
3.1 Hadoop Download and installation
Download hadoop-2.6.4.tar.gz from the Hadoop website, then install it to the /usr/local/ directory with the following commands:
$ sudo tar -xzf hadoop-2.6.4.tar.gz -C /usr/local
$ cd /usr/local
$ sudo mv hadoop-2.6.4/ hadoop
$ sudo chown -R hadoop ./hadoop    # the first "hadoop" is the user name; change it to your username
Add the following to ~/.bashrc:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
3.2 Hadoop stand-alone configuration
A freshly installed Hadoop defaults to stand-alone (local) mode and can run without additional configuration. Try it out with the grep example that ships with Hadoop:
$ cd /usr/local/hadoop
$ mkdir ./input
$ cp ./etc/hadoop/*.xml ./input    # use the configuration files as the input
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
$ cat ./output/*                   # view the run results
The result is:
1       dfsadmin
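Note that Hadoop aborts when the output directory already exists, so the example cannot simply be re-run. A minimal hedged helper (the name is illustrative) that clears a previous run's output first:

```shell
# Remove a previous run's output directory, if present, so the grep example
# can be re-executed; Hadoop refuses to write to an existing output path.
clean_output() {
  [ -d "$1" ] && rm -r "$1"
  return 0
}
```

For the example above the call would be clean_output ./output before the second run.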
3.3 Hadoop pseudo-distributed configuration
The Hadoop configuration files are stored under /usr/local/hadoop/etc/hadoop. To run in pseudo-distributed mode, modify the files core-site.xml and hdfs-site.xml in this directory.
Modify core-site.xml, changing <configuration></configuration> to:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Modify hdfs-site.xml, changing <configuration></configuration> to:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
After the configuration is complete, run $ ./bin/hdfs namenode -format under /usr/local/hadoop to format the NameNode. On success the output contains "successfully formatted" and "Exiting with status 0".
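The success check can be scripted. The sketch below is a hypothetical helper (not part of the original steps) that scans the captured format output for the two markers mentioned above:

```shell
# Hypothetical helper: given the captured output of `hdfs namenode -format`,
# report whether both success markers are present.
format_succeeded() {
  if echo "$1" | grep -q 'successfully formatted' && \
     echo "$1" | grep -q 'Exiting with status 0'; then
    echo "format ok"
  else
    echo "format failed"
  fi
}
```

A possible invocation: format_succeeded "$(./bin/hdfs namenode -format 2>&1)".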
Then use $ ./sbin/start-dfs.sh to start the NameNode and DataNode daemons, and use the command jps to check whether they started successfully.
Both NameNode and DataNode must appear in the jps output for the startup to count as successful.
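This check can also be scripted; the sketch below (a hypothetical helper, not from the original tutorial) inspects jps-style output for both daemons:

```shell
# Hypothetical helper: given the text produced by `jps`, report whether both
# HDFS daemons (NameNode and DataNode) are present.
check_hdfs_daemons() {
  if echo "$1" | grep -q 'NameNode' && echo "$1" | grep -q 'DataNode'; then
    echo "HDFS daemons running"
  else
    echo "HDFS daemons missing"
  fi
}
```

On a live system it would be called as check_hdfs_daemons "$(jps)".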
4. Configuring YARN (optional)
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
Then modify etc/hadoop/mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Modify etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start the resource manager:
$ ./sbin/start-yarn.sh
$ ./sbin/mr-jobhistory-daemon.sh start historyserver    # for viewing job history
After a successful startup, the cluster resource manager can be accessed at http://localhost:8088/cluster.
Stop the resource manager:
$ ./sbin/stop-yarn.sh
$ ./sbin/mr-jobhistory-daemon.sh stop historyserver
Third, references
1. http://dblab.xmu.edu.cn/blog/install-hadoop/
2. https://sjq597.github.io/2016/07/19/Ubuntu-16-04-Hadoop%E6%9C%AC%E5%9C%B0%E5%AE%89%E8%A3%85%E9%85%8D%E7%BD%AE/
Local installation and configuration of Hadoop under Ubuntu 16.04