I. INTRODUCTION
After consulting many tutorials on the web, I eventually managed to install and configure Hadoop on Ubuntu 14.04. The detailed installation steps are described below. My environment: two 64-bit Ubuntu 14.04 desktops, with Hadoop version 2.7.1.
II. Preparation

2.1 Create a User
Create a user and grant it root (sudo) privileges. The following method has been verified first-hand.
sudo adduser hadoop
sudo vim /etc/sudoers
# modify the file so it contains:
root   ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
Create a home directory for the hadoop user and add the user to the sudo group with the following commands:
sudo mkdir /home/hadoop
sudo chown hadoop /home/hadoop
# add the hadoop user to the sudo group
sudo adduser hadoop sudo
Finally, log out of the current user and log in as the newly created hadoop user.
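As a quick sanity check before logging out, the snippet below confirms the account exists and lists its groups. This is a sketch using only standard tools; `check_user` is a hypothetical helper, not part of Hadoop or Ubuntu.

```shell
# Sketch: confirm an account exists and list its groups.
# 'check_user' is a hypothetical helper, not a standard command.
check_user() {
  if id "$1" >/dev/null 2>&1; then
    echo "user $1 exists: $(id -Gn "$1")"
  else
    echo "user $1 missing"
  fi
}
check_user hadoop   # the user created above
```

If the hadoop user was created correctly, its group list should include `sudo` after the commands above.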
2.2 Installing the SSH service
Ubuntu does not ship with an SSH server by default (only the SSH client), so run the following command to install openssh-server first. The installation process is quick and painless ~
sudo apt-get install openssh-server
2.3 Configuring SSH Login without password
Straight to the commands: after running the following, you can log in without a password (verify with ssh localhost):
cd ~/.ssh
ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
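A slightly more defensive version of the same setup is sketched below. It assumes the OpenSSH client is installed, skips key generation if a key already exists, and appends to authorized_keys instead of overwriting it:

```shell
# Sketch of passwordless-SSH setup; assumes openssh-client is installed.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
cd ~/.ssh
# generate a key only if one does not already exist (empty passphrase)
if [ ! -f id_rsa ]; then ssh-keygen -t rsa -N "" -f id_rsa -q; fi
cat id_rsa.pub >> authorized_keys   # append, so existing keys survive
chmod 600 authorized_keys           # sshd rejects overly permissive files
```

The chmod steps matter: sshd silently refuses keys in group- or world-writable files.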
III. Installation Process

3.1 Download the Hadoop Installation Package
There are two ways to download:
1. Download directly from a mirror in the browser:
http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz
2. Use the wget command to download:
wget http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz
3.2 Configuring Hadoop
1. Unpack the downloaded Hadoop archive. My extraction directory is /home/hadoop/hadoop-2.7.1; that is, go to the /home/hadoop/ folder and run the following command:
tar -zxvf hadoop-2.7.1.tar.gz
2. Modify the configuration files in the hadoop-2.7.1/etc/hadoop/ directory: hadoop-env.sh, core-site.xml, mapred-site.xml.template, and hdfs-site.xml.
(1). core-site.xml configuration: the hadoop.tmp.dir path can be set according to your own preference.
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
(2). mapred-site.xml.template configuration:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
(3). hdfs-site.xml configuration: the paths for dfs.namenode.name.dir and dfs.datanode.data.dir can be set freely, preferably under the hadoop.tmp.dir directory.
Note: if Hadoop complains that it cannot find the JDK when you run it, you can put the JDK's path directly in hadoop-env.sh, like this:

export JAVA_HOME="/opt/java_file/jdk1.7.0_79"   # the path where the JDK was installed
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
Run Hadoop after the configuration is complete.
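Optionally, the storage directories named in hdfs-site.xml can be created ahead of time (Hadoop will normally create them itself during format and startup). A sketch, where HADOOP_TMP is a stand-in for the hadoop.tmp.dir value you actually configured:

```shell
# Optional sketch: pre-create the storage directories from hdfs-site.xml.
# HADOOP_TMP is a stand-in for your configured hadoop.tmp.dir path.
HADOOP_TMP="${HADOOP_TMP:-$HOME/hadoop/tmp}"
mkdir -p "$HADOOP_TMP/dfs/name" "$HADOOP_TMP/dfs/data"
ls "$HADOOP_TMP/dfs"
```

Doing this also surfaces permission problems early: the hadoop user must be able to write these paths.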
IV. Running Hadoop

4.1 Initialize the HDFS File System
Execute this command in the hadoop-2.7.1 directory:

bin/hdfs namenode -format
Output reporting a successful format indicates that initialization succeeded.
4.2 Start the NameNode and DataNode Daemons
Execute this command in the hadoop-2.7.1 directory:
sbin/start-dfs.sh
If it succeeds, the output will show the daemons being started.
4.3 Use the jps command to view process information:

If both NameNode and DataNode appear in the jps output, the daemons have started successfully.
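That check can also be scripted. Below is a small sketch; `check_daemons` is a hypothetical helper, and `jps` ships with the JDK:

```shell
# Sketch: report whether both HDFS daemons appear in jps output.
# 'check_daemons' is a hypothetical helper, not a Hadoop command.
check_daemons() {
  if echo "$1" | grep -q "NameNode" && echo "$1" | grep -q "DataNode"; then
    echo "ok"
  else
    echo "missing"
  fi
}
check_daemons "$(jps 2>/dev/null)"
```

Printing "ok" here corresponds to seeing both NameNode and DataNode in the jps listing.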
4.4 Viewing the Web interface
Enter http://localhost:50070 in a browser to view the NameNode's web interface and related cluster information, as follows.
At this point, the Hadoop environment has been built.
V. Run the WordCount Demo
1. Create a new file locally and add some content. For example, I created a haha.txt file in the /home/hadoop directory with the content "Hello world!".
2. Then create a test folder in the distributed file system (HDFS) to hold our test file haha.txt. Run the commands in the hadoop-2.7.1 directory:
# create a test directory under the HDFS root
bin/hdfs dfs -mkdir /test
# view the directory structure under the HDFS root
bin/hdfs dfs -ls /
The results are as follows:
3. Upload the local haha.txt file to the test directory:

# upload
bin/hdfs dfs -put /home/hadoop/haha.txt /test/
# verify
bin/hdfs dfs -ls /test/
The results are as follows:
4. Run the WordCount demo:

# run wordcount, saving the results in the /test/out directory
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /test/haha.txt /test/out
# view the files in the /test/out directory
bin/hdfs dfs -ls /test/out
The results are as follows:
The output indicates that the job succeeded and that the results were saved in part-r-00000.
5. View the results:

# view the word counts in part-r-00000
bin/hadoop fs -cat /test/out/part-r-00000
The results are as follows:
At this point, the WordCount demo has finished running.
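For intuition, the counting that WordCount performs can be mimicked locally with standard Unix tools. This is only a sketch of the logic, not how Hadoop runs it; the real job distributes map and reduce tasks across the cluster:

```shell
# Local imitation of WordCount: split on whitespace, then count duplicates.
printf 'Hello world! Hello\n' | tr -s ' ' '\n' | sort | uniq -c
```

Each output line pairs a count with a word, mirroring the contents of part-r-00000 (note that, like the Hadoop example, this splits on whitespace only, so "world!" keeps its punctuation).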
VI. Summary

I ran into many problems during the configuration process, but all of them were eventually resolved and I learned a great deal. I am sharing the experience here in the hope that it helps other friends who want to set up a Hadoop environment ~