Over the past few days I have been experimenting with cloud computing, setting up Hadoop on Ubuntu. I had configured it before, but only by following a book step by step, so it left a shallow impression and did not help me much. This time I took the opportunity to walk through the configuration twice, and the experience paid off, so I am writing this summary.
First, a site that gave me a great deal of guidance: the Xiamen University Database Laboratory blog, http://dblab.xmu.edu.cn/blog/. For a database lab to have the energy and enthusiasm to run such a site is genuinely admirable. Another reason for writing this article is that there are far too many guides online for configuring Hadoop on Ubuntu: different people use different methods, and different versions need different configurations, so I took many detours and failed many times because of this. I hope this writeup is systematic enough to serve as a reference for others.
Environment:
System: Ubuntu Kylin 14.04 LTS (compared with stock Ubuntu, this Chinese customized edition really is good-looking and practical)
Hadoop 2.6.0
JDK 1.8.0_25
Steps:
1. Create a Hadoop user
Create user
sudo useradd hadoop
Set Password
sudo passwd hadoop
Create a home directory for the hadoop user
cd /home
sudo mkdir hadoop
sudo chown hadoop /home/hadoop
Grant the hadoop user administrator privileges, which makes the later steps easier
sudo adduser hadoop sudo
Log out of the current user and log in as the hadoop user.
2. Install SSH
SSH (Secure Shell) is currently the most reliable protocol for securing remote login sessions and other network services. Using SSH effectively prevents information leakage during remote administration: all transmitted data is encrypted, and DNS spoofing and IP spoofing are guarded against as well.
Installing SSH server
sudo apt-get install openssh-server
Set up passwordless SSH login
ssh localhost
On the first login, SSH shows a confirmation prompt; enter yes.
Configure passwordless login:
Enter exit to leave the SSH session, then execute the following commands:
cd ~/.ssh                 # if this directory does not exist, run ssh localhost first
ssh-keygen -t rsa         # press Enter at every prompt; the key is saved as .ssh/id_rsa
cp id_rsa.pub authorized_keys
Now the ssh localhost command logs in directly, without a password.
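If you want to rehearse the key-generation step first, the same commands can be run against a throwaway directory instead of the real ~/.ssh. This is only a sketch: the temporary path is hypothetical, and `-N ""` stands in for pressing Enter at the passphrase prompts.

```shell
# Rehearse key generation in a throwaway directory (tmp path is hypothetical)
tmp=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$tmp/id_rsa" -q   # -N "" = empty passphrase, like pressing Enter
cp "$tmp/id_rsa.pub" "$tmp/authorized_keys"   # authorize the public key, as above
ls "$tmp"                                      # should list authorized_keys, id_rsa, id_rsa.pub
```

Once this looks right, the real commands in ~/.ssh behave the same way.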
3. Configuring the JDK
First, the method introduced on the Xiamen University Database Laboratory blog:
sudo apt-get install openjdk-7-jre openjdk-7-jdk
This method installs the JDK directly from the terminal, but when I tried it I got errors, and since my skills are limited I could not find what went wrong.
The second method is the one I actually used. First download the JDK package from the Oracle website; I downloaded jdk-8u25-linux-x64.tar.gz, which extracts to a jdk1.8.0_25 folder, placed in my home directory.
Then copy the folder to /usr/lib/jvm
sudo cp -r ~/jdk1.8.0_25 /usr/lib/jvm
Then rename it to java-8u5-sun
sudo mv /usr/lib/jvm/jdk1.8.0_25 /usr/lib/jvm/java-8u5-sun
Next, configure the environment variables
sudo gedit ~/.bashrc
At the end of the file, add
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
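Note that the order of these lines matters: JRE_HOME, CLASSPATH, and PATH all expand ${JAVA_HOME} at assignment time, so JAVA_HOME must come first. A quick way to sanity-check the expansion in a throwaway shell (the echo lines are added here only to show the results):

```shell
# Same four assignments as above; the echoes just reveal the expanded values
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
echo "$JRE_HOME"     # /usr/lib/jvm/java-8u5-sun/jre
echo "$CLASSPATH"    # .:/usr/lib/jvm/java-8u5-sun/lib:/usr/lib/jvm/java-8u5-sun/jre/lib
```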
Here JAVA_HOME holds the path to the Java installation.
Then make the above configuration take effect
source ~/.bashrc
At this point the JDK configuration is basically done. You can type java in the terminal to verify it: if the terminal prints the java command usage, the installation succeeded.
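Before moving on, it can also help to confirm that JAVA_HOME really points at a java binary. This is just a sketch using the path chosen above; it will differ if you named the JDK folder differently.

```shell
# Sanity check: does the JAVA_HOME chosen above actually contain bin/java?
JAVA_HOME=/usr/lib/jvm/java-8u5-sun
if [ -x "$JAVA_HOME/bin/java" ]; then
    echo "JDK found at $JAVA_HOME"
else
    echo "JDK missing at $JAVA_HOME"
fi
```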
4. Installing Hadoop
First download Hadoop from the Apache website. I downloaded the Hadoop 2.6.0 archive hadoop-2.6.0.tar.gz, which extracts to hadoop-2.6.0.
Move the folder to /usr/local/hadoop
sudo mv ~/hadoop-2.6.0 /usr/local/hadoop        # rename the folder to hadoop
sudo chown -R hadoop:hadoop /usr/local/hadoop   # fix file ownership
Next, set JAVA_HOME in Hadoop's environment script
cd /usr/local/hadoop
gedit etc/hadoop/hadoop-env.sh
Change export JAVA_HOME=${JAVA_HOME} to export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
Enter the following command to check whether Hadoop is usable; on success it prints the command-line usage
bin/hadoop
5. Hadoop pseudo-distributed setup
Modify the configuration file etc/hadoop/core-site.xml, changing
<configuration>
</configuration>
to
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Modify the configuration file etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/dfs/data</value>
    </property>
</configuration>
After the configuration is complete, first create the required temporary and data directories in the Hadoop directory:
cd /usr/local/hadoop
mkdir tmp dfs dfs/name dfs/data
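The same layout can be rehearsed under a temporary prefix instead of /usr/local/hadoop, which avoids needing root while checking that the directory tree matches the paths in the two XML files above. The temp prefix here is hypothetical.

```shell
# Rehearse the directory layout under a temp prefix (stand-in for /usr/local/hadoop)
base=$(mktemp -d)
mkdir -p "$base/tmp" "$base/dfs/name" "$base/dfs/data"
# List the resulting tree: dfs, dfs/data, dfs/name, tmp
( cd "$base" && find . -mindepth 1 -type d | sort )
```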
Initialize the HDFS file system
bin/hdfs namenode -format
Start the NameNode and the other daemons
sbin/start-dfs.sh
Enter jps to check that every node is running; in pseudo-distributed mode you should see NameNode, DataNode, and SecondaryNameNode (plus Jps itself). If the DataNode process is not running, re-format the file system and run the start script again.
At this point, Hadoop is configured and ready to use.
The next article will cover compiling and running a Hadoop program in an Ubuntu environment.
Hadoop 2.6.0 Configuration on Ubuntu Kylin 14.04