This is only my second day working with hadoop, and it took two full days to get the environment configured. I have written up my configuration process here, hoping it helps you!
I have shared all the resources used in this article here. Click here to download them, so you don't need to hunt for them one by one!
This includes the "Hadoop technology insider" book. Its first chapter describes the configuration process, but not in much detail ~
--------------- Install jdk -------------------------------
1. Download jdk1.6.0_45
2. Decompress the package into /opt (see the unpacking sketch after step 3), then edit /etc/profile and add:
# set java environment
export JAVA_HOME=/opt/jdk1.6.0_45
export JRE_HOME=/opt/jdk1.6.0_45/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
3. Run source /etc/profile to apply the modified profile.
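For step 2, a minimal unpacking sketch, assuming the download is Oracle's self-extracting installer jdk-6u45-linux-x64.bin (adjust the file name to whatever you actually downloaded):
chmod +x jdk-6u45-linux-x64.bin
./jdk-6u45-linux-x64.bin            # unpacks into ./jdk1.6.0_45
mv jdk1.6.0_45 /opt/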
4. Configure the default Java programs with update-alternatives:
update-alternatives --install /usr/bin/java java /opt/jdk1.6.0_45/bin/java 300
update-alternatives --install /usr/bin/javac javac /opt/jdk1.6.0_45/bin/javac 300
update-alternatives --install /usr/bin/jar jar /opt/jdk1.6.0_45/bin/jar 300
update-alternatives --install /usr/bin/javah javah /opt/jdk1.6.0_45/bin/javah 300
update-alternatives --install /usr/bin/javap javap /opt/jdk1.6.0_45/bin/javap 300
Then run the following command to select this JDK as the default:
update-alternatives --config java
5. Run java -version to check the Java version.
--------------- Install eclipse -------------------------------
1. Download the Java version of eclipse from the official website:
http://mirror.neu.edu.cn/eclipse/technology/epp/downloads/release/kepler/SR2/eclipse-java-kepler-SR2-linux-gtk.tar.gz
2. Decompress the package into the /home/simon folder (see the sketch after step 5).
3. Use vi to create a shell script named eclipse:
vi /usr/local/bin/eclipse
The content is as follows:
/home/simon/eclipse
4. Add the executable permission for the eclipse script: chmod +x /usr/local/bin/eclipse
5. Type eclipse directly to start it.
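For steps 2 and 3, a fuller sketch. It assumes the tarball unpacks into an eclipse/ folder, so the launcher binary would be at /home/simon/eclipse/eclipse; if your layout matches the article's, keep /home/simon/eclipse as the path instead:
tar -xzf eclipse-java-kepler-SR2-linux-gtk.tar.gz -C /home/simon
cat > /usr/local/bin/eclipse <<'EOF'
#!/bin/sh
# wrapper that launches the unpacked eclipse
exec /home/simon/eclipse/eclipse "$@"
EOF
chmod +x /usr/local/bin/eclipse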
--------------- Install ant -------------------------------
1. Download ant
http://mirror.esocc.com/apache//ant/binaries/apache-ant-1.9.4-bin.tar.gz
2. Decompress it and copy it to the /home/simon folder (see the one-line sketch at the end of this section).
3. Modify the /etc/profile file:
export ANT_HOME=/home/simon/apache-ant-1.9.4
export PATH=$PATH:$ANT_HOME/bin
4. Run source /etc/profile to apply the change.
5. Run ant -version to verify the installation; it should print something like:
Apache Ant(TM) version 1.9.4 compiled on April 29 2014
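For step 2, extraction is a one-liner (the archive name comes from the download link above):
tar -xzf apache-ant-1.9.4-bin.tar.gz -C /home/simon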
--------------- Install hadoop -------------------------------
1. Modify the machine name: set /etc/hostname to localhost.
2. Configure ssh login without a password:
ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
apt-get install openssh-server
3. If the command ssh localhost fails, you need to start the ssh service.
Run one of the following commands to start it:
service ssh start
/etc/init.d/ssh start
If the service still fails to start, try rebooting the machine.
4. Configure hadoop:
(1) Edit conf/hadoop-env.sh and set the value of JAVA_HOME:
export JAVA_HOME=/opt/jdk1.6.0_45
(2) Edit conf/mapred-site.xml and add the following inside the <configuration> element:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
(3) Edit conf/hdfs-site.xml and add the following inside the <configuration> element:
<property>
  <name>dfs.name.dir</name>
  <value>/home/simon/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/simon/data</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
(4) Edit conf/core-site.xml and add the following inside the <configuration> element:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoop-1.0.0/tmp</value>
</property>
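The directories referenced in (3) and (4) can be created up front; Hadoop normally creates them itself, but pre-creating them avoids the permission problems mentioned under step (5). A sketch, assuming everything runs as root as in this article (the name directory is created automatically by the format step below):
mkdir -p /home/simon/data /home/hadoop/hadoop-1.0.0/tmp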
(5) Format hdfs: bin/hadoop namenode -format
Start hadoop: bin/start-all.sh
If a permission error appears, the file may lack execute permission, or its owner may not be the current user (root).
You can try chmod +x <file name>
chown root:root bin/*
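To check that everything actually came up, jps (shipped with the JDK) lists the running Java processes; in a pseudo-distributed Hadoop 1.x setup you should see the five daemons named in the comment:
jps
# expect NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker (plus Jps itself)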
----------------- Configure the eclipse plug-in ---------------
1. Copy hadoop-eclipse-plugin-1.0.0.jar into the plugins folder under the eclipse directory.
2. Open eclipse
Open Window -- Show View -- Other..., and in the dialog box select MapReduce Tools -- Map/Reduce Locations.
If that entry does not exist, the %eclipse_dir%/configuration/config.ini file contains the setting org.eclipse.update.reconcile=false; change it to true and then start eclipse again.
3. You should now see DFS Locations in the Project Explorer. If you can expand it and browse the folders underneath, the configuration is successful.
Start eclipse with:
env UBUNTU_MENUPROXY= /home/simon/eclipse
Note that there is a space between the equals sign and the eclipse path (UBUNTU_MENUPROXY is left empty).
------------------ Run java program --------------------
1. Configure the input and output paths.
Right-click the program -- Run As -- Run Configurations... -- Arguments, and enter:
hdfs://localhost:9000/test/input hdfs://localhost:9000/test/output
Separate the input and output paths with a space. (The input directory must already exist in HDFS; see the sketch at the end of this section.)
2. Import the hadoop jar packages: right-click the project -- Properties -- select Java Build Path on the left -- select Libraries -- click Add External JARs... on the right.
Select the jar packages under the hadoop/lib/ path. If you do not know which ones to choose, select them all!~ (helpless)
3. Right-click the program and choose Run As -- Run on Hadoop to run it.
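Before running, the input path from step 1 must already exist in HDFS and contain some data, and the output path must not exist yet (MapReduce refuses to overwrite an existing output directory). A minimal sketch, run from the hadoop directory; the local file test.txt is just a placeholder:
bin/hadoop fs -mkdir /test/input
bin/hadoop fs -put test.txt /test/input
bin/hadoop fs -ls /test/input      # verify the file arrived
bin/hadoop fs -rmr /test/output    # remove any previous output (ignore the error if it does not exist)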