Configure Hadoop
JDK and SSH must already be configured as prerequisites.
(How to configure the JDK: http://www.cnblogs.com/xxx0624/p/4164744.html)
(How to configure SSH: http://www.cnblogs.com/xxx0624/p/4165252.html)
1. Add a Hadoop user
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop
2. Download the Hadoop release (this example uses Hadoop 1.2.1, installed under /home/xxx0624/hadoop)
sudo tar -zxvf hadoop-1.2.1.tar.gz
sudo mv hadoop-1.2.1 /home/xxx0624/hadoop
Ensure that all subsequent operations are performed as the hadoop user:
sudo chown -R hadoop:hadoop /home/xxx0624/hadoop
3. Setting up Hadoop and Java environment variables
sudo gedit /home/xxx0624/hadoop/conf/hadoop-env.sh
At the end of the opened file, add:
export JAVA_HOME=/usr/lib/jvm
Make the environment variable take effect (it must be in effect every time you run a Hadoop command!):
source /home/xxx0624/hadoop/conf/hadoop-env.sh
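As a quick sanity check that the variable is actually set in the current shell, the export line added to hadoop-env.sh is reproduced here so the check is self-contained (the path is the one used in this guide; adjust it to your JVM location):

```shell
# hadoop-env.sh should have exported JAVA_HOME; shown here with the
# value used in this guide
export JAVA_HOME=/usr/lib/jvm
echo "$JAVA_HOME"   # prints /usr/lib/jvm
```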
4. Pseudo-Distributed mode configuration
core-site.xml: Hadoop core configuration items, such as I/O settings common to HDFS and MapReduce.
hdfs-site.xml: configuration items for the HDFS daemons: the NameNode, the secondary NameNode, and the DataNodes.
mapred-site.xml: configuration items for the MapReduce daemons: the JobTracker and the TaskTrackers.
4.1 First create these folders
mkdir tmp
mkdir hdfs
mkdir hdfs/name
mkdir hdfs/data    /* all under the hadoop directory */
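The four mkdir commands above can be collapsed into one with mkdir -p, which also creates missing parent directories (a minor convenience, equivalent in effect):

```shell
# Run from /home/xxx0624/hadoop; -p creates parents and is a no-op
# if the directories already exist
mkdir -p tmp hdfs/name hdfs/data
ls -d tmp hdfs/name hdfs/data
```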
4.2 Start editing files
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/xxx0624/hadoop/tmp</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/xxx0624/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/xxx0624/hadoop/hdfs/data</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
5. Format HDFs
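For Hadoop 1.x the usual format invocation is the following (run once, as the hadoop user, from the install directory; shown as a sketch for the paths used in this guide):

```shell
cd /home/xxx0624/hadoop
# Initializes the HDFS metadata in dfs.name.dir (hdfs/name above).
# Only needed once, before the first start; reformatting erases HDFS data.
bin/hadoop namenode -format
```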
If this error occurs:
ERROR namenode.NameNode: java.io.IOException: Cannot create directory /home/xxx0624/hadoop/hdfs/name/current
then make the Hadoop directory writable by the current user:
sudo chmod -R a+w /home/xxx0624/hadoop
6. Start Hadoop
cd /home/xxx0624/hadoop/bin
./start-all.sh
The correct results are as follows:
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-namenode-xxx0624-thinkpad-edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost: starting datanode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-datanode-xxx0624-thinkpad-edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost: starting secondarynamenode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-secondarynamenode-xxx0624-thinkpad-edge.out
starting jobtracker, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-jobtracker-xxx0624-thinkpad-edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost: starting tasktracker, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-tasktracker-xxx0624-thinkpad-edge.out
The jps command can be used to verify success:
If all 5 daemons are present, Hadoop is running normally.
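As a sketch, the check looks like this (the daemon names are the standard Hadoop 1.x ones; jps also lists itself, and each line is preceded by a process ID that will differ on your machine):

```shell
# jps lists the Java processes of the current user; after start-all.sh
# it should show, in some order (plus Jps itself):
#   NameNode
#   DataNode
#   SecondaryNameNode
#   JobTracker
#   TaskTracker
jps
```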
7. View the running status
http://localhost:50030/ - Hadoop Management interface
http://localhost:50060/-Hadoop Task Tracker status
http://localhost:50070/-Hadoop DFS status
8. Turn off Hadoop
stop-all.sh
Manual Hadoop Configuration in Ubuntu environment