System: Ubuntu 14.04
Hadoop version: 2.7.2
Learning to run my first Hadoop program by following the write-up shared at http://www.cnblogs.com/taichu/p/5264185.html.
Create the input folder under the Hadoop installation directory /usr/local/hadoop:
/usr/local/hadoop$ mkdir ./input
Then copy a few text files into the input folder as input for WordCount:
/usr/local/hadoop$ cp *.txt ./input
Execute WordCount:
/usr/local/hadoop$ hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.2-sources.jar org.apache.hadoop.examples.WordCount input output
Running it throws an error:
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://hadoopmaster:9000/user/hadoop/input
Put the /usr/local/hadoop/input folder into HDFS, following the solution given at http://rangerwolf.iteye.com/blog/1900615:
/usr/local/hadoop$ hdfs dfs -put ./input input (adding a / in front of input here gives no error)
Error: put: 'input': No such file or directory
The same problem occurs when using mkdir directly:
~$ hdfs dfs -mkdir input
mkdir: 'input': No such file or directory
After adding the leading / the put succeeds, but running WordCount still fails with: Input path does not exist: hdfs://hadoopmaster:9000/user/hadoop/input
The input path in the error is /user/hadoop/input, but mine is /usr/local/hadoop/input (and /input on HDFS). Is that mismatch the reason the path cannot be found?
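My current understanding (treat this as an assumption to verify): an HDFS path without a leading / is resolved relative to the current user's home directory /user/<username>, so the input argument passed to WordCount points at /user/hadoop/input, while the data put with a leading / went to /input. Roughly:
hdfs dfs -ls /input      # absolute path, under the HDFS root
hdfs dfs -ls input       # relative path, resolved as /user/hadoop/input for user "hadoop"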
Following the answer at http://stackoverflow.com/questions/20821584/hadoop-2-2-installation-no-such-file-or-directory, create /user/hadoop in HDFS:
$ hdfs dfs -mkdir -p /user/hadoop
Then add the input folder under it (I did this step in Eclipse; see the command-line sketch below).
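Presumably the same thing could also be done from the shell instead of Eclipse. A minimal sketch, assuming the local files are still under /usr/local/hadoop/input:
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put /usr/local/hadoop/input/*.txt /user/hadoop/input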
Now my HDFS directory structure looks like this:
/usr/local/hadoop/etc/hadoop$ hdfs dfs -ls -R /
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:39 /input
-rw-r--r--   2 hadoop supergroup      15429 2016-03-16 21:39 /input/LICENSE.txt
-rw-r--r--   2 hadoop supergroup        101 2016-03-16 21:39 /input/NOTICE.txt
-rw-r--r--   2 hadoop supergroup       1366 2016-03-16 21:39 /input/README.txt
drwx------   - hadoop supergroup          0 2016-03-16 21:17 /tmp
drwx------   - hadoop supergroup          0 2016-03-16 21:17 /tmp/hadoop-yarn
drwx------   - hadoop supergroup          0 2016-03-16 21:17 /tmp/hadoop-yarn/staging
drwx------   - hadoop supergroup          0 2016-03-16 21:17 /tmp/hadoop-yarn/staging/hadoop
drwx------   - hadoop supergroup          0 2016-03-16 21:41 /tmp/hadoop-yarn/staging/hadoop/.staging
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:51 /user
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 22:02 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:57 /user/hadoop/input
-rw-r--r--   3 hadoop supergroup      15429 2016-03-16 21:57 /user/hadoop/input/LICENSE.txt
-rw-r--r--   3 hadoop supergroup        101 2016-03-16 21:57 /user/hadoop/input/NOTICE.txt
-rw-r--r--   3 hadoop supergroup       1366 2016-03-16 21:57 /user/hadoop/input/README.txt
Run WordCount again:
/usr/local/hadoop/etc/hadoop$ hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.2-sources.jar org.apache.hadoop.examples.WordCount input output
Not a valid JAR: /usr/local/hadoop/etc/hadoop/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.2-sources.jar
Replacing hadoop jar with hdfs jar gives:
/usr/local/hadoop/etc/hadoop$ hdfs jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.2-sources.jar org.apache.hadoop.examples.WordCount input output
Error: Could not find or load main class jar
I don't know why yet; I'll sort it out another day. If anyone knows what causes this problem, please advise.
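For later reference, a typical invocation I have seen elsewhere uses the compiled examples jar (not the -sources jar) with its full path and the built-in program name "wordcount"; the exact jar location below is an assumption based on a standard 2.7.2 install, so it still needs checking:
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input output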
Running WordCount in Eclipse succeeds! Below is part of the output:
16/03/16 22:02:46 INFO mapreduce.Job:  map 100% reduce 100%
16/03/16 22:02:46 INFO mapreduce.Job: Job job_local1837130715_0001 completed successfully
16/03/16 22:02:46 INFO mapreduce.Job: Counters: 35
	File System Counters
		FILE: Number of bytes read=29752
		FILE: Number of bytes written=1200391
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=66016
		HDFS: Number of bytes written=8983
		HDFS: Number of read operations=33
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=6
	Map-Reduce Framework
		Map input records=322
		Map output records=2347
		Map output bytes=24935
		Map output materialized bytes=13001
		Input split bytes=355
		Combine input records=2347
		Combine output records=897
		Reduce input groups=840
		Reduce shuffle bytes=13001
		Reduce input records=897
		Reduce output records=840
		Spilled Records=1794
		Shuffled Maps =3
		Failed Shuffles=0
		Merged Map outputs=3
		GC time elapsed (ms)=17
		Total committed heap usage (bytes)=1444937728
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=16896
	File Output Format Counters
		Bytes Written=8983
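To see the actual word counts, the result should be readable straight from HDFS; with a single reducer the output file is normally named part-r-00000 (that file name is an assumption here, I have not checked it on my cluster yet):
hdfs dfs -cat output/part-r-00000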
Well, I can't really make sense of these results yet; I'll start reading about them tomorrow. The fundamentals are the most important thing.
Memo:
Hadoop commands:
List the directories and files under the root of the HDFS file system:
hdfs dfs -ls / (with the /)
hdfs dfs -ls (without the /)
The two commands give different results; I don't understand why yet, another thing to read up on tomorrow.
/usr/local/hadoop/etc/hadoop$ hdfs dfs -ls (after running hdfs dfs -mkdir -p /user/[current login user])
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:57 input
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 22:02 output
/usr/local/hadoop/etc/hadoop$ hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:39 /input
drwx------   - hadoop supergroup          0 2016-03-16 21:17 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-03-16 21:51 /user
List all directories and files of the HDFS file system recursively:
hdfs dfs -ls -R / (again, with and without the / the results differ, matching the -ls behavior above)
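My guess (still to be verified): without a path argument, hdfs dfs -ls lists the current user's HDFS home directory, so for user "hadoop" it should be roughly equivalent to:
hdfs dfs -ls /user/hadoop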