My supervisor asked me to look into Hadoop and to use the latest version. As a result I ran into quite a few problems and solved them one by one ~
Running in pseudo-distributed mode on Linux, I kept getting a NullPointerException:
java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2747)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2670)
I looked through these two issues:
https://issues.apache.org/jira/browse/HADOOP-4744
https://issues.apache.org/jira/browse/MAPREDUCE-969
Later I read online that the cause is a host name and IP address that do not match.
After half a day of trying, I settled on a simple fix.
Run the command bin/hadoop namenode
At startup it prints your host name and IP address. Simply replace localhost in the configuration files with the IP address it reports.
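As a rough illustration of that replacement, the sketch below swaps localhost for an IP address in address-style property values. The property names fs.default.name and mapred.job.tracker, the ports, and the address 192.168.1.10 are assumptions for illustration only; use whatever your own configuration files contain and the address that bin/hadoop namenode actually reports.

```python
import re

# Matches "localhost" only when it stands alone as a host name,
# not inside longer names such as "localhost.localdomain".
_LOCALHOST = re.compile(r"(?<![\w.-])localhost(?![\w.-])")


def replace_localhost(value, ip):
    """Return the config value with a standalone 'localhost' replaced by ip."""
    return _LOCALHOST.sub(ip, value)


# Hypothetical pseudo-distributed settings: names, ports and IP are assumptions.
settings = {
    "fs.default.name": "hdfs://localhost:9000",
    "mapred.job.tracker": "localhost:9001",
}

updated = {name: replace_localhost(value, "192.168.1.10")
           for name, value in settings.items()}

for name, value in updated.items():
    print(name, "=", value)
```

The sketch works on plain strings only; it does not read or write the Hadoop configuration files themselves.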
I have not tried cluster mode yet, because other problems came up on Windows ...
Running under Cygwin on Windows:
When running the wordcount and pi examples, a FileNotFoundException is thrown:
java.io.FileNotFoundException: File C:/tmp/hadoop-system/mapred/local/tasktracker/jobcache/job_201005040912_0002/logs/work/tmp does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
What I found online:
One page explains the problem:
${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work/tmp: the temporary directory of the task. (You can set the mapred.child.tmp property to choose the temporary directory for map and reduce tasks. The default value is ./tmp. If the value is not an absolute path, the task's working directory is prepended to it and the result is used as the task's temporary directory. If the value is an absolute path, it is used directly. If the directory does not exist, it is created automatically. The child Java task is then run with the option -Djava.io.tmpdir='absolute path of the tmp dir'. For pipes and streaming, the temporary directory is set through the environment variable TMPDIR='absolute path of the tmp dir'.) If mapred.child.tmp has the value ./tmp, this directory is created.
Under Cygwin, however, you have to set this path yourself.
In fact, the most important step is to create the folder yourself; otherwise the problem remains. An absolute path seems to be required (otherwise the folder cannot be created). After setting an absolute path and creating the folder, the problem was solved.
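The resolution rule quoted above can be sketched in a few lines. This is only an illustration of the rule as described, not Hadoop's own code: resolve_child_tmp and ensure_dir are hypothetical helper names, the example paths are made up, and POSIX-style path rules are assumed (so a Windows path such as C:/tmp is not treated as absolute here).

```python
import os
import posixpath
import tempfile


def resolve_child_tmp(child_tmp, task_work_dir):
    """Apply the quoted rule: absolute values are used directly,
    relative values are placed under the task's working directory."""
    if posixpath.isabs(child_tmp):
        return posixpath.normpath(child_tmp)
    return posixpath.normpath(posixpath.join(task_work_dir, child_tmp))


def ensure_dir(path):
    """Create the directory if it is missing and report whether it exists."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


# Made-up working directory, for illustration only.
work_dir = "/data/mapred/local/taskTracker/jobcache/job_X/attempt_Y/work"

relative = resolve_child_tmp("./tmp", work_dir)
absolute = resolve_child_tmp("/tmp/hadoop-child", work_dir)
print(relative)
print(absolute)

# Creating the folder by hand, as described above, inside a scratch directory.
with tempfile.TemporaryDirectory() as base:
    created = ensure_dir(os.path.join(base, "work", "tmp"))
print(created)
```

Checking the resolved path and creating it up front corresponds to the manual workaround described above: pick an absolute path and make sure the folder exists before the job runs.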
A digression about namenode format. When you format again, a re-format prompt is displayed. The answer is case-sensitive: a lowercase y is not accepted, it must be an uppercase Y. However, even after answering with an uppercase Y, problems may still occur when Hadoop starts: the tasktracker logs show an exception saying that the IDs do not match. The simplest fix is to delete the folder and format again.
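One ID that can get out of sync after a re-format is the namespaceID recorded in the VERSION files under the name and data directories. Whether that is the ID meant in the log above is an assumption on my part. Under that assumption, a minimal sketch for comparing two VERSION files could look like this; parse_version and same_namespace are hypothetical helpers and the sample contents are made up.

```python
def parse_version(text):
    """Parse key=value lines (the VERSION file is a properties-style file)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def same_namespace(name_version_text, data_version_text):
    """True only if both texts carry the same namespaceID."""
    name_id = parse_version(name_version_text).get("namespaceID")
    data_id = parse_version(data_version_text).get("namespaceID")
    return name_id is not None and name_id == data_id


# Made-up sample contents, for illustration only.
name_version = "# sample name directory VERSION\nnamespaceID=111\n"
data_version = "# sample data directory VERSION\nnamespaceID=222\n"

print(same_namespace(name_version, data_version))
```

If the two values differ, deleting the old folder and formatting again, as described above, gives both sides a fresh start.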
Unsolved problems:
When running the pi estimation example, an IllegalArgumentException is thrown:
java.lang.IllegalArgumentException: n must be positive
    at java.util.Random.nextInt(Random.java:250)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:243)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:289)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:159)
The cause of this problem is described in detail here: https://issues.apache.org/jira/browse/HADOOP-6766
I will not repeat it here. Can anyone tell me the solution? Thank you ~
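For orientation, the exception itself comes from java.util.Random.nextInt(n), which rejects any n that is not positive. The trace shows the call being made from LocalDirAllocator, so one plausible reading (an assumption, not a confirmed diagnosis) is that a random index was being drawn from an empty list of usable local directories. The Python sketch below only imitates that pattern; pick_local_dir and try_pick are hypothetical names, not Hadoop APIs.

```python
import os
import random


def pick_local_dir(candidates, rng=random):
    """Pick one existing directory at random from the candidates."""
    usable = [d for d in candidates if os.path.isdir(d)]
    # With no usable directory, randrange(0) raises ValueError,
    # much as Random.nextInt(0) throws IllegalArgumentException in Java.
    return usable[rng.randrange(len(usable))]


def try_pick(candidates):
    """Same as pick_local_dir, but turn the failure into a readable message."""
    try:
        return pick_local_dir(candidates)
    except ValueError as exc:
        return "error: " + str(exc)


# A made-up path that should not exist, so the list of usable dirs is empty.
print(try_pick(["/no/such/local/dir/for/this/sketch"]))
# The current directory exists, so it is the only possible pick.
print(try_pick([os.getcwd()]))
```

If this reading is right, a first thing to check would be whether the configured local directories exist and are writable on the machine where the task runs.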
Running wordcount produces a ClassCastException:
2010-08-01 16:35:38,823 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2010080000034_0000000m_0000000000: java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Related issue:
https://issues.apache.org/jira/browse/HADOOP-5576