Environment: Hadoop 1.0.4, Mahout 0.7.
I recently updated the platform from my earlier post on calling Mahout algorithms from the web. It now includes some basic Hadoop operations and two more Mahout algorithms. I expect to post it to the CSDN resources page within two days; anyone who needs it can download it for reference.
Today's problem came up when calling Mahout's KMeans algorithm from MyEclipse. First, the error message:
2014-01-01 22:34:37,611 INFO [main] (JobClient.java:1330) - Task Id : attempt_201401011932_0023_m_000000_0, Status : FAILED
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:604)
	at java.util.ArrayList.get(ArrayList.java:382)
	at org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:215)
	at org.apache.mahout.clustering.iterator.CIMapper.setup(CIMapper.java:36)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
This can probably be counted as a Mahout bug (well, I am not sure whether it really counts). When ClusterClassifier needs a Configuration, it creates a new one. What problem does that cause? If you run from the command line on Linux, none. If you run from MyEclipse on Win7, it breaks, because there we set two parameters on conf by hand: mapred.job.tracker and fs.default.name. Those two parameters are how the client finds the cluster; a freshly created Configuration does not carry them, so it cannot find the cluster. As a result, the write at line 230 of ClusterClassifier does not actually write anything, yet no error is reported (really, none). The later read then finds no file and comes back empty, so the list has size 0, and when the code accesses index 0 it is bound to go out of bounds.
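The failure pattern described above can be reproduced without Hadoop or Mahout. The following is only a minimal sketch: EmptyReadDemo, readModels and firstModel are hypothetical names, and the class is not Mahout code. It just shows that taking element 0 of a list that came back empty throws the same exception type seen in the task log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the read step; this is NOT Mahout code.
class EmptyReadDemo {

    // Simulates loading cluster models. When nothing was written
    // beforehand, the result is an empty list rather than an error.
    static List<String> readModels(boolean filesExist) {
        List<String> models = new ArrayList<>();
        if (filesExist) {
            models.add("cluster-0");
        }
        return models;
    }

    // Mirrors the failing pattern: take element 0 without checking the size.
    static String firstModel(List<String> models) {
        return models.get(0);
    }

    public static void main(String[] args) {
        // Files present: element 0 exists.
        System.out.println(firstModel(readModels(true)));

        // Files missing: the list is empty and get(0) throws.
        try {
            firstModel(readModels(false));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("Failed with " + e.getClass().getSimpleName());
        }
    }
}
```

The point of the sketch is that the exception surfaces at the read, while the real cause is the earlier write that silently went nowhere.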
Here is how to solve it. I ran into this kind of problem before, and it was also caused by code that creates a new Configuration. The fix back then was to add conf.set("mapred.job.tracker", "fansypc:9001"); conf.set("fs.default.name", "fansypc:9000"); right after the new Configuration. That worked, but it is tedious. Is there a better way?
Yes, there is. I do not know whether this has been fixed in version 0.9, but if you are on 0.7 you can download the Configuration source and copy it into your project, keeping the package path the same. Then modify the constructor at line 213: after this(true), add the same two settings (inside the constructor they are called on the instance itself): set("mapred.job.tracker", "fansypc:9001"); set("fs.default.name", "fansypc:9000");. If you are not sure whether the change took effect, create a new Configuration directly and print conf.get("mapred.job.tracker"); if it has a value, it worked.
After the modification, I ran KMeansDriver again and it completed successfully.
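The idea of patching the no-argument constructor, and the check that follows it, can be sketched as below. To keep the sketch runnable without Hadoop on the classpath, SketchConfiguration is a hypothetical stand-in that only imitates the set/get behaviour of the real Configuration class; the host name and ports are the ones used in this post, and everything else is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration, reduced to set/get.
// It only illustrates the patch; it is not the real class.
class SketchConfiguration {
    private final Map<String, String> properties = new HashMap<>();

    // Patched no-argument constructor: every "new SketchConfiguration()"
    // now carries the cluster addresses, wherever it is created.
    SketchConfiguration() {
        this(true);
        set("mapred.job.tracker", "fansypc:9001");
        set("fs.default.name", "fansypc:9000");
    }

    // The real class loads its default resources here; the sketch does nothing.
    SketchConfiguration(boolean loadDefaults) {
    }

    void set(String name, String value) {
        properties.put(name, value);
    }

    // Returns null when the property was never set.
    String get(String name) {
        return properties.get(name);
    }

    public static void main(String[] args) {
        // The check from the text: create a fresh instance and print the value.
        SketchConfiguration conf = new SketchConfiguration();
        System.out.println(conf.get("mapred.job.tracker"));
    }
}
```

With the patch in place, code that creates its own configuration object no longer needs the two set calls repeated at each call site, which is what makes this less tedious than the earlier workaround.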