Versions: Hadoop 2.4 + Mahout 0.9
When you invoke a Mahout algorithm on the cluster from a web application, you sometimes hit an error saying that a path cannot be found. One place where this happens is the following method in the class org.apache.mahout.clustering.classify.ClusterClassifier:
public void readFromSeqFiles(Configuration conf, Path path) throws IOException {
  Configuration config = new Configuration();
  List<Cluster> clusters = Lists.newArrayList();
  for (ClusterWritable cw : new SequenceFileDirValueIterable<ClusterWritable>(path, PathType.LIST,
      PathFilters.logsCRCFilter(), config)) {
    Cluster cluster = cw.getValue();
    cluster.configure(conf);
    clusters.add(cluster);
  }
  this.models = clusters;
  modelClass = models.get(0).getClass().getName();
  this.policy = readPolicy(path);
}
This method is called from the setup function of CIMapper. If you invoke the K-means algorithm from a web application, the job fails at this point: the path cannot be found when the cluster centers are read. The reason is that readFromSeqFiles reads the files with a newly created Configuration (config) instead of the conf that was passed in. The incoming path is /path/to/center, not hdfs://host:port/path/to/center, so it is resolved against whatever file system the new Configuration points to.
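To make the resolution step concrete, here is a minimal sketch that uses only java.net.URI, not Hadoop's Path or FileSystem classes. The class DefaultFsDemo, the helper qualify, and the address hdfs://namenode:8020 are hypothetical names chosen for illustration; file:/// is the default file system Hadoop falls back to when no cluster setting is loaded. The sketch only shows that the same scheme-less path lands on a different file system depending on the default it is resolved against.

```java
import java.net.URI;

class DefaultFsDemo {
    // Hypothetical helper (not Hadoop code): resolves a path that has no
    // scheme against a default file system URI, and leaves a path that is
    // already fully qualified untouched.
    static String qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p.toString(); // already has a scheme such as hdfs://
        }
        return URI.create(defaultFs).resolve(p).toString();
    }

    public static void main(String[] args) {
        String path = "/path/to/center";
        // Default taken from a fresh configuration: the local file system.
        System.out.println(qualify("file:///", path));
        // Default taken from the job's configuration (assumed address).
        System.out.println(qualify("hdfs://namenode:8020", path));
    }
}
```

With the local default the path points at the local disk of whichever machine runs the code, which would explain why the center files are not found there.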
The problem occurs when the job is submitted from a web application (or directly from a main method in Eclipse on Windows). When the job is submitted from a terminal, the path can be read (this was tested only on the NameNode; other nodes were not tested).
Workarounds:
1. Fixed cluster: right after Configuration config = new Configuration(); call config.set() to point the new configuration at the cluster.
This requires modifying the Mahout source code, and if the cluster changes, the class has to be recompiled and uploaded to the cluster nodes again.
2. Use the Configuration that is passed into the method. readFromSeqFiles already receives a Configuration (conf), yet it creates a new one internally; it is not clear why the Mahout source code does this.
This approach requires modifying the code that calls this method, but if the cluster changes, there is no need to recompile and repackage the classes and upload them to every node of the cluster; it is enough to set the cluster when the job is submitted.
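The difference between the two workarounds can be sketched as a small stand-in model. It is not Mahout or Hadoop code: a plain Map takes the place of Configuration so that the example runs without Hadoop on the classpath, and the class WorkaroundSketch, its methods, and the address hdfs://namenode:8020 are all hypothetical. The key fs.defaultFS is the Hadoop 2 property that names the default file system. In the Mahout source, workaround 2 would presumably amount to passing conf instead of config to SequenceFileDirValueIterable.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class WorkaroundSketch {
    static final String FS_KEY = "fs.defaultFS";

    // Stand-in for `new Configuration()` when no cluster settings are
    // loaded: the default file system is the local one.
    static Map<String, String> freshConfig() {
        Map<String, String> config = new HashMap<>();
        config.put(FS_KEY, "file:///");
        return config;
    }

    // Resolves a scheme-less path against the configured default file system.
    static String resolve(Map<String, String> config, String path) {
        return URI.create(config.get(FS_KEY)).resolve(path).toString();
    }

    // Original behaviour: the caller's conf is ignored when reading.
    static String readOriginal(Map<String, String> conf, String path) {
        return resolve(freshConfig(), path);
    }

    // Workaround 1: hard-code the cluster on the fresh configuration.
    // A cluster change means editing and redeploying this class.
    static String readFixedCluster(Map<String, String> conf, String path) {
        Map<String, String> config = freshConfig();
        config.put(FS_KEY, "hdfs://namenode:8020"); // assumed address
        return resolve(config, path);
    }

    // Workaround 2: read with the configuration the caller passed in.
    // A cluster change only affects what the caller sets at submit time.
    static String readWithCallerConf(Map<String, String> conf, String path) {
        return resolve(conf, path);
    }
}
```

In this model only the second variant follows the caller's settings, which mirrors the trade-off described above: the first fix is baked into the class, the second moves the cluster choice to the code that submits the job.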
Author: fansy1990 (CSDN blog)