The "new Configuration" problem in Mahout algorithms

Version: Hadoop 2.4 + Mahout 0.9

When you invoke a Mahout algorithm on the cloud platform from a web program, you sometimes hit an error where a path cannot be found. One example is the following method in the class org.apache.mahout.clustering.classify.ClusterClassifier:

  public void readFromSeqFiles(Configuration conf, Path path) throws IOException {
    // Note: a fresh Configuration is created here instead of reusing conf.
    Configuration config = new Configuration();
    List<Cluster> clusters = Lists.newArrayList();
    for (ClusterWritable cw : new SequenceFileDirValueIterable<ClusterWritable>(path, PathType.LIST,
        PathFilters.logsCRCFilter(), config)) {
      Cluster cluster = cw.getValue();
      cluster.configure(conf);
      clusters.add(cluster);
    }
    this.models = clusters;
    modelClass = models.get(0).getClass().getName();
    this.policy = readPolicy(path);
  }

This method is called in the setup function of CIMapper. If you invoke the K-means algorithm from a web program, the job fails at this point: the path cannot be found when the cluster centers are read. The reason is that the Configuration used for reading inside readFromSeqFiles is newly created, so it does not carry the cluster settings of the submitted job. The path passed in is /path/to/center, not hdfs://host:port/path/to/center, and a path without a scheme is resolved against whatever default file system that new Configuration happens to have.
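To make the effect concrete, here is a minimal sketch of how a scheme-less path ends up on different file systems depending on the default. It is a simplified stand-in that uses only java.net.URI, not Hadoop's own API; the helper qualify and the address hdfs://namenode:8020 are hypothetical and chosen only for illustration. It assumes the fresh Configuration falls back to the local file system (file:///), which is Hadoop's built-in default when no cluster settings are loaded.

```java
import java.net.URI;

// Simplified stand-in, not Hadoop code: shows how a path without a scheme
// is qualified against a default file system URI.
class PathQualifyDemo {

    // Hypothetical helper: returns the path unchanged if it already has a
    // scheme, otherwise prefixes it with the default file system.
    static URI qualify(String path, String defaultFs) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already fully qualified, e.g. hdfs://...
        }
        URI fs = URI.create(defaultFs);
        String authority = fs.getAuthority() == null ? "" : fs.getAuthority();
        return URI.create(fs.getScheme() + "://" + authority + p.getPath());
    }

    public static void main(String[] args) {
        String centers = "/path/to/center";
        // Default of a fresh configuration (assumed): local file system.
        System.out.println(qualify(centers, "file:///"));
        // Default when the cluster address is known (hypothetical address).
        System.out.println(qualify(centers, "hdfs://namenode:8020"));
    }
}
```

With the local default, the centers are looked up on the local disk of whichever machine runs the task, which would explain the path-not-found error.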

You encounter this problem when you submit the job from the web (or submit it directly from a main method in Eclipse on Windows). If you submit it from a terminal, the path can be read (this was tested only on the NameNode; other nodes were not tested).

Workarounds:

1. Hard-code the cluster: right after Configuration config = new Configuration(); call config.set() to point it at the cluster.

This requires modifying the Mahout source code, and if the cluster changes, the class has to be recompiled and uploaded to the cluster nodes again.

2. Pass the Configuration into the method and use it. The readFromSeqFiles method above already receives a Configuration, yet it creates a new one inside for reading; I do not understand why the Mahout source code is written this way.

This approach requires modifying the code that calls this method. However, if the cluster changes, there is no need to recompile and package the classes and upload them to each node of the cluster; you only set the cluster when the job is submitted.
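The difference between the original behavior and the second workaround can be sketched as follows. This is a stand-in under stated assumptions, not Mahout or Hadoop code: a plain Map plays the role of Hadoop's Configuration, the method names are hypothetical, and hdfs://namenode:8020 is an invented address. In real Hadoop 2.x code the corresponding setting would be made with Configuration.set("fs.defaultFS", ...).

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: a Map plays the role of Hadoop's Configuration.
class ConfigPassingDemo {
    static final String FS_KEY = "fs.defaultFS";

    // A "new" configuration knows only the built-in default (local file system).
    static Map<String, String> newConfig() {
        Map<String, String> config = new HashMap<>();
        config.put(FS_KEY, "file:///");
        return config;
    }

    // Mirrors the original method: the caller's conf is ignored when
    // locating the files, because a fresh configuration is used instead.
    static String locateWithNewConfig(Map<String, String> conf, String path) {
        Map<String, String> config = newConfig();
        return join(config.get(FS_KEY), path);
    }

    // Workaround 2: reuse the caller's conf, so the cluster chosen at
    // submission time is honored.
    static String locateWithPassedConfig(Map<String, String> conf, String path) {
        return join(conf.get(FS_KEY), path);
    }

    // Joins a file system URI and an absolute path without doubling the slash.
    static String join(String fs, String path) {
        String base = fs.endsWith("/") ? fs.substring(0, fs.length() - 1) : fs;
        return base + path;
    }

    public static void main(String[] args) {
        Map<String, String> conf = newConfig();
        conf.put(FS_KEY, "hdfs://namenode:8020"); // hypothetical, set by the submitter
        System.out.println(locateWithNewConfig(conf, "/path/to/center"));
        System.out.println(locateWithPassedConfig(conf, "/path/to/center"));
    }
}
```

Only the second variant follows the setting made by the submitter, which is why a cluster change then needs no recompilation.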

Author: fansy1990 (CSDN blog)
