Environment: Hadoop 1.0.4, Mahout 0.5.
Mahout ships a class for reading the results of a clustering run, called ClusterDumper. Its output generally looks like this:
VL-2{n=6 c=[1.833, 2.417] r=[0.687, 0.344]}
Weight: Point:
1.0: [1.000, 3.000]
...
1.0: [3.000, 2.500]
VL-11{n=7 c=[2.857, 4.714] r=[0.990, 0.364]}
Weight: Point:
1.0: [1.000, 5.000]
...
1.0: [4.000, 4.500]
VL-14{n=8 c=[4.750, 3.438] r=[0.433, 0.682]}
Weight: Point:
1.0: [4.000, 3.000]
...
1.0: [5.000, 4.000]
However, if I only want to write the cluster centers to a file, ClusterDumper cannot do that. I originally wanted to subclass ClusterDumper, but it turns out ClusterDumper is declared final, so I gave up on that and wrote my own class.
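If a ClusterDumper text dump already exists, one stopgap is to post-process it and keep only the cluster header lines. The following is a minimal sketch, assuming the header lines have the same shape as the sample output above; `CenterLineFilter` and `centers` are hypothetical names and are not part of Mahout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: keeps only the cluster header lines of a ClusterDumper text dump. */
final class CenterLineFilter {
    // Assumed header shape, taken from the sample output:
    //   VL-2{n=6 c=[1.833, 2.417] r=[0.687, 0.344]}
    private static final Pattern HEADER =
        Pattern.compile("^\\s*(\\S+)\\{n=(\\d+) c=\\[([^\\]]*)\\].*\\}\\s*$");

    /** Returns "id<TAB>[center]" for every header line; point lines are skipped. */
    static List<String> centers(List<String> dumpLines) {
        List<String> out = new ArrayList<>();
        for (String line : dumpLines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                out.add(m.group(1) + "\t[" + m.group(3).trim() + "]");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dump = Arrays.asList(
            "VL-2{n=6 c=[1.833, 2.417] r=[0.687, 0.344]}",
            "Weight: Point:",
            "1.0: [1.000, 3.000]",
            "VL-11{n=7 c=[2.857, 4.714] r=[0.990, 0.364]}",
            "Weight: Point:",
            "1.0: [1.000, 5.000]");
        for (String center : centers(dump)) {
            System.out.println(center);
        }
    }
}
```

This drops the points but still requires running ClusterDumper first, which is why a dedicated class that reads the sequence files directly is the cleaner route.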
The relevant source code in ClusterDumper is as follows:
for (Cluster value :
     new SequenceFileDirValueIterable<Cluster>(new Path(seqFileDir, "part-*"), PathType.GLOB, conf)) {
  String fmtStr = value.asFormatString(dictionary);
  if (subString > 0 && fmtStr.length() > subString) {
    // Truncate the formatted cluster string to at most subString characters.
    writer.write(':');
    writer.write(fmtStr, 0, Math.min(subString, fmtStr.length()));
  } else {
    writer.write(fmtStr);
  }
  // ... (rest of the loop body is not shown in this excerpt)
}
Alternatively, see my earlier article, "Mahout source KMeansDriver analysis, part two: center point file analysis (no text)", which also covers reading the cluster centers.
You can write a ClusterCenterDump class, as follows:
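The `subString` branch in the ClusterDumper excerpt only truncates the formatted string. To make that behavior concrete, here is a small stand-alone sketch that uses only the JDK. `TruncationSketch`, `writeFormatted` and `format` are hypothetical names; the branch logic is copied from the excerpt and nothing else about Mahout is assumed.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical stand-in that mirrors the truncation branch of the ClusterDumper excerpt. */
final class TruncationSketch {
    /**
     * Writes fmtStr in full, or, when subString > 0 and fmtStr is longer than
     * subString, writes ':' followed by the first subString characters.
     */
    static void writeFormatted(Writer writer, String fmtStr, int subString) throws IOException {
        if (subString > 0 && fmtStr.length() > subString) {
            writer.write(':');
            writer.write(fmtStr, 0, Math.min(subString, fmtStr.length()));
        } else {
            writer.write(fmtStr);
        }
    }

    /** Convenience wrapper that returns the written text as a String. */
    static String format(String fmtStr, int subString) {
        StringWriter writer = new StringWriter();
        try {
            writeFormatted(writer, fmtStr, subString);
        } catch (IOException e) {
            // StringWriter does not perform real I/O, so this is not expected.
            throw new IllegalStateException(e);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        String header = "VL-2{n=6 c=[1.833, 2.417] r=[0.687, 0.344]}";
        System.out.println(format(header, 4)); // prints ":VL-2"
        System.out.println(format(header, 0)); // prints the full header line
    }
}
```

Since a center dump wants the complete center vector, a custom class can skip this branch and always write the whole string.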
package com.caic.cloud.util;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Writer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.clustering.Cluster;
import org.apache.mahout.common.iterator.sequencefile.PathType;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterable;

import com.google.common.base.Charsets;
import com.google.common.io.Files;

/**
 * Just output the center vector to a given file
 * @author fansy
 */
public class ClusterCenterDump {
  private Log log = LogFactory.getLog(ClusterCenterDump.class);
  private Configuration conf;
  private Path centerPathDir;
  private String outputPath;

  /*public ClusterCenterDump() {}
  public ClusterCenterDump(Configuration conf) {
    this.conf = conf;
  }*/
  public ClusterCenterDump(Configuration conf, String centerPathDir, String outputPath) {
    this.conf = conf;
    this.centerPathDir = new Path(centerPathDir);
    this.setOutputPath(outputPath);
  }

  /**
   * Write the given cluster centers to the given local file
   * @return true if the centers were written, false otherwise
   * @throws FileNotFoundException
   */
  public boolean writeCenterToLocal() throws FileNotFoundException {
    if (this.conf == null || this.outputPath == null || this.centerPathDir == null) {
      log.info("error:\nshould initial the configuration, outputPath and centerPath");
      return false;
    }
    Writer writer = null;
    try {
      File outputFile = new File(outputPath);
      writer = Files.newWriter(outputFile, Charsets.UTF_8);
      this.writeTxtCenter(writer,
          new SequenceFileDirValueIterable<Cluster>(new Path(centerPathDir, "part-*"), PathType.GLOB, conf));
      // Alternative, left disabled:
      // new SequenceFileDirValueIterable<Writable>(new Path(centerPathDir, "part-r-00000"), PathType.LIST,
      //     PathFilters.partFilter(), conf));
      writer.flush();
    } catch (IOException e) {
      log.info("write error:\n" + e.getMessage());
      return false;
    } finally {
      try {
        if (writer != null) {
          writer.close();
        }
      } catch (IOException e) {
        log.info("close writer error:\n" + e.getMessage());
      }
    }
    return true;
  }

  /**
   * Write the clusters to writer, one formatted cluster per line
   * @param writer
   * @param clusters
   * @return true once every cluster has been written
   * @throws IOException
   */
  private boolean writeTxtCenter(Writer writer, Iterable<Cluster> clusters) throws IOException {
    for (Cluster cluster : clusters) {
      String fmtStr = cluster.asFormatString(null);
      System.out.println("fmtStr:" + fmtStr);
      writer.write(fmtStr);
      writer.write("\n");
    }
    return true;
  }

  public Configuration getConf() { return conf; }
  public void setConf(Configuration conf) { this.conf = conf; }
  public Path getCenterPathDir() { return centerPathDir; }
  public void setCenterPathDir(Path centerPathDir) { this.centerPathDir = centerPathDir; }
  /** @return the outputPath */
  public String getOutputPath() { return outputPath; }
  /** @param outputPath the outputPath to set */
  public void setOutputPath(String outputPath) { this.outputPath = outputPath; }
}
Here is a test class: