Data file used: data1.txt
@RELATION data1
@ATTRIBUTE one REAL
@ATTRIBUTE two REAL
@DATA
0.184000 0.482000
0.152000 0.540000
0.152000 0.596000
0.178000 0.626000
0.206000 0.598000
0.230000 0.562000
0.224000 0.524000
0.204000 0.540000
0.190000 0.572000
0.216000 0.608000
0.240000 0.626000
0.256000 0.584000
0.272000 0.546000
0.234000 0.468000
0.222000 0.490000
0.214000 0.414000
0.252000 0.336000
0.298000 0.336000
0.316000 0.376000
0.318000 0.434000
0.308000 0.480000
0.272000 0.408000
0.272000 0.462000
0.280000 0.524000
0.296000 0.544000
0.340000 0.534000
0.346000 0.422000
0.354000 0.356000
0.160000 0.282000
0.160000 0.282000
0.156000 0.398000
0.138000 0.466000
0.154000 0.442000
0.180000 0.334000
0.184000 0.300000
0.684000 0.420000
0.678000 0.494000
0.710000 0.592000
0.716000 0.508000
0.744000 0.528000
0.716000 0.540000
0.692000 0.540000
0.696000 0.494000
0.722000 0.466000
0.738000 0.474000
0.746000 0.484000
0.750000 0.500000
0.746000 0.440000
0.718000 0.446000
0.692000 0.466000
0.746000 0.418000
0.768000 0.460000
0.272000 0.290000
0.240000 0.376000
0.212000 0.410000
0.154000 0.564000
0.252000 0.704000
0.298000 0.714000
0.314000 0.668000
0.326000 0.566000
0.344000 0.468000
0.324000 0.632000
0.164000 0.688000
0.216000 0.684000
0.392000 0.682000
0.392000 0.628000
0.392000 0.518000
0.398000 0.502000
0.392000 0.364000
0.360000 0.308000
0.326000 0.308000
0.402000 0.342000
0.404000 0.418000
0.634000 0.458000
0.650000 0.378000
0.698000 0.348000
0.732000 0.350000
0.766000 0.364000
0.800000 0.388000
0.808000 0.428000
0.826000 0.466000
0.842000 0.510000
0.842000 0.556000
0.830000 0.594000
0.772000 0.646000
0.708000 0.654000
0.632000 0.640000
0.628000 0.564000
0.624000 0.352000
0.650000 0.286000
0.694000 0.242000
0.732000 0.214000
0.832000 0.214000
0.832000 0.264000
0.796000 0.280000
0.778000 0.288000
0.770000 0.294000
0.892000 0.342000
0.910000 0.366000
0.910000 0.394000
0.872000 0.382000
0.774000 0.314000
0.718000 0.252000
0.688000 0.284000
0.648000 0.322000
0.602000 0.460000
0.596000 0.496000
0.570000 0.550000
0.564000 0.592000
0.574000 0.624000
0.582000 0.644000
0.596000 0.664000
0.662000 0.704000
0.692000 0.722000
0.710000 0.736000
0.848000 0.732000
0.888000 0.686000
0.924000 0.514000
0.914000 0.470000
0.880000 0.492000
0.848000 0.706000
0.730000 0.736000
0.676000 0.734000
0.628000 0.732000
0.782000 0.708000
0.806000 0.674000
0.830000 0.630000
0.564000 0.730000
0.554000 0.538000
0.570000 0.502000
0.572000 0.432000
0.590000 0.356000
0.652000 0.232000
0.676000 0.178000
0.684000 0.152000
0.728000 0.172000
0.758000 0.148000
0.864000 0.176000
0.646000 0.242000
0.638000 0.254000
0.766000 0.276000
0.882000 0.278000
0.900000 0.278000
0.906000 0.302000
0.892000 0.316000
0.570000 0.324000
0.798000 0.150000
0.832000 0.114000
0.714000 0.156000
0.648000 0.154000
0.644000 0.212000
0.642000 0.250000
0.658000 0.284000
0.710000 0.296000
0.794000 0.288000
0.846000 0.260000
0.856000 0.304000
0.858000 0.392000
0.858000 0.476000
0.778000 0.640000
0.736000 0.662000
0.718000 0.690000
0.634000 0.692000
0.596000 0.710000
0.570000 0.720000
0.554000 0.732000
0.548000 0.686000
0.524000 0.740000
0.598000 0.768000
0.660000 0.796000
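Preface: K-means is a classic clustering algorithm. It assigns each object to a cluster according to the distance between the object and the cluster centers, so that every object joins the cluster whose center is nearest. It then alternates between updating the cluster centers and updating the cluster assignments, repeating until it converges.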
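The two alternating steps described above can be sketched in plain Java. This is a minimal illustration, not the code Weka runs: the class name `KMeansSketch`, the method names, and the choice of seeding the centers with the first k points are all assumptions made for this example. It also assumes k is no larger than the number of points.

```java
// Minimal k-means sketch. All names here are illustrative, not part of Weka.
class KMeansSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Returns the cluster index assigned to each point.
     * Assumption: the first k points seed the centers (requires k <= points.length).
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int n = points.length;
        int dim = points[0].length;

        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();
        }

        int[] assignment = new int[n];
        java.util.Arrays.fill(assignment, -1); // -1 means "not assigned yet"

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: each point joins the cluster with the nearest center.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = squaredDistance(points[i], centers[0]);
                for (int c = 1; c < k; c++) {
                    double dist = squaredDistance(points[i], centers[c]);
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // no assignment changed, so the algorithm has converged
            }

            // Update step: move each center to the mean of its assigned points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // empty cluster: keep its previous center
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two well-separated pairs of points; each pair should share a cluster.
        double[][] points = {{0.0, 0.0}, {0.1, 0.0}, {1.0, 1.0}, {1.1, 1.0}};
        int[] assignment = cluster(points, 2, 100);
        System.out.println(java.util.Arrays.toString(assignment));
    }
}
```

Seeding with the first k points keeps the sketch deterministic. Real implementations usually choose the initial centers randomly or with a smarter scheme, and the final clusters can depend on that choice.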
The following code calls the k-means implementation provided by the Weka package (SimpleKMeans):
package others;

import java.io.File;

import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class ArrayListTest {

    public static void main(String[] args) {
        Instances ins = null;
        SimpleKMeans KM = null;
        try {
            // Read the sample data (the file is in ARFF format despite the .txt extension)
            File file = new File("data/data1.txt");
            ArffLoader loader = new ArffLoader();
            loader.setFile(file);
            ins = loader.getDataSet();

            // Initialize the clusterer (load the algorithm)
            KM = new SimpleKMeans();
            KM.setNumClusters(4);   // set the number of clusters to produce
            KM.buildClusterer(ins); // run the clustering

            // Prints Weka's help text for the preserveInstancesOrder option,
            // not a clustering result
            System.out.println(KM.preserveInstancesOrderTipText());

            // Print the clustering result
            System.out.println(KM.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}