For a recent data mining lab I had to write a K-means algorithm. It is not hard to write, but while writing it I figured Python must already have a package for it. A senior student said not to use one, but once my own version was finished I tried the package as well. I was not very familiar with it, so I looked up some material and learned a little. My own implementation is still too rough to show; I will improve it and post it later.
1. Note that the initial points need to be converted to a numpy.array.
2. If you only need the cluster centers, calling the kmeans2 function directly is enough; the plotting code below is only for visualization.
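As a minimal sketch of note 2, the call below computes only the centers and labels, with no plotting. The variable names (points, data, centroids, labels) and the choice of minit='points' are assumptions made for illustration; the listing in this post uses the default initialization. The float dtype is there because some SciPy versions may reject integer input.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Same ten sample points as in the listing of this post.
points = [[2, 10], [2, 5], [8, 4], [5, 8], [7, 5],
          [6, 4], [1, 2], [4, 9], [7, 3], [1, 3]]

# Note 1: convert the plain list to a numpy array (float dtype, assumed here).
data = np.array(points, dtype=float)

# Note 2: one call returns the centers and the cluster index of each point.
# minit='points' (an assumption) starts from randomly chosen data points.
centroids, labels = kmeans2(data, 3, minit='points')

print("centers:\n", centroids)
print("cluster index per point:", labels)
```

The result changes from run to run because the starting centers are chosen at random.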
#!/usr/bin/python
import os
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, selected before pylab is imported
from scipy.cluster.vq import kmeans2
from matplotlib import pylab

pylab.close()

xy1 = [[2,10],[2,5],[8,4],[5,8],[7,5],[6,4],[1,2],[4,9],[7,3],[1,3]]
# convert the list to a numpy array; float dtype keeps kmeans2 happy on integer input
xy2 = numpy.array(xy1, dtype=float)
cluster_num = 3

res, idx = kmeans2(xy2, cluster_num)
print("Local centre points:\n", res)

# one RGB color per point, picked by its cluster index
colors = [([0.4,1,0.4],[1,0.4,0.4],[0.1,0.8,1])[i] for i in idx]

# plot colored points
pylab.scatter(xy2[:,0], xy2[:,1], c=colors)
# mark centroids as (X); s=500 as in the second listing
pylab.scatter(res[:,0], res[:,1], marker='o', s=500, linewidths=2, facecolors='none', edgecolors='black')
pylab.scatter(res[:,0], res[:,1], marker='x', s=500, linewidths=2)
#print(os.getcwd())
pylab.savefig('Pic.png')
#---------------------------------------------------------------------------
Reference: http://blog.csdn.net/brandohero/article/details/39967663
#!/usr/bin/python
# adapted from http://hackmap.blogspot.com/2007/09/k-means-clustering-in-scipy.html
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, selected before pylab is imported
from scipy.cluster.vq import kmeans2
from matplotlib import pylab

pylab.close()

# generate 3 sets of normally distributed points around
# different means with different variances
pt1 = numpy.random.normal(1, 0.2, (100,2))
pt2 = numpy.random.normal(2, 0.5, (300,2))
pt3 = numpy.random.normal(3, 0.3, (100,2))

# slightly move sets 2 and 3 (for a prettier output)
pt2[:,0] += 1
pt3[:,0] -= 0.5

xy = numpy.concatenate((pt1, pt2, pt3))

# kmeans for 3 clusters
res, idx = kmeans2(xy, 3)

# one RGB color per point, picked by its cluster index
colors = [([0.4,1,0.4],[1,0.4,0.4],[0.1,0.8,1])[i] for i in idx]

# plot colored points
pylab.scatter(xy[:,0], xy[:,1], c=colors)

# mark centroids as (X)
pylab.scatter(res[:,0], res[:,1], marker='o', s=500, linewidths=2, facecolors='none', edgecolors='black')
pylab.scatter(res[:,0], res[:,1], marker='x', s=500, linewidths=2)

pylab.savefig('/tmp/kmeans.png')
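For comparison with the package, the algorithm itself can be sketched in a few lines of numpy. This is not my own lab code and not SciPy's implementation; it is only a plain Lloyd-style iteration under stated assumptions. The function name kmeans_sketch, its parameters, and the choice to start from randomly picked data points are all hypothetical.

```python
import numpy as np

def kmeans_sketch(data, k, n_iter=100, seed=0):
    """Hypothetical helper: a plain Lloyd-style K-means iteration."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assignment step: each point goes to its nearest centroid.
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, labels

# Two well-separated groups of three points each.
demo = np.array([[0, 0], [0, 1], [1, 0],
                 [10, 10], [10, 11], [11, 10]], dtype=float)
demo_centroids, demo_labels = kmeans_sketch(demo, 2)
print(demo_centroids)
print(demo_labels)
```

Like kmeans2, a sketch of this kind only finds a local optimum, so a different seed can give a different result on harder data.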
#------------------------------------
If you reprint this post, please credit the source: http://write.blog.csdn.net/postedit/41158167
K-means in Python: a package-based solution