Python clustering algorithms: a detailed example of basic K-means
This article walks through an example implementation of the basic K-means clustering algorithm in Python. It is shared here for reference; the details are as follows:
Basic K-means: select K initial centers, where K is a user-specified parameter giving the expected number of clusters. In each iteration, every point is assigned to its nearest center, and the points assigned to the same center form one cluster. The center of each cluster is then updated based on the points assigned to it. The assignment and update steps are repeated until the centroids no longer change significantly.
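The assign-and-update loop described above can be sketched for any number of centers. This is only a minimal illustration, not the article's own listing: the function name kmeans, the parameters max_iter and tol, the choice to keep an empty cluster's old center, and the sample data are all assumptions made for the example.

```python
import math


def kmeans(points, centers, max_iter=50, tol=1e-6):
    """Cluster 2-D points around the given initial centers.

    Returns (centers, groups). Hypothetical helper for illustration.
    """
    centers = [list(c) for c in centers]
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            groups[dists.index(min(dists))].append(p)

        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append([sum(p[0] for p in g) / len(g),
                                    sum(p[1] for p in g) / len(g)])
            else:
                new_centers.append(c)  # assumption: an empty cluster keeps its center

        # Stop once no center has moved by more than tol.
        shift = max(math.hypot(a[0] - b[0], a[1] - b[1])
                    for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, groups


# Made-up sample data: two well-separated groups of three points each.
sample = [[1, 1], [1, 2], [2, 1], [10, 10], [10, 11], [11, 10]]
final_centers, final_groups = kmeans(sample, [[0, 0], [12, 12]])
```

Stopping on a tolerance instead of a fixed iteration count is one way to express "until the centroids no longer change significantly"; the max_iter cap simply guards against a very slow run.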
# coding=utf-8
import pylab as pl

# Read the points: each line of the file "points" has the form "x#y"
points = [[int(eachpoint.split("#")[0]), int(eachpoint.split("#")[1])] for eachpoint in open("points", "r")]

# Specify three initial centers
currentCenter1 = [20, 190]; currentCenter2 = [170, 140]
currentCenter3 = []  # the third center's coordinates are missing here; supply an [x, y] pair before running

# Mark the initial centers with black circles
pl.plot([currentCenter1[0]], [currentCenter1[1]], 'ok')
pl.plot([currentCenter2[0]], [currentCenter2[1]], 'ok')
pl.plot([currentCenter3[0]], [currentCenter3[1]], 'ok')

# Record the trajectory of each cluster's center across iterations
center1 = [currentCenter1]; center2 = [currentCenter2]; center3 = [currentCenter3]

# The three clusters
group1 = []; group2 = []; group3 = []

for runtime in range(50):
    group1 = []; group2 = []; group3 = []
    for eachpoint in points:
        # Squared distance from the point to each of the three centers
        distance1 = pow(abs(eachpoint[0] - currentCenter1[0]), 2) + pow(abs(eachpoint[1] - currentCenter1[1]), 2)
        distance2 = pow(abs(eachpoint[0] - currentCenter2[0]), 2) + pow(abs(eachpoint[1] - currentCenter2[1]), 2)
        distance3 = pow(abs(eachpoint[0] - currentCenter3[0]), 2) + pow(abs(eachpoint[1] - currentCenter3[1]), 2)

        # Assign the point to the cluster with the nearest center
        mindis = min(distance1, distance2, distance3)
        if mindis == distance1:
            group1.append(eachpoint)
        elif mindis == distance2:
            group2.append(eachpoint)
        else:
            group3.append(eachpoint)

    # After all points are assigned, update the center of each cluster
    # (this raises ZeroDivisionError if a cluster ends up empty)
    currentCenter1 = [sum([eachpoint[0] for eachpoint in group1]) / len(group1), sum([eachpoint[1] for eachpoint in group1]) / len(group1)]
    currentCenter2 = [sum([eachpoint[0] for eachpoint in group2]) / len(group2), sum([eachpoint[1] for eachpoint in group2]) / len(group2)]
    currentCenter3 = [sum([eachpoint[0] for eachpoint in group3]) / len(group3), sum([eachpoint[1] for eachpoint in group3]) / len(group3)]

    # Record this centroid update
    center1.append(currentCenter1)
    center2.append(currentCenter2)
    center3.append(currentCenter3)

# Plot all points, colored by the cluster each one belongs to
pl.plot([eachpoint[0] for eachpoint in group1], [eachpoint[1] for eachpoint in group1], 'or')
pl.plot([eachpoint[0] for eachpoint in group2], [eachpoint[1] for eachpoint in group2], 'oy')
pl.plot([eachpoint[0] for eachpoint in group3], [eachpoint[1] for eachpoint in group3], 'og')

# Plot the trajectory of each cluster's center
for center in [center1, center2, center3]:
    pl.plot([eachcenter[0] for eachcenter in center], [eachcenter[1] for eachcenter in center], 'k')

pl.show()
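The script reads its input from a file named points, with one point per line written as x#y. The article does not include that file, so the following is only a sketch of how test data in this format might be produced and read back. The helper names write_points and read_points, and the parameters per_cluster, spread and seed, are hypothetical and chosen for illustration.

```python
import random


def write_points(path, clusters, per_cluster=30, spread=15, seed=0):
    """Write integer points scattered around each (x, y) in clusters.

    Each output line has the form "x#y". Hypothetical helper.
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w") as f:
        for cx, cy in clusters:
            for _ in range(per_cluster):
                x = int(rng.gauss(cx, spread))
                y = int(rng.gauss(cy, spread))
                f.write(f"{x}#{y}\n")


def read_points(path):
    """Read "x#y" lines back into a list of [x, y] integer pairs."""
    with open(path) as f:
        return [[int(v) for v in line.split("#")] for line in f if line.strip()]
```

For example, calling write_points("points", [(30, 180), (160, 150), (200, 30)]) would create a file with three loose groups of points; those locations are made up for the example and are not taken from the article's data.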
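Running the script displays a plot of all points colored by cluster (red, yellow and green), with black lines tracing how each center moved across iterations.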
I hope this article will help you with Python programming.