3. K-means algorithm:
3.1 A classical clustering algorithm, and one of the ten classical algorithms of data mining.
3.2 The algorithm accepts a parameter k and divides the n data objects into k clusters so that the resulting clusters satisfy: objects in the same cluster are highly similar, while objects in different clusters have low similarity.
3.3 Algorithm idea: cluster around k center points in space, assigning each object to the center closest to it. The value of each cluster center is updated iteratively until the best clustering result is obtained.
3.4 Algorithm description:
(1) Choose suitable initial centers for the c classes (c here plays the role of k above);
(2) In each iteration, for every sample, compute its distance to each of the c centers and assign the sample to the class whose center is nearest;
(3) Update the center value of each class, using the mean or a similar method;
(4) For all c cluster centers, if the values remain unchanged after the updates in (2) and (3), the iteration ends; otherwise the iteration continues.
3.5 Algorithm flow:
Input: k, data[n];
(1) Select k initial center points, for example c[0]=data[0], ..., c[k-1]=data[k-1];
(2) For each of data[0], ..., data[n-1], compare it with c[0], ..., c[k-1]; if its difference from c[i] is the smallest, label it i;
(3) For all points labeled i, recalculate c[i] = {sum of all data[j] labeled i} / {number of points labeled i};
(4) Repeat (2) and (3) until the change in every c[i] is less than the given threshold.
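The flow in 3.5 can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name kmeans, the parameters threshold and max_iter, the use of Euclidean distance as the "difference", and the rule that an empty cluster keeps its old center are all assumptions that the text above does not specify.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(data, k, threshold=1e-6, max_iter=100):
    """Hypothetical helper: cluster a list of equal-length numeric tuples."""
    # (1) Use the first k points as the initial centers, as in the example above.
    centers = [tuple(p) for p in data[:k]]
    labels = [0] * len(data)
    for _ in range(max_iter):  # max_iter is an assumed safety cap
        # (2) Label each point with the index of its nearest center.
        labels = [min(range(k), key=lambda i: dist(p, centers[i])) for p in data]
        # (3) Recompute each center as the mean of the points labeled i.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(data, labels) if lab == i]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centers.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[i])
        # (4) Stop once every center has moved less than the threshold.
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < threshold:
            break
    return centers, labels
```

Calling kmeans(points, 2) on a list of 2-D tuples returns the final centers and one label per point. Taking the first k points as initial centers follows the example in step (1); in practice the result can depend heavily on that choice.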
Machine Learning (5): K-means algorithm