function [L,C] = kmeanspp(X,k)
%KMEANSPP Cluster multivariate data using the k-means++ algorithm.
%   [L,C] = kmeanspp(X,k) produces a 1-by-size(X,2) vector L with one class
%   label per column in X and a size(X,1)-by-k matrix C containing the
%   centers corresponding to each class.
%   Version: 2013-02-08
%   Authors: Laurent Sorber ([email protected])
L = [];
L1 = 0;
while length(unique(L)) ~= k
    % The k-means++ initialization.
    % The first center is a data point picked at random from X.
    C = X(:,1+round(rand*(size(X,2)-1))); % size(X,2) is the number of data points in X; C collects the centers
    L = ones(1,size(X,2));
    for i = 2:k
        D = X-C(:,L); % C(:,L) expands C to one column per data point (its assigned center), so D holds every x-c
        D = cumsum(sqrt(dot(D,D,1))); % Euclidean distance from each point to its center, accumulated as a running sum
        if D(end) == 0, C(:,i:k) = X(:,ones(1,k-i+1)); return; end % every point sits on a center: fill the remaining centers and stop
        C(:,i) = X(:,find(rand < D/D(end),1)); % the second argument of find limits how many indices are returned; the farther a point is from its center, the more likely it is picked
        [~,L] = max(bsxfun(@minus,2*real(C'*X),dot(C,C,1).')); % the clever part: this one line assigns every data point to its nearest center
    end
    % The k-means algorithm.
    % any returns 1 if the array contains a nonzero element and 0 otherwise.
    while any(L ~= L1)
        L1 = L;
        for i = 1:k, l = L==i; C(:,i) = sum(X(:,l),2)/sum(l); end % l is a logical mask marking the members of cluster i
        [~,L] = max(bsxfun(@minus,2*real(C'*X),dot(C,C,1).'),[],1);
    end
end
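
The seeding loop in the function above picks each new center with find(rand < D/D(end),1), where D is a running sum of distances. The following is a minimal sketch of that sampling step in isolation. The distance vector d and the number of draws are made-up values for illustration, not taken from the original code.

```matlab
% Hypothetical distances from five points to their nearest center.
d = [0 1 1 2 4];
D = cumsum(d);                 % running sum; D(end) is the total distance

nDraws = 10000;                % illustrative number of repetitions
counts = zeros(1, numel(d));
for t = 1:nDraws
    % First index whose cumulative share exceeds a uniform random number.
    idx = find(rand < D/D(end), 1);
    counts(idx) = counts(idx) + 1;
end

disp(counts / nDraws)          % empirical frequencies of each index
disp(d / sum(d))               % target probabilities, proportional to d
```

The two printed rows should be close to each other, and the point with distance 0 should never be drawn. Note that this listing weights points by their distance, whereas the k-means++ paper by Arthur and Vassilvitskii weights them by squared distance.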
% Test script (run it separately from the function file).
clear all; close all; clc
X = [randn(3,2)*.4; randn(4,2)*.5+ones(4,1)*[4 4]];
[L, C] = kmeanspp(X', 2);
L
C
This k-means++ code is very well written; it took me a long time to understand it.
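
The line that probably takes the longest to unpack is the assignment with max and bsxfun. Since ||x-c||^2 = ||x||^2 - 2*c'*x + ||c||^2 and ||x||^2 does not depend on the center, the nearest center is the one that maximizes 2*c'*x - ||c||^2. The sketch below compares that one-liner with a brute-force distance computation; the data X and centers C are made-up toy values, and L_fast, L_slow and D2 are names introduced here for illustration.

```matlab
% Hypothetical toy data: 2-D points as columns of X, two centers as columns of C.
X = [0 1 4 5 0.5; 0 1 4 5 0.2];
C = [0 4; 0 4];

% One-line assignment as used in the k-means loop.
[~, L_fast] = max(bsxfun(@minus, 2*real(C'*X), dot(C,C,1).'), [], 1);

% Brute-force assignment: explicit squared Euclidean distance to every center.
n = size(X,2);
k = size(C,2);
D2 = zeros(k, n);
for j = 1:k
    diffs = bsxfun(@minus, X, C(:,j));   % x - c_j for every data point
    D2(j,:) = dot(diffs, diffs, 1);      % squared distance to center j
end
[~, L_slow] = min(D2, [], 1);

disp(L_fast)
disp(isequal(L_fast, L_slow))            % true when both labelings agree
```

For this toy data both approaches label the points near the origin as cluster 1 and the points near (4,4) as cluster 2. The one-liner avoids the loop over centers and never forms the differences explicitly.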