What's K-means clustering
K-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
Given a set of observations $(x_1, x_2, \ldots, x_n)$, where each observation is a $d$-dimensional real vector, K-means clustering aims to partition the $n$ observations into $k$ sets ($k \le n$), $S = \{S_1, S_2, \ldots, S_k\}$, so as to minimize the within-cluster sum of squares (WCSS):
$\underset{\mathbf{S}}{\operatorname{arg\,min}} \sum_{i=1}^{k} \sum_{\mathbf{x}_j \in S_i} \left\| \mathbf{x}_j - \boldsymbol{\mu}_i \right\|^2$
where $\boldsymbol{\mu}_i$ is the mean of points in $S_i$.
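To make the objective concrete, here is a minimal NumPy sketch that evaluates the WCSS for a given partition; the function name `wcss` and the representation of the partition as a label vector are assumptions for illustration, not part of the definition above.

```python
import numpy as np

def wcss(X, labels, means):
    """Within-cluster sum of squares for a given partition.

    X      : (n, d) array of observations x_1, ..., x_n
    labels : (n,) array, labels[p] = index i of the cluster S_i containing x_p
    means  : (k, d) array whose i-th row is the cluster mean mu_i
    """
    # Sum of squared Euclidean distances from each point to its cluster mean
    return sum(np.sum((X[labels == i] - mu) ** 2) for i, mu in enumerate(means))
```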
Algorithm
1. Assignment step: $S_i^{(t)} = \big\{ x_p : \big\| x_p - m_i^{(t)} \big\|^2 \le \big\| x_p - m_j^{(t)} \big\|^2 \ \forall j, 1 \le j \le k \big\}$,
where each $x_p$ is assigned to exactly one $S^{(t)}$, even if it could be assigned to two or more of them (both steps are sketched in code after this list).
2. Update step: Calculate the new means to be the centroids of the observations in the new clusters.
$m_i^{(t+1)} = \frac{1}{|S_i^{(t)}|} \sum_{x_j \in S_i^{(t)}} x_j$
Since the arithmetic mean is a least-squares estimator, this also minimizes the within-cluster sum of squares (WCSS) objective.
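The two steps above translate directly into code. The following is a minimal sketch of the iteration (Lloyd's algorithm) under assumed choices the text does not specify: random initialization of the means, a fixed iteration cap, and keeping the old mean when a cluster becomes empty.

```python
import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    """Alternate the assignment and update steps until the means stop moving."""
    rng = np.random.default_rng(seed)
    # Initialize m_1^{(0)}, ..., m_k^{(0)} with k distinct observations (an assumed choice)
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each x_p joins the cluster whose current mean is nearest;
        # argmin breaks ties by the lowest index, so each point lands in exactly one set
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each new mean is the centroid of the observations assigned to it
        new_means = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else means[i]
            for i in range(k)
        ])
        if np.allclose(new_means, means):  # converged: the assignments can no longer change
            break
        means = new_means
    return labels, means
```

For example, `labels, means = k_means(data, k=3)` with `data` of shape `(n, d)` returns a cluster index for each observation along with the final means; since each step can only decrease the WCSS (or leave it unchanged), the iteration converges to a local optimum of the objective above.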