The K-nearest neighbor algorithm (KNN) is a basic classification and regression method, while K-means is a basic clustering method.
K Nearest Neighbor algorithm (KNN)
The basic idea: given a sample, find the K training samples most similar to it (that is, the nearest in feature space); if the majority of those K samples belong to a certain category, the sample is assigned to that category.
Impact factors:
- The choice of the K value. A small K gives a small approximation error but a large estimation error; a large K gives a large approximation error but a small estimation error. (The approximation error is the algorithm's error on the training set; the estimation error is its error on the test set.) When the approximation error is small, overfitting can occur: the model predicts well on the training set but performs poorly on unseen test data, so it is not the optimal model. When the estimation error is small, the model has a small error on unseen test data and is closer to the optimal model;
- The calculation of distance. Closeness between samples is measured by their distance in feature space, usually the Euclidean or Manhattan distance;
- Classification decision rule. Majority voting among the K neighbors is commonly used.
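The KNN classification described above (Euclidean distance plus majority vote) can be sketched in pure Python. This is a minimal illustration, not a reference implementation; the helper names `euclidean` and `knn_classify` and the toy data are assumptions of this sketch:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(query, samples, labels, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # rank training samples by distance to the query, keep the k closest
    nearest = sorted(range(len(samples)), key=lambda i: euclidean(query, samples[i]))[:k]
    # majority vote over the labels of those k neighbors
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# toy data: two well-separated groups of 2-D points
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
y = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((1.1, 1.0), X, y, k=3))  # → A
print(knn_classify((5.1, 5.0), X, y, k=3))  # → B
```

Note that there is no training step: all work happens at query time, which is why KNN is often called a lazy learner.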
K-means algorithm
The basic steps of the K-means algorithm are as follows:
- Randomly select K samples as the initial center points;
- Compute the distance from each remaining sample to the K center points, and assign each sample to its nearest center;
- Recompute the mean of the samples in each of the K clusters to obtain K new center points;
- Repeat steps 2 and 3 until the assignment of samples to centers no longer changes, or a maximum number of iterations is reached.
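The four steps above can be sketched as follows. This is a minimal pure-Python sketch for 2-D points; the function name `kmeans` and the fixed random seed are assumptions of the example:

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on 2-D points, following the four steps above."""
    rng = random.Random(seed)
    # step 1: randomly pick k samples as the initial centers
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        # step 2: assign each sample to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k),
                      key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2)
            clusters[idx].append(p)
        # step 3: recompute each center as the mean of its cluster
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centers.append((sum(p[0] for p in cluster) / len(cluster),
                                    sum(p[1] for p in cluster) / len(cluster)))
            else:
                new_centers.append(centers[i])  # empty cluster keeps its old center
        # step 4: stop once the centers no longer move
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

Because the initial centers are chosen at random, K-means can converge to different local optima on different runs; in practice it is often restarted several times and the result with the lowest within-cluster distance is kept.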
Comparison of the K-nearest neighbor algorithm (KNN) and the K-means algorithm:
- KNN is a supervised method for classification and regression (it requires labeled samples), while K-means is an unsupervised clustering method (it works on unlabeled data);
- In KNN, K is the number of nearest neighbors consulted for the vote; in K-means, K is the number of clusters;
- KNN has no explicit training phase (it compares a query against the stored samples directly), while K-means iteratively updates its cluster centers until convergence.