1. Overview
1.1 Principle: classify by measuring the distance between feature values
We start with a collection of sample data, the training sample set. Each sample in the set has several features and a label, so we know both the data and the class it belongs to. When a new, unlabeled sample comes in, we compare each of its features with the corresponding features of the samples in the training set. Then, using a distance measure (Euclidean distance in this section), we pick out the training samples that are most similar to the new one and read off their labels. Usually only the K most similar samples are considered, which is where the K in K-nearest neighbor comes from. Finally, the class that occurs most often among those K samples is assigned to the new sample.
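The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names `euclidean_distance` and `knn_classify`, the default `k=3`, and the four toy samples with labels "A" and "B" are all assumptions made for this example and do not come from the original text. Distance is the plain Euclidean distance, i.e. the square root of the sum of squared feature differences.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, samples, labels, k=3):
    """Return the most frequent label among the k samples closest to query."""
    # Pair every training sample's distance to the query with its label.
    scored = [
        (euclidean_distance(query, sample), label)
        for sample, label in zip(samples, labels)
    ]
    # Sort by distance only, then keep the k nearest neighbors.
    scored.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in scored[:k]]
    # Majority vote over the labels of the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy training set: two features per sample, two classes.
samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

print(knn_classify((0.1, 0.2), samples, labels, k=3))
```

With this toy data the query point (0.1, 0.2) lies closest to the two "B" samples, so the vote among its three nearest neighbors yields "B". In practice the features are often scaled to a common range first, since a feature with large numeric values would otherwise dominate the distance.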
1.2 KNN algorithm
Using a set of test data:
K-Nearest Neighbor algorithm (KNN)