K-Nearest Neighbor algorithm (KNN)
K Nearest neighbor algorithm is to determine an exact number k value, and then in the vicinity of the current point to find the minimum distance k points, and then in this k point inside these points are classified, the most probability of the most current point of classification.
Sorting step:
1, calculates the distance between the point in the known dataset and the current point.
2, to sort the distance.
3, select the nearest K point
4, to determine the probability of the first K-point occurrence category
5, returning the highest frequency of the first K points as the current point of prediction classification
Characteristics:
1,KNN algorithm itself is simple and effective, is a lazy-learning algorithm
2, the classifier does not need the training set to practice, the training time is 0
3, the time complexity of the classifier and training of several documents in direct proportion, that is, if the number of documents is N, then the time complexity of KNN is O (n).
If it is a numerical value, it is certainly feasible to find the distance, but if it is a picture. The picture is not a numerical value of this argument, so ask for distance, you can only use pixels. The pixel subtraction of two images is the distance.
Find the distance formula, find the adjacent distance and Add. Like what:
Python code:
You can use the second method of finding a distance, the square difference:
But k nearest neighbor is actually not suitable for the picture classification, the effect is not good, because for example has one picture is the sky and the airplane, another is the train and the airplane, two picture's background if is similar, their distance will be very small, thus will cause the error.