The k-nearest neighbor (KNN) classification algorithm is one of the simplest machine learning algorithms.
Because KNN determines a sample's category mainly from a limited number of neighboring samples, rather than by discriminating class domains, it is better suited than other methods to sample sets whose class domains overlap or intersect heavily.
The main steps of the algorithm are (a sketch follows the list):
Compute the Euclidean or Mahalanobis distances from the target sample.
Heuristically select the optimal neighborhood size k based on the RMSE from cross-validation.
Compute the inverse-distance-weighted average of the k nearest neighbors.
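The following is a minimal sketch of these three steps for a KNN regression setting, using a synthetic dataset; all names (knn_predict, cv_rmse, X_train, y_train) are illustrative and not from the original post.

```python
# Minimal KNN sketch: distance, cross-validated choice of k, inverse-distance-weighted average.
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(100, 2))                      # synthetic training samples
y_train = X_train[:, 0] + 0.5 * X_train[:, 1] + rng.normal(0, 0.1, 100)

def knn_predict(X_tr, y_tr, x_query, k):
    """Inverse-distance-weighted average of the k nearest neighbors."""
    d = np.linalg.norm(X_tr - x_query, axis=1)                   # Euclidean distances
    idx = np.argsort(d)[:k]                                      # indices of the k nearest neighbors
    w = 1.0 / (d[idx] + 1e-12)                                   # reciprocal-distance weights
    return np.sum(w * y_tr[idx]) / np.sum(w)

def cv_rmse(X_tr, y_tr, k, n_folds=5):
    """Cross-validation RMSE for a given k, used to pick k heuristically."""
    folds = np.array_split(np.arange(len(X_tr)), n_folds)
    fold_mse = []
    for fold in folds:
        mask = np.ones(len(X_tr), dtype=bool)
        mask[fold] = False                                       # hold out this fold
        preds = [knn_predict(X_tr[mask], y_tr[mask], X_tr[i], k) for i in fold]
        fold_mse.append(np.mean((np.array(preds) - y_tr[fold]) ** 2))
    return np.sqrt(np.mean(fold_mse))

best_k = min(range(1, 21), key=lambda k: cv_rmse(X_train, y_train, k))
print("best k:", best_k)
print("prediction:", knn_predict(X_train, y_train, np.array([5.0, 5.0]), best_k))
```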
However, because KNN is computationally expensive and therefore time-consuming, Ko and Seo proposed TCFP (text categorization using feature projection), which uses feature projection to reduce the influence of features irrelevant to classification and to improve system efficiency. Experimental results show that its classification performance is similar to that of KNN, while its running time is only about one-fiftieth of KNN's.
Besides efficiency, there is also research on improving KNN's effectiveness for document classification. For example, in 2002 Han et al. used a greedy method, WAKNN (weighted adjusted k nearest neighbor), to adjust feature weights for document classification and improve classification performance.
In 2004, Li et al. noted that, because the number of documents differs across categories, the number of nearest neighbors participating in classification should also vary according to how many training documents each category contains.
Summary (a minimal classification sketch follows):
1. Choose the value of k, generally no larger than 20.
2. Compute the Euclidean or Manhattan distance to each training sample.
3. Determine the sample's class from the classes of its k nearest points.
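Below is a minimal sketch of these three summarized steps as a majority-vote classifier on a toy two-class dataset; the function name knn_classify and the data are illustrative assumptions, not from the original post.

```python
# Minimal KNN classifier: k nearest points by Euclidean distance, majority vote.
from collections import Counter
import numpy as np

def knn_classify(X_train, y_train, x_query, k=5):
    """Classify x_query by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x_query, axis=1)   # step 2: Euclidean distances
    nearest = np.argsort(dists)[:k]                     # step 1: indices of the k nearest (k kept small)
    votes = Counter(y_train[nearest])                   # step 3: count class labels among neighbors
    return votes.most_common(1)[0][0]

X_train = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
y_train = np.array(["A", "A", "B", "B"])
print(knn_classify(X_train, y_train, np.array([0.2, 0.1]), k=3))   # -> "B"
```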