The Back-End Programmer's Road 12: K-Nearest Neighbor (KNN) Classification Algorithm

Source: Internet
Author: User

The k-nearest neighbor (KNN) classification algorithm is one of the simplest machine learning algorithms.
Because KNN determines a sample's class mainly from a limited number of surrounding samples, rather than by discriminating class regions, it tends to be better suited than other methods to sample sets whose class regions cross or overlap.
The steps of the algorithm are:
Compute the Euclidean or Mahalanobis distance from the target sample to the sampled points.
Select a heuristically optimal number k of neighbors based on the RMSE obtained from cross-validation.
Compute the inverse-distance-weighted average of the k nearest multivariate neighbors.
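
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses Euclidean distance (Mahalanobis is not shown), leave-one-out as the cross-validation scheme, and the names `knn_predict`, `loo_rmse`, `choose_k` and the small `eps` constant are made up for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, xs, ys, k, eps=1e-12):
    """Inverse-distance-weighted average of the k nearest targets.

    eps (an assumption of this sketch) avoids division by zero when
    the query coincides with a training point.
    """
    nearest = sorted(
        ((euclidean(query, x), y) for x, y in zip(xs, ys)),
        key=lambda pair: pair[0],
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

def loo_rmse(xs, ys, k):
    """Leave-one-out cross-validated RMSE for a given k."""
    squared_error = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        squared_error += (knn_predict(xs[i], rest_x, rest_y, k) - ys[i]) ** 2
    return math.sqrt(squared_error / len(xs))

def choose_k(xs, ys, candidates):
    """Pick the candidate k with the lowest cross-validated RMSE."""
    return min(candidates, key=lambda k: loo_rmse(xs, ys, k))

# Toy data: one feature, target equal to the feature value.
xs = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
ys = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

best_k = choose_k(xs, ys, [1, 2, 3])
print(best_k, knn_predict((2.5,), xs, ys, best_k))
```

Leave-one-out is convenient for tiny data sets like this one; on larger data a k-fold split would be the more usual choice.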

Machine learning (a): k-nearest neighbor (KNN) algorithm - oyabea - Blog Park
http://www.cnblogs.com/ybjourney/p/4702562.html

A preliminary study of the k-nearest neighbor algorithm - log of the mountain - NetEase blog
http://blog.163.com/[email protected]/blog/static/127857195201472462736764/

However, KNN is computationally expensive and therefore quite time-consuming. Ko and Seo proposed an algorithm called TCFP (text categorization using feature projection), which uses feature projection to reduce the influence of classification-irrelevant features on the system and to improve its efficiency. Their experimental results show that its classification performance is similar to that of the k-nearest neighbor method, while its running time is only one-fiftieth of what the k-nearest neighbor method requires.

Besides efficiency, there are also studies on how to improve the classification quality of the k-nearest neighbor method for document classification. For example, in 2002 Han et al. used a greedy method to adjust weights for document classification, giving the weight-adjusted k-nearest neighbor method WAKNN (weighted adjusted k nearest neighbor), in order to improve classification performance.
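
To give a feel for what greedy weight adjustment can look like, here is a heavily simplified sketch. It is not Han et al.'s WAKNN procedure: the weighted Euclidean distance, the multiplicative `factors`, the leave-one-out accuracy objective, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter

def weighted_distance(a, b, w):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))

def loo_accuracy(xs, ys, w, k):
    """Leave-one-out accuracy of a k-NN majority vote under weights w."""
    hits = 0
    for i in range(len(xs)):
        rest = [
            (weighted_distance(xs[i], x, w), y)
            for j, (x, y) in enumerate(zip(xs, ys))
            if j != i
        ]
        rest.sort(key=lambda pair: pair[0])
        vote = Counter(y for _, y in rest[:k]).most_common(1)[0][0]
        hits += vote == ys[i]
    return hits / len(xs)

def greedy_weights(xs, ys, k=1, factors=(0.2, 5.0), rounds=3):
    """Greedily rescale one feature weight at a time.

    A change is kept only if it strictly improves leave-one-out accuracy.
    """
    w = [1.0] * len(xs[0])
    best = loo_accuracy(xs, ys, w, k)
    for _ in range(rounds):
        improved = False
        for f in range(len(w)):
            for factor in factors:
                trial = w[:]
                trial[f] *= factor
                score = loo_accuracy(xs, ys, trial, k)
                if score > best:
                    w, best, improved = trial, score, True
        if not improved:
            break
    return w, best

# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
xs = [(0.0, 0.0), (0.2, 2.0), (1.0, 0.1), (1.2, 2.1)]
ys = ["A", "A", "B", "B"]

weights, accuracy = greedy_weights(xs, ys)
print(weights, accuracy)
```

On this toy data the unweighted 1-NN vote is misled by the noisy second feature, and the greedy search ends up giving the first feature the larger weight.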

In 2004, Li et al. argued that, because the number of documents differs from category to category, the number of nearest neighbors taking part in the classification should also differ by category, according to how many documents each category has in the training set.

Summary:
1. Choose the value of k, generally less than 20
2. Calculate the Euclidean distance or Manhattan distance
3. Determine the sample's class from the classes of its k nearest points
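
The three summary steps map onto a short brute-force classifier. The following is a minimal sketch, not a tuned implementation; the function name `knn_classify` and the toy data are invented for this example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_classify(query, samples, labels, k=3, distance=euclidean):
    """Classify query by majority vote among its k nearest samples."""
    # Step 2: distance from the query to every training sample.
    ranked = sorted(
        zip(samples, labels),
        key=lambda pair: distance(query, pair[0]),
    )
    # Step 3: the most common label among the k nearest wins.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

# Step 1: k is chosen by the caller.
print(knn_classify((0.1, 0.2), samples, labels, k=3))
print(knn_classify((0.9, 1.0), samples, labels, k=3, distance=manhattan))
```

Sorting every sample for each query is what makes plain KNN expensive on large training sets.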
