The K-Nearest Neighbor Algorithm for Machine Learning

Source: Internet
Author: User

I have recently been studying the book "Machine Learning in Action" and took some notes, which I share with everyone below:

One, an overview of the K-nearest neighbor algorithm (KNN)

The simplest, most rudimentary classifier just records the class of every training sample and classifies a test object only when its attributes exactly match those of some training object. But it is unlikely that every test object will find an exactly matching training object. A test object may also match several training objects at once, so that it ends up assigned to more than one class. KNN arose to address these problems.

KNN classifies by measuring the distance between feature values. The idea is that if most of the k samples most similar to a given sample (that is, its nearest neighbors in the feature space) belong to a certain category, then the sample also belongs to that category. K is usually an integer no greater than 20. In the KNN algorithm, the selected neighbors are objects that have already been correctly classified. In making the classification decision, the method looks only at the categories of the nearest one or several samples to decide the category of the sample being classified.
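As a small illustration of the distance measurement described above, here is a minimal Python sketch of the Euclidean distance between two feature vectors. The function name euclidean_distance and the sample vectors are assumptions made for illustration; they do not come from the book.

```python
import math

def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical two-dimensional samples, chosen only for illustration.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Other distance measures could be used in the same place; Euclidean distance is shown because it is the one the pseudo-code in these notes uses.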

The following simple example asks which class the green circle should be assigned to: the red triangles or the blue squares? If k=3, red triangles make up 2/3 of the nearest neighbors, so the green circle is assigned to the red triangle class. If k=5, blue squares make up 3/5 of the nearest neighbors, so the green circle is assigned to the blue square class.
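The effect of k in this example can be reproduced with a short Python sketch. The coordinates below are hypothetical stand-ins for the figure, chosen so that the three nearest neighbors are two red triangles and one blue square, and the five nearest are two red triangles and three blue squares. The helper name vote is also an assumption.

```python
import math
from collections import Counter

# Hypothetical coordinates; only the ordering of the distances matters here.
training = [
    ((1.0, 0.0), "red triangle"),
    ((0.0, 1.2), "red triangle"),
    ((-1.5, 0.0), "blue square"),
    ((0.0, -2.0), "blue square"),
    ((2.5, 0.0), "blue square"),
    ((4.0, 4.0), "red triangle"),
]
green_circle = (0.0, 0.0)

def vote(point, samples, k):
    """Return (label, share) for the majority label among the k nearest samples."""
    nearest = sorted(samples, key=lambda s: math.dist(point, s[0]))[:k]
    label, count = Counter(lbl for _, lbl in nearest).most_common(1)[0]
    return label, count / k

print(vote(green_circle, training, 3))  # majority is "red triangle" (2 of 3)
print(vote(green_circle, training, 5))  # majority is "blue square" (3 of 5)
```

The same point gets a different label depending on k, which is why the choice of k matters in practice.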

Two, the algorithm pseudo-code:

Algorithm for finding the K nearest neighbors: KNN(A[n], k)

Input: A[n], the coordinates of the n training samples in the feature space; k, the number of nearest neighbors; X, the test sample

Output: the category that X belongs to

Take A[1]~A[k] as the initial nearest neighbors of X and compute the Euclidean distance D(X, A[i]) between each of them and the test sample X, i = 1, 2, ..., k; sort them in ascending order of D(X, A[i]) and record the distance between the farthest neighbor and X: d <- max{D(X, A[j]) | j = 1, ..., k};

for (i = k+1; i <= n; i++) {

  compute the distance D(X, A[i]) between A[i] and X;

  if (D(X, A[i]) < d) {

    replace the farthest neighbor with A[i];

    re-sort the k neighbors in ascending order of their distance to X and update d to the distance between the farthest neighbor and X;

  }

}

Compute the proportion of each category among the k nearest neighbors; the category with the largest proportion is the category of sample X.
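The pseudo-code above can be sketched in Python roughly as follows. This is a minimal sketch, not the book's implementation: the function name knn_classify, its parameters, and the tiny data set are assumptions made for illustration. It follows the same steps: start from the first k samples, replace the farthest neighbor whenever a closer sample is found, then take a majority vote.

```python
import math
from collections import Counter

def knn_classify(x, samples, labels, k):
    """Classify x by majority vote among its k nearest training samples.

    samples: list of coordinate tuples (A[1..n] in the pseudo-code)
    labels:  class label of each sample, in the same order
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")

    # Initial neighbors: the first k samples, sorted by distance to x (ascending).
    neighbors = sorted(
        ((math.dist(x, samples[i]), labels[i]) for i in range(k)),
        key=lambda pair: pair[0],
    )
    d = neighbors[-1][0]  # distance of the farthest current neighbor

    # Scan the remaining samples; a closer sample replaces the farthest neighbor.
    for i in range(k, len(samples)):
        dist = math.dist(x, samples[i])
        if dist < d:
            neighbors[-1] = (dist, labels[i])
            neighbors.sort(key=lambda pair: pair[0])
            d = neighbors[-1][0]

    # The class with the largest share among the k neighbors is the prediction.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Hypothetical data set used only to exercise the function.
samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]
print(knn_classify((0.2, 0.1), samples, labels, 3))  # B
```

Re-sorting after every replacement mirrors the pseudo-code and is cheap because k is small. If two classes tie, Counter.most_common returns the one encountered first, so an odd k is a reasonable choice for two-class problems.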

Three, the algorithm summary

The K-nearest neighbor algorithm is one of the simplest and most effective algorithms for classifying data. It is an instance-based learning method, so using it requires a set of training samples that is close to, and representative of, the actual data. The algorithm must store all of the training data, which takes a lot of storage space when the data set is large. In addition, classifying a single sample requires computing its distance to every training sample, which can be time-consuming. An improved version of the K-nearest neighbor algorithm, the k-d tree, reduces the number of distance calculations.
