The K-Nearest neighbor algorithm for machine learning

Last Update:2018-02-03 Source: Internet

Author: User


While recently studying the book "Machine Learning in Action", I took some notes, which I share below:

One, an overview of the K-nearest neighbor algorithm (KNN)

The simplest, most rudimentary classifier just memorizes the class of every training object and classifies a test object only when its attributes exactly match those of a training object. This raises two problems. First, it is unrealistic to expect every test object to have an exactly matching training object. Second, a test object may match several training objects at once, so that it ends up assigned to more than one class. KNN was developed to address these problems.
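The two failure modes of the exact-match classifier can be seen in a few lines of Python. This is only an illustrative sketch: the function names and the tiny training set are hypothetical and do not come from the book.

```python
from collections import defaultdict

def build_lookup(samples):
    """Map each feature tuple to the set of labels recorded for it."""
    table = defaultdict(set)
    for features, label in samples:
        table[tuple(features)].add(label)
    return table

def exact_match_classify(table, features):
    """Return a label only when exactly one training label matches."""
    labels = table.get(tuple(features), set())
    if len(labels) == 1:
        return next(iter(labels))
    return None  # no exact match, or several conflicting labels

# Hypothetical training data: (3, 3) appears with two different labels.
training = [((1, 1), "A"), ((2, 2), "B"), ((3, 3), "A"), ((3, 3), "B")]
table = build_lookup(training)

print(exact_match_classify(table, (1, 1)))  # exact match with a single label
print(exact_match_classify(table, (1, 2)))  # no exact match
print(exact_match_classify(table, (3, 3)))  # matches two classes at once
```

Only the first query gets a label; the other two return None, which is the gap KNN fills by voting among nearby samples instead of requiring an exact match.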

KNN classifies by measuring the distance between the feature values of different samples. The idea is this: if most of the k samples most similar to a given sample (that is, its k nearest neighbors in feature space) belong to one category, then the sample belongs to that category too. k is usually an integer no greater than 20. In the KNN algorithm, the neighbors are all drawn from objects that have already been correctly classified. The classification decision therefore depends only on the categories of the nearest one or few samples.
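As a minimal sketch of the distance measurement, assuming the Euclidean distance that the pseudo-code in this article uses: for two samples with the same number of features, the distance is the square root of the sum of squared feature differences. The function name is an assumption made for illustration.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The classic 3-4-5 right triangle gives a distance of 5.0.
print(euclidean_distance((0, 0), (3, 4)))
```

Other distance measures can be substituted; the choice affects which samples count as "nearest" and therefore the final vote.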

Here is a simple example: should the green circle be assigned to the class of red triangles or to the class of blue squares? If k=3, red triangles make up 2/3 of the nearest neighbors, so the green circle is assigned to the red triangle class. If k=5, blue squares make up 3/5 of the nearest neighbors, so the green circle is assigned to the blue square class.
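The effect of k on the vote can be reproduced with a short sketch. The ordering of the five nearest neighbors below is hypothetical, chosen only to be consistent with the 2/3 and 3/5 proportions stated in the example; the function name is also an assumption.

```python
from collections import Counter

def majority_vote(labels_by_distance, k):
    """Return the winning label and its vote count among the k nearest labels."""
    votes = Counter(labels_by_distance[:k])
    label, count = votes.most_common(1)[0]
    return label, count

# Labels of the green circle's neighbors, nearest first (assumed ordering).
neighbors = ["red triangle", "blue square", "red triangle",
             "blue square", "blue square"]

print(majority_vote(neighbors, 3))  # ('red triangle', 2)
print(majority_vote(neighbors, 5))  # ('blue square', 3)
```

The same test point flips class when k changes from 3 to 5, which is why the choice of k matters.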

Two, the algorithm pseudo-code:

Algorithm for finding the k nearest neighbors: KNN(A[n], k)

Input: A[n] holds the coordinates of the n training samples in feature space; k is the number of nearest neighbors; x is the test sample

Output: the category that x belongs to

Take A[1]..A[k] as the initial nearest neighbors of x and compute the Euclidean distance d(x, A[i]) to the test sample x for i = 1, 2, ..., k; sort them in ascending order of d(x, A[i]) and record the distance to the farthest of them, D <- max{ d(x, A[j]) | j = 1, ..., k };

for (i = k+1; i <= n; i++)

    compute the distance d(x, A[i]) between A[i] and x;

    if (d(x, A[i]) < D)

        then replace the farthest current neighbor with A[i];

        re-sort the k current neighbors in ascending order of distance and update D <- max{ d(x, A[j]) | A[j] is a current neighbor };

Count the categories of the final k neighbors; the proportion of each category among them is its probability, and x is assigned the category with the largest probability.
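The pseudo-code above can be sketched in Python as follows. This is one possible rendering, not the book's implementation: instead of re-sorting the neighbor list after each replacement, it keeps the current k neighbors in a max-heap (via the standard heapq module with negated distances), which gives the same result. The function name and sample data are assumptions made for illustration.

```python
import heapq
import math
from collections import Counter

def knn_classify(x, samples, labels, k):
    """Classify x by majority vote among its k nearest training samples."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Max-heap of (-distance, index): heap[0] is the farthest current neighbor.
    heap = []
    for i, a in enumerate(samples):
        d = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, a)))
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))      # fill the initial k neighbors
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))   # replace the farthest neighbor
    votes = Counter(labels[i] for _, i in heap)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two points near (1, 1), two near (0, 0).
samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

print(knn_classify((0.0, 0.0), samples, labels, k=3))  # B
print(knn_classify((1.0, 1.2), samples, labels, k=3))  # A
```

When two categories tie, this sketch returns whichever tied category Counter encountered first; a real application would need an explicit tie-breaking rule.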

Three, the algorithm summary

The K-nearest neighbor algorithm is a simple and effective algorithm for classifying data. It is an instance-based learning method, so using it requires a set of training samples that is close to, and representative of, the actual data. The algorithm must store all of the training data, which takes a large amount of space when the data set is large. In addition, classifying a single sample requires computing its distance to every training sample, which can be time consuming. A commonly used improvement is to index the training data in a k-d tree, which reduces the number of distance computations.
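The storage and time costs mentioned above can be estimated with simple arithmetic. The sketch below assumes a brute-force search and 8 bytes per stored feature value; the function name and the example sizes are hypothetical.

```python
def brute_force_cost(n_train, n_test, n_features, bytes_per_value=8):
    """Rough cost of brute-force KNN: stored bytes and distance computations."""
    return {
        # Every training sample must be kept in memory.
        "stored_bytes": n_train * n_features * bytes_per_value,
        # Each test sample is compared against every training sample.
        "distance_computations": n_train * n_test,
    }

# 1,000 training samples with 3 features each, 10 samples to classify.
print(brute_force_cost(1000, 10, 3))
```

Both quantities grow linearly with the number of training samples, so doubling the training set doubles both the memory used and the work per classified sample.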


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
