Sharing a k-Nearest Neighbor Algorithm Example in Python

Source: Internet
Author: User

Simple Description

The k-nearest neighbor algorithm (kNN for short) classifies a sample by measuring the distance between its feature values and those of samples whose class is already known.

Given: a training set and the label of each training sample.

To classify a new sample, compare it with every sample in the training set and find the k training samples at the smallest distance, that is, the k most similar ones. The category that occurs most often among those k samples becomes the classification of the new sample.
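The three steps above (measure distances, keep the k nearest, take a majority vote) can be sketched in plain Python before looking at the numpy version. This is only a minimal illustration: the name knn_classify and the sample values are assumptions made for this sketch, and it relies on math.dist, which requires Python 3.8 or later.

```python
import math
from collections import Counter

# Made-up training data for illustration: (feature vector, label) pairs.
training = [
    ((1.0, 1.1), "A"),
    ((1.0, 1.0), "A"),
    ((0.0, 0.0), "B"),
    ((0.0, 0.1), "B"),
]

def knn_classify(sample, training, k):
    """Return the majority label among the k training points nearest to sample."""
    # Step 1: Euclidean distance from the new sample to every training point.
    distances = [(math.dist(sample, features), label) for features, label in training]
    # Step 2: keep the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over the labels of those points.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify((0.2, 0.1), training, k=3))  # prints B
```

With k=3 the three nearest points are two labeled B and one labeled A, so the vote goes to B.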

Python example

The code is as follows:
# -*- coding: cp936 -*-

# On Windows, cp936 encoding is used. On Linux, UTF-8 is the better choice.
from numpy import array, tile  # scientific computing package
import operator  # standard-library operator module; itemgetter is used for sorting

# Create a data set
def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels

# Algorithm core
# inX: input vector to be classified
# dataSet: training sample set
# labels: label vector
# k: number of nearest neighbors that take part in the vote
def classfy0(inX, dataSet, labels, k):
    # Distance calculation
    dataSetSize = dataSet.shape[0]  # number of rows, i.e. the number of training samples
    # tile is a numpy function; here it repeats inX once per training row
    # (four rows for the data set above). diffMat holds the difference
    # between the target and each training sample.
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2  # square each element
    sqDistances = sqDiffMat.sum(axis=1)  # sum each row to get the squared distance
    distances = sqDistances ** 0.5  # take the square root to get the distance
    sortedDistIndicies = distances.argsort()  # indices sorted by ascending distance
    # Select the nearest k points and count their labels.
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Sort the labels by vote count, highest first
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]
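To see what the tile and argsort calls in the distance calculation produce, the sketch below traces the intermediate arrays for a single query point. The data values, the query point, and the variable names are assumptions chosen for illustration; only numpy functions that the code above already relies on (array, tile, sum, argsort) plus sqrt and allclose are used.

```python
import numpy as np

# Illustrative values, assumed for this sketch.
data = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ["A", "A", "B", "B"]
query = np.array([0.0, 0.2])

# tile repeats the query once per training row so the shapes match.
tiled = np.tile(query, (data.shape[0], 1))      # shape (4, 2)
diff = tiled - data                             # per-feature differences
distances = np.sqrt((diff ** 2).sum(axis=1))    # one Euclidean distance per row

# argsort returns the row indices ordered from nearest to farthest.
order = distances.argsort()
nearest_labels = [labels[i] for i in order[:3]]

# NumPy broadcasting yields the same differences without tile.
same_as_broadcast = np.allclose(query - data, diff)

print(tiled.shape)         # (4, 2)
print(order.tolist())      # [3, 2, 1, 0]
print(nearest_labels)      # ['B', 'B', 'A']
print(same_as_broadcast)   # True
```

The last check shows that query - data gives the same result as the tile-based subtraction, which is why newer numpy code usually omits tile.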

An unexpected bonus

To add a self-written module to Python's default search path, create an xxx.pth file in the python/lib/site-packages directory and write the path of the directory that contains the module into it.
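The .pth mechanism can be tried out without touching the real site-packages directory. The sketch below is only an illustration under stated assumptions: the temporary directories and the module name mymodule_demo are made up, and site.addsitedir is used to apply to a stand-in directory the same .pth processing that Python performs on site-packages at startup.

```python
import importlib
import os
import site
import sys
import tempfile

# Hypothetical locations for illustration: a folder holding a self-written
# module, and a stand-in for the site-packages directory.
module_dir = tempfile.mkdtemp(prefix="my_modules_")
fake_site_dir = tempfile.mkdtemp(prefix="fake_site_packages_")

# A tiny self-written module (the name "mymodule_demo" is made up).
with open(os.path.join(module_dir, "mymodule_demo.py"), "w") as f:
    f.write("GREETING = 'hello from a .pth path'\n")

# A .pth file lists one directory path per line.
with open(os.path.join(fake_site_dir, "xxx.pth"), "w") as f:
    f.write(module_dir + "\n")

# Process the .pth files found in the stand-in directory; each listed
# directory that exists is appended to sys.path.
site.addsitedir(fake_site_dir)

mymodule_demo = importlib.import_module("mymodule_demo")
print(module_dir in sys.path)
print(mymodule_demo.GREETING)
```

In real use the xxx.pth file would sit in the actual site-packages directory, and no call to site.addsitedir would be needed because the interpreter reads the file on startup.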
