Implementing the KNN Algorithm in Python

Source: Internet
Author: User


The k-nearest neighbors (KNN) algorithm has many practical uses. It is a basic classification algorithm, used mainly in the classification stage of a pipeline. KNN is based on distance computation; in the simplest case, the distance between samples is computed directly in the original Euclidean space. Improved versions either first extract features into a more discriminative space and then compute distances there, or use metric learning to obtain a discriminative metric space and then compute the Mahalanobis distance between samples.
In any case, KNN can serve as the classification stage of many algorithms. Here we implement KNN in Python.
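To make the two distances mentioned above concrete, here is a minimal sketch using NumPy. The function names, the sample points, and the covariance matrices are assumptions chosen for illustration; in practice the covariance (or a learned metric) would come from the training data or from a metric learning step.

```python
import numpy as np

def euclidean_distance(a, b):
    """Plain Euclidean distance between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def mahalanobis_distance(a, b, cov):
    """Mahalanobis distance between a and b for a given covariance matrix."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    inv_cov = np.linalg.inv(np.asarray(cov, dtype=float))
    return float(np.sqrt(diff @ inv_cov @ diff))

# Hypothetical sample points and covariance matrices, for illustration only.
x = [1.0, 1.0, 1.0]
y = [1.0, 1.0, 0.5]
identity_cov = np.eye(3)                 # reduces to the Euclidean distance
scaled_cov = np.diag([1.0, 1.0, 0.25])   # third feature has a smaller variance

d_euclid = euclidean_distance(x, y)
d_identity = mahalanobis_distance(x, y, identity_cov)
d_scaled = mahalanobis_distance(x, y, scaled_cov)
print(d_euclid, d_identity, d_scaled)
```

With the identity covariance the two distances coincide; with a smaller variance on the third feature, the same difference in that feature counts for more.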

1. scikit-learn's built-in KNN

from sklearn.neighbors import NearestNeighbors

With this import you can call the nearest-neighbor algorithm.

# KNN in Python

# Part 1: return the nearest neighbors only, without classifying
from sklearn.neighbors import NearestNeighbors  # load the nearest-neighbor algorithm

samples = [[0, 0, 0], [0, 0.5, 0], [1, 1, 0.5]]
neigh = NearestNeighbors(n_neighbors=2)  # set the number of neighbors
neigh.fit(samples)
# Returns two arrays: the first holds the computed distances,
# the second holds the indices of the neighbors, starting from 0
print(neigh.kneighbors([[1, 1, 1]]))

# Part 2: classification
from sklearn.neighbors import KNeighborsClassifier

knnclf = KNeighborsClassifier(n_neighbors=1)  # k=1 here; the default is k=5
samples = [[0, 0, 0], [0, 0.5, 0], [1, 1, 0.5]]  # training sample features
labels = [0, 0, 1]  # training labels
knnclf.fit(samples, labels)
print(knnclf.predict([[1, 1, 1]]))  # returns the classification label, that is, [1]
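As a small follow-up, the sketch below shows one way to unpack the two arrays returned by `kneighbors` and map the nearest index back to a label. It reuses the sample data from the listing above; the variable names `distances`, `indices`, and `nearest_label` are chosen here for illustration.

```python
from sklearn.neighbors import NearestNeighbors

samples = [[0, 0, 0], [0, 0.5, 0], [1, 1, 0.5]]
labels = [0, 0, 1]

neigh = NearestNeighbors(n_neighbors=2).fit(samples)

# kneighbors expects a 2-D input (one row per query point) and returns
# one row of distances and one row of indices per query.
distances, indices = neigh.kneighbors([[1, 1, 1]])

# Neighbors are ordered from nearest to farthest, so column 0 is the closest.
nearest_label = labels[indices[0][0]]
print(distances[0], indices[0], nearest_label)
```

Looking up the label of the closest index by hand is the same decision that `KNeighborsClassifier` makes with `n_neighbors=1`.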


2. Source Code Implementation

I first wrote my own implementation and then read a reference implementation. Comparing the two showed that my Python still has room to improve!


My own KNN implementation:


# KNN implemented by hand
import numpy as np

def createDataSet():
    samples = np.array([[0, 0, 0, 0.5], [0, 0.5, 0, 1], [1, 1, 0.5, 0]])  # training sample features
    labels = [0, 0, 1]  # training labels
    return samples, labels

def knnClassifier(traSamples, labels, k, tstSample):
    samNum, feaDim = traSamples.shape  # each row is one sample
    minDist = np.inf  # start from infinity so that any real distance is smaller
    classifiedLabel = labels[0]
    for i in range(samNum):
        diff = traSamples[i] - np.asarray(tstSample)
        tmpDist = np.dot(diff, diff)  # squared Euclidean distance, a scalar
        print(tmpDist)
        if tmpDist < minDist:
            minDist = tmpDist
            classifiedLabel = labels[i]
    # k is accepted but unused: the loop only finds the single nearest sample
    return classifiedLabel



Reference KNN source code:

import operator
from numpy import tile

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]  # the number of samples
    # tile() works like MATLAB's repmat; this technique avoids a loop statement
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet  # replicate inX into dataSetSize rows
    sqDiffMat = diffMat ** 2  # square each element
    sqDistances = sqDiffMat.sum(axis=1)  # sum along each row
    distances = sqDistances ** 0.5  # the Euclidean distances
    sortedDistIndicies = distances.argsort()  # indices that order the values from small to large
    classCount = {}  # vote tally
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]  # label of the i-th nearest sample
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1  # get() returns 0 if the key does not exist
    # operator.itemgetter(1) sorts by value; equivalent to lambda asd: asd[1]
    # sorted() returns a new list; the original classCount is unchanged
    sortedClassCount = sorted(classCount.items(),  # key-value pairs
                              key=operator.itemgetter(1), reverse=True)  # descending order
    return sortedClassCount[0][0]  # the first entry, i.e. the label with the most votes

The comments explain the code in detail. In summary, note the use of tile(), **2, **0.5, sum(axis=1), the array method argsort(), the dictionary method get(), and sorted().
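The idioms listed above can be tried in isolation. The following is a minimal sketch on a tiny made-up data set (the query point, data, and labels are assumptions for illustration), with a broadcasting variant added at the end for comparison.

```python
import operator
import numpy as np

# Hypothetical query point, data set, and labels.
query = np.array([1.0, 1.0])
data = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])
labels = ["a", "b", "b"]

# tile(): repeat the query once per data row so the shapes match.
tiled = np.tile(query, (data.shape[0], 1))

# **2, sum(axis=1), **0.5: row-wise Euclidean distances.
distances = (((tiled - data) ** 2).sum(axis=1)) ** 0.5

# argsort(): indices of the samples from nearest to farthest.
order = distances.argsort()

# dict.get(): count votes without checking whether the key exists yet.
votes = {}
for idx in order[:3]:
    votes[labels[idx]] = votes.get(labels[idx], 0) + 1

# sorted() with itemgetter(1): rank the labels by vote count, largest first.
ranked = sorted(votes.items(), key=operator.itemgetter(1), reverse=True)
print(order, ranked)

# NumPy broadcasting gives the same distances without calling tile().
broadcast_distances = np.sqrt(((query - data) ** 2).sum(axis=1))
```

Broadcasting stretches the single query row across all data rows automatically, so the explicit `tile()` call is a matter of style rather than necessity.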


