# -*- coding: utf-8 -*-
"""
Created on Wed Mar 7 09:17:17 2018

@author: admin
"""
#######################################################
# KNN Cluster
# Author: niucas
# date: 2018-03-07
# homepage: http://www.cnblogs.com/pipifamily/
# Email: [email protected]
# Naming convention: camelCase
#######################################################

# Import the required packages
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import operator


############
# Preparing data
############
def creatDataSet():
    """
    Create a training data set together with the label of each sample.
    """
    testData = np.array([[1.0, 1.1], [1.0, 1.1], [0., 0], [0., 0.1]])
    testLabel = ['A', 'A', 'B', 'B']
    return testData, testLabel


def classify(inData, dataSet, labels, k):
    """
    Classify one sample with the k-nearest-neighbour rule.

    Parameters:
        inData:  input data (the sample to classify)
        dataSet: training data set
        labels:  labels corresponding to the training data
        k:       the selected value of k (number of neighbours that vote)
    """
    dataSetSize = dataSet.shape[0]
    # Step 1: compute the Euclidean distance.
    # Difference between the test sample and every training sample
    diff = np.tile(inData, (dataSetSize, 1)) - dataSet
    # Square root of the sum of squared differences
    distance = ((diff ** 2).sum(axis=1)) ** 0.5
    #############
    # Start the vote
    #############
    # Sort the distances in ascending order; the result is an array of indices
    sortedDistIndex = np.argsort(distance)
    # Dictionary that holds the vote counts, initially empty
    classCount = {}
    # Each of the k nearest samples votes for its own label
    for i in range(k):
        voteLabel = labels[sortedDistIndex[i]]
        classCount[voteLabel] = classCount.get(voteLabel, 0) + 1
    # Sort the vote counts in descending order (the default is ascending)
    # and return the label with the most votes
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    # Alternative: className = max(classCount.items(), key=lambda x: x[1])[0]
    return sortedClassCount[0][0]


if __name__ == "__main__":
    # Load the data
    dataSet, labels = creatDataSet()
    inData = [0.1, 0.1]
    # Simple classification
    className = classify(inData, dataSet, labels, 3)
    print('The class of test sample is %s' % className)
Code implementation of KNN algorithm
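
The distance and voting steps in the listing can also be written more compactly. The following is only a sketch of one possible variant, not the author's implementation: it assumes NumPy broadcasting in place of `np.tile` and `collections.Counter` in place of the hand-built dictionary. The names `classify_broadcast`, `train`, and `train_labels` are hypothetical and do not appear in the original code.

```python
from collections import Counter

import numpy as np


def classify_broadcast(in_data, data_set, labels, k):
    """Hypothetical variant of classify(): same k-nearest-neighbour vote."""
    data_set = np.asarray(data_set, dtype=float)
    in_data = np.asarray(in_data, dtype=float)
    # Broadcasting subtracts in_data from every row, so np.tile is not needed.
    distances = np.sqrt(((data_set - in_data) ** 2).sum(axis=1))
    # Indices of the k training samples closest to in_data.
    nearest = np.argsort(distances)[:k]
    # Count one vote per neighbour and return the most common label.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Same toy data as in the listing (assumed here for illustration).
train = np.array([[1.0, 1.1], [1.0, 1.1], [0.0, 0.0], [0.0, 0.1]])
train_labels = ['A', 'A', 'B', 'B']
result = classify_broadcast([0.1, 0.1], train, train_labels, 3)
print('The class of test sample is %s' % result)
```

With k = 3, the two nearest samples carry label `B` and the third carries `A`, so the vote goes to `B`. An odd k is a common choice for two-class data because it avoids tied votes.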