'''
Created on Nov
kNN: k Nearest Neighbors

Input:      inX: vector to compare to existing dataset (1xN)
            dataSet: size m data set of known vectors (NxM)
            labels: data set labels (1xM vector)
            k: number of neighbors to use for comparison (should be an odd number)

Output:     the most popular class label

@author: Liu Chuanfeng
'''
import operator
import numpy as np
import matplotlib.pyplot as plt

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every row of dataSet
    diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    sortedDistIndicies = distances.argsort()
    # Majority vote among the k nearest neighbors
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]

def file2matrix(filename):
    # Each line holds three tab-separated features followed by an integer class label
    with open(filename) as fr:
        arrayLines = fr.readlines()
    numberOfLines = len(arrayLines)
    returnMat = np.zeros((numberOfLines, 3))
    classLabelVector = []
    index = 0
    for line in arrayLines:
        line = line.strip()
        listFromLine = line.split('\t')
        returnMat[index, :] = listFromLine[0:3]
        classLabelVector.append(int(listFromLine[-1]))
        index += 1
    return returnMat, classLabelVector

def autoNorm(dataSet):
    # Min-max normalization: scale every feature column to the range [0, 1]
    maxVals = dataSet.max(0)
    minVals = dataSet.min(0)
    ranges = maxVals - minVals
    m = dataSet.shape[0]
    normDataSet = (dataSet - np.tile(minVals, (m, 1))) / np.tile(ranges, (m, 1))
    return normDataSet, ranges, minVals

def datingClassTest():
    hoRatio = 0.10    # hold out the first 10% of the rows as test vectors
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    m = normMat.shape[0]
    numTestVecs = int(m * hoRatio)
    errorCount = 0.0
    for i in range(numTestVecs):
        classifyResult = classify0(normMat[i, :], normMat[numTestVecs:m, :], datingLabels[numTestVecs:m], 3)
        print('The classifier came back with: %d, the real answer is: %d' % (classifyResult, datingLabels[i]))
        if (classifyResult != datingLabels[i]):
            errorCount += 1.0
        # Running error rate, printed after every test vector
        print('The total error rate is: %.1f%%' % (errorCount / float(numTestVecs) * 100))

def classifyPerson():
    resultList = ['not at all', 'in small doses', 'in large doses']
    percentTats = float(input("percentage of time spent playing video games?"))
    ffMiles = float(input("frequent flier miles earned per year?"))
    iceCream = float(input("liters of ice cream consumed per year?"))
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    inArr = np.array([ffMiles, percentTats, iceCream])
    # Normalize the query with the same minVals and ranges as the training data
    classifyResult = classify0((inArr - minVals) / ranges, normMat, datingLabels, 3)
    print("You'll probably like this person:", resultList[classifyResult - 1])

#Unit test of func: file2matrix()
#datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
#print(datingDataMat)
#print(datingLabels)

#Usage of figure construction of matplotlib
#fig = plt.figure()
#ax = fig.add_subplot(111)
#ax.scatter(datingDataMat[:,1], datingDataMat[:,2], 15.0*np.array(datingLabels), 15.0*np.array(datingLabels))
#plt.show()

#Unit test of func: autoNorm()
#normMat, ranges, minVals = autoNorm(datingDataMat)
#print(normMat)
#print(ranges)
#print(minVals)

datingClassTest()
classifyPerson()
Output:
The classifier came back with: 3, the real answer is: 3
The total error rate is: 0.0%
The classifier came back with: 2, the real answer is: 2
The total error rate is: 0.0%
The classifier came back with: 1, the real answer is: 1
The total error rate is: 0.0%
...
The classifier came back with: 2, the real answer is: 2
The total error rate is: 4.0%
The classifier came back with: 1, the real answer is: 1
The total error rate is: 4.0%
The classifier came back with: 3, the real answer is: 1
The total error rate is: 5.0%
percentage of time spent playing video games?10
frequent flier miles earned per year?10000
liters of ice cream consumed per year?0.5
You'll probably like this person: in small doses
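The last step of the run above scales the user's input with the training set's minimum values and ranges before classifying it. The following is a minimal sketch of that idea on a tiny made-up data set, not the author's code: the names min_max_normalize, knn_classify, toy_data and toy_labels are hypothetical, and NumPy broadcasting is used in place of np.tile.

```python
from collections import Counter

import numpy as np


def min_max_normalize(data):
    """Scale each column of `data` to [0, 1]; returns (normalized, ranges, min_vals).

    Assumes no column is constant (a zero range would divide by zero).
    """
    min_vals = data.min(axis=0)
    ranges = data.max(axis=0) - min_vals
    # Broadcasting applies the per-column values to every row, so np.tile is not needed.
    return (data - min_vals) / ranges, ranges, min_vals


def knn_classify(query, data, labels, k):
    """Majority vote among the k rows of `data` closest to `query` (Euclidean distance)."""
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = distances.argsort()[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features on very different scales, class labels 1 and 2.
toy_data = np.array([
    [1000.0, 1.0],
    [1200.0, 1.2],
    [9000.0, 8.0],
    [9500.0, 9.0],
])
toy_labels = [1, 1, 2, 2]

norm_data, ranges, min_vals = min_max_normalize(toy_data)
query = np.array([1100.0, 1.1])
# The query is scaled with the training set's min_vals and ranges, not its own.
predicted = knn_classify((query - min_vals) / ranges, norm_data, toy_labels, k=3)
print(predicted)
```

Without normalization, the first feature (values in the thousands) would dominate the distance; after scaling, both features contribute on the same [0, 1] range.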
Reference:
"Machine Learning in Action"
K Nearest Neighbor Algorithm Python implementation -- "Machine Learning in Action"