Below is a KNN (k-nearest neighbors) algorithm I wrote for recognizing handwritten digits. For reference, see http://blog.csdn.net/april_newnew/article/details/44176059.
#-*-coding:utf-8-*-
import os

import numpy as np
import pandas as pd


def readtxt(filename):
    # Read one sample file; each line is parsed as a single float value.
    text = []
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f.readlines():
            text.append(line)
    txt = np.array(text, dtype='float')
    return txt.tolist()


def readdata(rootfile):
    # Walk the directory tree; the label is the file-name prefix before '_'.
    data = []
    label = []
    for root, dirs, files in os.walk(rootfile):
        for name in files:
            filename = os.path.join(root, name)
            data.append(readtxt(filename))
            label.append(name.split('_')[0])
    data = pd.DataFrame(data)
    return data, label


def knn(traindata, trainlabel, testdatai, k):
    # Repeat the test sample once per training row so the frames can be subtracted.
    length = len(traindata)
    newtest = pd.DataFrame(np.tile(testdatai, (length, 1)))
    # Euclidean distance ("cha") from the test sample to every training row.
    diff = (newtest - traindata) ** 2
    cha = diff.sum(axis=1) ** 0.5
    result = pd.DataFrame({'label': trainlabel, 'cha': cha})
    # Keep the k nearest rows and return the most frequent label among them.
    labels = result.sort_values(by='cha')[:k]
    frequent = labels.groupby(labels['label']).size()
    # idxmax() returns the label itself; argmax() returns a position in recent pandas.
    labely = frequent.idxmax()
    return labely


def test(trainfile, testfile, k):
    result = []
    traindata, trainlabel = readdata(trainfile)
    testdata, testlabel = readdata(testfile)
    for i in range(len(testdata)):
        labely = knn(traindata, trainlabel, testdata.loc[i, :], k)
        result.append(labely)
    # "tongji" (statistics): predicted labels next to the true labels.
    tongji = pd.DataFrame({'result': result, 'testlabel': testlabel})
    accuary = len(tongji[tongji['result'] == tongji['testlabel']]) / len(result)
    return result, accuary


trainfile = r'E:\trainingDigits'
testfile = r'E:\testDigits'
k = 3
result, accuary = test(trainfile, testfile, k)
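For comparison, here is a minimal sketch of the same idea (distance to every training sample, then a majority vote among the k nearest) written with plain NumPy. The function name `knn_predict` and the toy data are my own illustrations, not part of the original code or data set.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k):
    """Return the majority label among the k training rows nearest to query."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training row (broadcasting).
    dists = np.sqrt(((train_x - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances; a stable sort keeps ties in input order.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two clusters labelled '0' and '1'.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ['0', '0', '0', '1', '1', '1']
print(knn_predict(train_x, train_y, [0.2, 0.1], k=3))  # prints 0
print(knn_predict(train_x, train_y, [5.5, 5.2], k=3))  # prints 1
```

Broadcasting `train_x - query` avoids building a repeated copy of the test sample, which may also be faster on larger data sets.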
Note: The training set contains 2,210 records and the test set contains 670. The accuracy is low, only 0.45. I don't know why yet; after learning more, I will try to optimize the code.
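One thing that may be worth checking, as a guess rather than a confirmed diagnosis: `readtxt` turns each whole line of a file into a single float. If the files store each digit as rows of `0`/`1` characters (an assumption; the file layout is not described here), a row such as `0110` becomes the number 110.0 instead of four separate pixels, so the distance would be dominated by the leftmost pixels of each row. The hypothetical helper `read_bitmap` below sketches the per-character alternative on two made-up rows.

```python
import numpy as np


def read_bitmap(lines):
    """Flatten rows of '0'/'1' characters into one feature per character.

    Hypothetical helper: assumes every non-blank line contains only digits.
    """
    pixels = []
    for line in lines:
        row = line.strip()
        if row:
            pixels.extend(int(ch) for ch in row)
    return np.array(pixels, dtype=float)


# Two made-up rows standing in for the lines of one sample file.
rows = ["0110\n", "1001\n"]

# One float per line, mimicking what readtxt produces.
per_line = np.array([float(r) for r in rows])
# One float per character, so every pixel becomes its own feature.
per_char = read_bitmap(rows)

print(per_line)   # two features: 110.0 and 1001.0
print(per_char)   # eight features, one per pixel
```

If the files turn out to use a different layout, this sketch does not apply and the cause of the low accuracy lies elsewhere.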
Implementing KNN in Python to recognize handwritten digits