A Python handwriting recognition system based on the KNN algorithm

Source: Internet
Author: User


This article uses a K-nearest neighbor (KNN) classifier to implement a handwritten digit recognition system. The training data set contains about 2000 samples, roughly 200 for each digit. Each sample is saved in a TXT file, and each handwriting image is a 32x32 binary image.

Test code for the handwritten digit recognition system:

    from numpy import *
    import operator
    from os import listdir

    # inX      the sample to classify
    # dataSet  the training data set
    # labels   the labels of the training samples
    # k        the number of nearest neighbors that vote
    def classify0(inX, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]                  # number of rows
        # tile(inX, (dataSetSize, 1)) repeats inX once per row so it can be subtracted from dataSet
        diffMat = tile(inX, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat ** 2                        # square the differences
        sqDistances = sqDiffMat.sum(axis=1)             # axis=0 sums down each column, axis=1 sums along each row
        distances = sqDistances ** 0.5                  # take the square root
        sortedDistIndicies = distances.argsort()        # argsort() returns the indices in ascending order of distance
        classCount = {}
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]  # look up the label by index
            # build a dictionary that records how often each label occurs
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        # sort the entries by count (largest first)
        sortedClassCount = sorted(classCount.items(), key=lambda item: item[1], reverse=True)
        return sortedClassCount[0][0]

    # Handwriting recognition
    # First, each image must be formatted as a vector:
    # img2vector() converts a 32x32 binary image matrix into a 1x1024 vector.
    def img2vector(filename):
        returnVect = zeros((1, 1024))
        with open(filename) as fr:
            for i in range(32):                 # the image matrix is 32x32
                lineStr = fr.readline()         # the data set is large, so read one line at a time
                for j in range(32):
                    returnVect[0, 32 * i + j] = int(lineStr[j])
        return returnVect

    # Handwriting recognition test
    def handwritingClassTest():
        hwLabels = []
        trainingFileList = listdir(r'trainingdigits')       # training data folder
        m = len(trainingFileList)                           # number of files in the folder
        trainingMat = zeros((m, 1024))                      # m x 1024 matrix of training vectors
        for i in range(m):
            fileNameStr = trainingFileList[i]               # file name
            fileStr = fileNameStr.split('.')[0]             # split the file name at the dot
            classNumStr = int(fileStr.split('_')[0])        # split at the underscore to get the digit
            hwLabels.append(classNumStr)                    # store the actual value
            trainingMat[i, :] = img2vector(r'trainingdigits/%s' % fileNameStr)
        testFileList = listdir('testdigits')                # test data
        errorCount = 0.0
        mTest = len(testFileList)
        for i in range(mTest):                              # same as above, for the test data
            fileNameStr = testFileList[i]
            fileStr = fileNameStr.split('.')[0]             # take off .txt
            classNumStr = int(fileStr.split('_')[0])
            vectorUnderTest = img2vector(r'testdigits/%s' % fileNameStr)
            classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
            print("Calculated value: %d, actual value: %d" % (classifierResult, classNumStr))
            if classifierResult != classNumStr:
                errorCount += 1.0
        print("\nError occurrences: %d" % errorCount)
        print("\nError rate: %f" % (errorCount / float(mTest)))

    handwritingClassTest()

Result:

    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9

    Error occurrences: 10

    Error rate: 0.010571

As you can see, the KNN algorithm consumes a great deal of memory (I have 12 GB); I can't imagine what recognizing Chinese characters would require.
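To make the distance calculation and majority vote in classify0 easier to follow, here is a minimal sketch on a toy data set. The function name knn_classify, the four sample points, and their labels are hypothetical and are not part of the original code. The sketch relies on NumPy broadcasting instead of tile(), which avoids building the repeated copy of the input vector, and uses collections.Counter for the vote; it illustrates the same idea rather than reproducing the article's implementation.

```python
import numpy as np
from collections import Counter


def knn_classify(in_x, data_set, labels, k):
    """Return the majority label among the k training rows nearest to in_x."""
    # Broadcasting subtracts in_x from every row, so no explicit tile() is needed.
    diff = data_set - in_x
    # Euclidean distance from in_x to each training row.
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = distances.argsort()[:k]
    # Count the labels of the k nearest neighbors and return the most common one.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: four 2-D points in two classes.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
group_labels = ["A", "A", "B", "B"]

print(knn_classify(np.array([0.1, 0.2]), group, group_labels, 3))  # B
print(knn_classify(np.array([0.9, 1.0]), group, group_labels, 3))  # A
```

With k = 3, the point (0.1, 0.2) has two "B" neighbors and one "A" neighbor among its three closest points, so the vote returns "B"; the point (0.9, 1.0) is the mirror case and returns "A". The handwriting system works the same way, except that each row is a 1024-element vector and the labels are the digits 0 to 9.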
