Python handwriting recognition system based on KNN algorithm

Last Update: 2017-06-17 Source: Internet

Author: User

This article uses a k-nearest neighbor (KNN) classifier to build a handwriting recognition system. The training data set contains about 2,000 samples, roughly 200 for each digit. Each sample is saved in a TXT file, and each handwriting image is a 32x32 binary image.

Test code for the handwritten digit recognition system:

    from numpy import tile, zeros
    from os import listdir

    # inX:     the vector to be classified
    # dataSet: the training data, one sample per row
    # labels:  the known label of each training row
    # k:       the number of nearest neighbors that vote
    def classify0(inX, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]  # number of rows
        # tile(inX, (dataSetSize, 1)) repeats inX once per training row
        # so that it can be subtracted from dataSet element by element
        diffMat = tile(inX, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat ** 2              # square the differences
        sqDistances = sqDiffMat.sum(axis=1)   # axis=1 sums along each row
        distances = sqDistances ** 0.5        # square root
        sortedDistIndicies = distances.argsort()  # indices in ascending order of distance
        classCount = {}
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]  # look up the label by index
            # build a dictionary that records how often each label occurs
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        # sort the (label, count) pairs by count, largest first
        sortedClassCount = sorted(classCount.items(), key=lambda item: item[1], reverse=True)
        return sortedClassCount[0][0]

    # Handwriting recognition
    # First, the image has to be formatted as a vector:
    # img2vector() converts a 32x32 binary image matrix into a 1x1024 vector
    def img2vector(filename):
        returnVect = zeros((1, 1024))
        with open(filename) as fr:
            for i in range(32):          # the image matrix is 32x32
                lineStr = fr.readline()  # read one line at a time
                for j in range(32):
                    returnVect[0, 32 * i + j] = int(lineStr[j])
        return returnVect

    # Handwriting recognition test
    def handwritingClassTest():
        hwLabels = []
        trainingFileList = listdir(r'trainingdigits')  # training folder
        m = len(trainingFileList)                      # number of files in the folder
        trainingMat = zeros((m, 1024))                 # one 1024-element row per file
        for i in range(m):
            fileNameStr = trainingFileList[i]          # file name
            fileStr = fileNameStr.split('.')[0]        # split the name at the dot
            classNumStr = int(fileStr.split('_')[0])   # split at the underscore
            hwLabels.append(classNumStr)               # save the actual value
            trainingMat[i, :] = img2vector(r'trainingdigits/%s' % fileNameStr)
        testFileList = listdir('testdigits')           # test data
        errorCount = 0.0
        mTest = len(testFileList)
        for i in range(mTest):                         # same processing for the test data
            fileNameStr = testFileList[i]
            fileStr = fileNameStr.split('.')[0]        # take off .txt
            classNumStr = int(fileStr.split('_')[0])
            vectorUnderTest = img2vector(r'testdigits/%s' % fileNameStr)
            classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
            print("Calculated value: %d, actual value: %d" % (classifierResult, classNumStr))
            if classifierResult != classNumStr:
                errorCount += 1.0
        print("\nerror occurrences: %d" % errorCount)
        print("\nError Rate: %f" % (errorCount / float(mTest)))

    handwritingClassTest()

Result:

    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9
    Calculated value: 9, actual value: 9

    error occurrences: 10

    Error Rate: 0.010571

As you can see, the KNN algorithm consumes a lot of memory (I have 12 GB). I can't imagine what recognizing Chinese characters would require.
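The distance-and-vote step inside classify0 can be tried without the digit files. The following is a minimal sketch, not the article's code: the function name knn_classify and the toy data are made up for illustration, NumPy broadcasting stands in for the tile() call, and collections.Counter stands in for the hand-built dictionary.

```python
from collections import Counter

import numpy as np


def knn_classify(x, data, labels, k):
    """Return the majority label among the k rows of data closest to x."""
    # Broadcasting subtracts x from every row, so no tile() call is needed.
    distances = np.sqrt(((data - x) ** 2).sum(axis=1))
    nearest = distances.argsort()[:k]  # indices of the k smallest distances
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two clusters of 2-D points, labelled 0 and 1.
toy_data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                     [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
toy_labels = [0, 0, 0, 1, 1, 1]

near_origin = knn_classify(np.array([0.15, 0.1]), toy_data, toy_labels, 3)
near_one = knn_classify(np.array([0.95, 1.0]), toy_data, toy_labels, 3)
print(near_origin, near_one)  # prints: 0 1
```

With the digit data, x would be one 1x1024 row produced by img2vector and data would be the m x 1024 training matrix; the logic is otherwise the same.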

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
