Dataset Analysis
The dataset is letter-recognition.data, which contains 20,000 comma-separated records. In each record, the first column is the letter label and the remaining 16 columns are features. A sample record is shown below:
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
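To make the record format concrete, here is a minimal sketch (written for Python 3) that parses one record into an integer class label and a list of feature values. The helper name `parse_row` is hypothetical and is not part of the dataset or of any library. It assumes the label is an uppercase letter, as in the dataset file.

```python
def parse_row(line):
    """Split one comma-separated record into (label, features).

    The label is the letter's offset from 'A' (A -> 0, ..., Z -> 25).
    parse_row is a hypothetical helper used only for illustration.
    """
    fields = line.strip().split(',')
    label = ord(fields[0]) - ord('A')
    features = [int(value) for value in fields[1:]]
    return label, features


label, features = parse_row('T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8')
print(label)          # 19, because 'T' is 19 positions after 'A'
print(len(features))  # 16
```

Mapping letters to numbers in this way lets the whole table be stored as a single numeric array.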
Learning methods
1. Read in the data, splitting each record on the comma separator.
2. Use the first column as the label and the remaining columns as the feature data.
3. Initialize the classifier and train it with the training data.
4. Use the test data to measure the accuracy.
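The four steps can be sketched without OpenCV to show what a k-nearest-neighbour classifier does internally. The example below is an illustration under stated assumptions, not the implementation OpenCV uses: it is written for Python 3, it replaces the real file with a small synthetic table (label in column 0, features in the other columns), and `knn_predict` is a hypothetical helper that does a brute-force distance search followed by a majority vote.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=5):
    """Brute-force k-nearest-neighbour classification with majority vote.

    knn_predict is a hypothetical helper written for this illustration.
    """
    # Squared Euclidean distance between every test row and every training row.
    diff = test_x[:, np.newaxis, :] - train_x[np.newaxis, :, :]
    dist = np.sum(diff ** 2, axis=2)
    # Indices of the k closest training rows for each test row.
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = train_y[nearest]
    # The most frequent label among the k neighbours wins.
    return np.array([np.bincount(row).argmax() for row in votes])


# Synthetic stand-in for the real file (an assumption of this sketch):
# three classes, interleaved so that both halves contain every class.
rng = np.random.RandomState(0)
labels = np.tile(np.arange(3), 40)
features = rng.normal(loc=labels[:, None] * 10.0, scale=1.0, size=(120, 4))
data = np.hstack([labels[:, None], features])

# Steps 1-2: split the rows into two halves, then split off the label column.
train, test = np.vsplit(data, 2)
train_y, train_x = np.hsplit(train, [1])
test_y, test_x = np.hsplit(test, [1])
train_y = train_y.ravel().astype(int)
test_y = test_y.ravel().astype(int)

# Steps 3-4: kNN "training" only stores the data; then predict and score.
predicted = knn_predict(train_x, train_y, test_x, k=5)
accuracy = 100.0 * np.mean(predicted == test_y)
print('accuracy is', accuracy, '%')
```

Computing the accuracy from the number of test rows, rather than from a hard-coded count, keeps the calculation valid if the split changes.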
Code
# Written for Python 2 and OpenCV 2.4 (cv2.KNearest / find_nearest).
import cv2
import numpy as np

print 'load data'
# Column 0 holds the letter; map it to 0..25 so the whole array can be float32.
data = np.loadtxt('letter-recognition.data', dtype='float32', delimiter=',',
                  converters={0: lambda ch: ord(ch) - ord('A')})

print 'split as train, test'
# First 10,000 rows for training, last 10,000 rows for testing.
train, test = np.vsplit(data, 2)
print 'train.shape:\t', train.shape
print 'test.shape:\t', test.shape

print 'split train as the response, trainData'
# Column 0 is the label (response); the remaining columns are the features.
response, trainData = np.hsplit(train, [1])
print 'response.shape:\t', response.shape
print 'trainData.shape:\t', trainData.shape

print 'split the test as response, trainData'
resTest, testData = np.hsplit(test, [1])

print 'Init the knn'
knn = cv2.KNearest()
knn.train(trainData, response)

print 'Test the knn'
# Classify each test row by a vote among its 5 nearest training rows.
ret, result, neighbours, dist = knn.find_nearest(testData, 5)

print 'the rate:'
correct = np.count_nonzero(result == resTest)
accuracy = correct * 100.0 / result.size
print 'accuracy is', accuracy, '%'
Results
load data
split as train, test
train.shape: (10000, 17)
test.shape: (10000, 17)
split train as the response, trainData
response.shape: (10000, 1)
trainData.shape: (10000, 16)
split the test as response, trainData
Init the knn
Test the knn
the rate:
accuracy is 93.22 %
Data set
http://download.csdn.net/detail/licong_carp/8612383
"Python-OpenCV" KNN English Letter Recognition