Use LIBSVM to experiment with the MNIST dataset---easy to understand!


Original: http://blog.csdn.net/arthur503/article/details/19974057

For the experimental part of the SVM course unit, the teacher introduced us to LIBSVM. After reading up on it, it turned out to be easy enough to write up.

1. LIBSVM Introduction

Although the theory behind SVM demands a lot of mathematics, LIBSVM wraps it all into a toolkit you can simply use. At the time I asked the teacher several times: is doing SVM at a company really this simple? Just type a few command lines? That does seem to be the case. Of course, in the big-data context there is ongoing research and application work on variants such as parallel SVM and multiple-kernel SVM.

The data the teacher gave for the experiment was very simple: 1000 data points to classify with SVM, so there is not much to say about it. I therefore wanted to try handwritten digit classification, something like the license plate recognition applied to photos of traffic violations. Searching the web turned up the MNIST dataset, a basic starter dataset for this task.

For an introduction to LIBSVM and how to use it, see the LIBSVM introduction. Note that svm-toy supports up to three classes, not just two.

Use the svm-train.exe and svm-predict.exe commands under the windows folder to build models and make predictions; see the documentation for the specific parameters.

The main optional parameters of svm-train are (a Python usage sketch follows this list):

-s Selects the SVM type. The usual choice is C-SVM (C-SVC).

-c Sets the penalty factor C for the slack variables. The larger C is, the heavier the penalty on slack variables: the margin between the two support hyperplanes shrinks, the model fits the training data more tightly and tolerates noisy points less, so it is more prone to overfitting. Conversely, the smaller C is, the wider the margin and the greater the tolerance for noise, but the more training points end up violating the margin, so the model is more likely to underfit.

-t Selects the kernel function. Linear and RBF are the two usually compared: linear is faster but requires the data to be linearly separable; RBF is more general and is the default.

-g Sets the gamma coefficient of the RBF kernel, exp(-gamma*||u-v||^2), where gamma = 1/(2*sigma^2) and sigma is the width of the Gaussian. g is inversely proportional to sigma: the larger g is, the smaller sigma, the narrower the Gaussian, and the smaller the region each support vector covers, so more support vectors are needed, the model grows more complex, and overfitting becomes more likely.

-wi Sets per-class weights on the penalty term, because in some classification problems certain classes are more important than others.

-v Enables n-fold cross-validation, used for parameter selection.

svm-predict has only one optional parameter (-b, for probability estimates) and it is generally not used.
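
To make the flags concrete, here is a minimal sketch using LIBSVM's official Python bindings, which accept the same option strings as the command-line tools (assuming the libsvm package from pip; the data file name is a placeholder):

    # The Python bindings take the same option strings as svm-train.
    from libsvm.svmutil import svm_read_problem, svm_train

    y, x = svm_read_problem('data.txt')  # placeholder file in LIBSVM text format

    # -s 0: C-SVC, -t 2: RBF kernel, -c: slack penalty, -g: RBF gamma,
    # -w1 5: penalize errors on class 1 five times more heavily.
    model = svm_train(y, x, '-s 0 -t 2 -c 1 -g 0.05 -w1 5')

    # With -v, svm_train runs 5-fold cross-validation and returns the accuracy.
    cv_acc = svm_train(y, x, '-s 0 -t 2 -c 1 -g 0.05 -v 5')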

2. Data processing

Download the dataset from the MNIST official website, decompress it, read the byte data according to the documented format, and extract the grayscale image data for the train and test sets. The images are 28*28 pixels; there are 60000 training images and 10000 test images.

A first test on 1000 samples with SVM gave very poor results: only about 11% accuracy! After inspecting and experimenting, the cause turned out to be that the raw data was not scaled, so overly large feature values distort the results. The experimental record: running SVM on the 10-class MNIST task without scaling the image grayscale data, i.e., modeling directly on raw pixel values, gives only about 11% accuracy, roughly one tenth. Checking the predict output confirms that nearly every sample is predicted as class 1, which accounts for the roughly one-in-ten hit rate. So my guess is that when feature values are too large, skipping scaling seriously hurts SVM performance. After reading the LIBSVM documentation, I scaled the image grayscale data to [0,1] and then got 80%+ accuracy on a small-dataset test.
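
A minimal sketch of this preprocessing step, assuming Python 3 (the IDX header layout, a big-endian magic number, count, rows, and columns followed by raw pixel bytes, is documented on the MNIST site; the output file name matches the one used below):

    # Read the MNIST IDX byte format, scale pixels to [0, 1],
    # and write LIBSVM's sparse "label index:value" text format.
    import struct

    def read_mnist(image_path, label_path):
        with open(image_path, 'rb') as f:
            _, n, rows, cols = struct.unpack('>IIII', f.read(16))
            images = [list(f.read(rows * cols)) for _ in range(n)]
        with open(label_path, 'rb') as f:
            _, n = struct.unpack('>II', f.read(8))
            labels = list(f.read(n))
        return images, labels

    def write_libsvm(path, images, labels):
        with open(path, 'w') as f:
            for pixels, label in zip(images, labels):
                # Scale grayscale 0..255 to [0, 1]; zeros are omitted
                # because the format is sparse.
                feats = ' '.join('%d:%g' % (i + 1, v / 255.0)
                                 for i, v in enumerate(pixels) if v)
                f.write('%d %s\n' % (label, feats))

    images, labels = read_mnist('train-images-idx3-ubyte', 'train-labels-idx1-ubyte')
    write_libsvm('train_60k_scale.txt', images, labels)
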
With c=2 and the other parameters at their defaults, training a model on the Train_60k_scale.txt dataset and validating it on the Test_10k_scale.txt test dataset gives 95.02% accuracy.
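
For reference, a sketch of reproducing this run through the Python bindings, which take the same option string as svm-train (file names as in the text, lowercased):

    from libsvm.svmutil import svm_read_problem, svm_train, svm_predict

    y_train, x_train = svm_read_problem('train_60k_scale.txt')
    y_test, x_test = svm_read_problem('test_10k_scale.txt')

    # c=2, everything else left at its default (RBF kernel, etc.).
    model = svm_train(y_train, x_train, '-c 2')

    # p_acc[0] is the accuracy in percent; about 95% per the text.
    p_labels, p_acc, p_vals = svm_predict(y_test, x_test, model)
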
Next, use ./tools/grid.py (its parameters need editing; see the LIBSVM usage introduction) on 1k of the data, following the method in the documentation and searching both c and g over the exponent range (-10, 10, 1) for the optimal parameters. (grid.py actually searches using cross-validation.) The optimal parameters come out as: c=4.0, g=0.015625, rate=91.1 (the cross-validation accuracy). Training the SVM model on the train_60k_scale.txt dataset with these parameters and validating against the Test_10k_scale.txt test dataset yields 98.46% accuracy! The final trained SVM model parameters are as follows:

svm_type c_svc
kernel_type rbf
gamma 0.015625
nr_class 10
total_sv 12110
rho -0.409632 -0.529655 -0.842478 -0.567781 -0.125654 -0.34742 -0.696415 -0.191642 -1.4011 -0.0458988 -0.303381 0.0614391 0.420461 0.266255 -0.0264913 0.0878689 0.0784119 0.167691 0.0910791 0.577181 0.395401 0.0896789 0.381334 0.134266 -0.0137303 0.902749 0.779241 0.120543 0.203025 -0.523485 0.3886 0.468605 -0.14921 1.10158 -0.320523 -0.120132 -0.656063 -0.44432 -0.925911 -0.421136 -0.176363 -1.16086 0.0610109 0.0764374 -0.192982
label 5 0 4 1 9 2 3 6 7 8
nr_sv 1466 843 1229 516 1531 1419 1373 948 1101 1684

As you can see, out of these 60,000 training samples, the final model uses 12,110 support vectors.
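
For illustration, a sketch of the kind of search grid.py performs, written directly against the Python bindings with 5-fold cross-validation (the 1k subset file name is hypothetical; grid.py itself adds parallelization and plotting):

    # Coarse grid search over c and g; exponent range follows the text (-10..10, step 1).
    from libsvm.svmutil import svm_read_problem, svm_train

    y, x = svm_read_problem('train_1k_scale.txt')  # hypothetical 1k subset

    best = (0.0, None, None)
    for log2c in range(-10, 11):
        for log2g in range(-10, 11):
            c, g = 2.0 ** log2c, 2.0 ** log2g
            # With -v, svm_train returns the cross-validation accuracy.
            acc = svm_train(y, x, '-q -c %g -g %g -v 5' % (c, g))
            if acc > best[0]:
                best = (acc, c, g)

    print('best CV accuracy %.2f%% at c=%g g=%g' % best)
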
3. Model interpretation

The parameters in a support vector model are easier to explain using a two-class result, as follows:

svm_type c_svc
kernel_type linear (a linear classifier)
nr_class 2 (two classes)
total_sv 15 (total number of support vectors)
rho 0.307309
label 1 -1
nr_sv 8 7 (number of support vectors for the positive and negative class)
SV
1 1:7.213038 2:0.198066
1 1:-4.405302 2:0.414567
1 1:8.380911 2:0.210671
1 1:3.491775 2:0.275496
1 1:-0.926625 2:0.220477
1 1:-2.220649 2:0.406389
0.4752011717540238 1:1.408517 2:0.377613
0.4510429211309505 1:-8.633542 2:0.546162
-1 1:8.869004 2:-0.343454
-1 1:7.263065 2:-0.239257
-1 1:-4.2467 2:0.057275
-0.9262440928849748 1:0.755912 2:-0.225401
-1 1:-9.495737 2:-0.027652
-1 1:9.100554 2:-0.297695
-1 1:-3.93666 2:-0.047634
Support vectors fall into three cases. For positive-class data: a coefficient equal to C (the value set by the -c parameter) marks a bounded support vector lying inside the margin (between wx+b=+1 and wx+b=0) or misclassified; a coefficient strictly between 0 and C marks a support vector lying exactly on the margin boundary wx+b=+1. The same holds for negative-class data with the signs flipped. The support vector machine builds its model from these two kinds of support vectors. The third kind of data, points that are correctly classified and lie outside the margin, i.e., with |wx+b|>1, has coefficient 0; such points do not contribute to the model and therefore do not appear in the SV list.
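
To make this concrete, a sketch of how a prediction is computed from the model above: the decision value is the sum of coefficient * kernel(sv, x) over the support vectors, minus rho, and its sign picks the label. The snippet is abridged to four of the fifteen support vectors, so it illustrates the formula rather than reproducing the full model's output; the test point is made up:

    # decision_value(x) = sum_i coef_i * <sv_i, x> - rho; label 1 if positive, else -1.
    rho = 0.307309

    # (coefficient, [feature 1, feature 2]) rows copied from the SV list above (abridged).
    svs = [
        (1.0, [7.213038, 0.198066]),
        (0.4752011717540238, [1.408517, 0.377613]),
        (-1.0, [8.869004, -0.343454]),
        (-0.9262440928849748, [0.755912, -0.225401]),
    ]

    def decision_value(x):
        # Linear kernel: K(sv, x) is just the dot product.
        return sum(c * (s[0] * x[0] + s[1] * x[1]) for c, s in svs) - rho

    x = [7.0, 0.2]  # made-up test point
    label = 1 if decision_value(x) > 0 else -1
    print(label)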
