After the painful lesson of hand-writing an SVM (I was still too naive), I decided to use a toolbox / third-party library instead.
Python
LIBSVM's GitHub repository
LIBSVM is an open-source SVM implementation with interfaces for C, C++, Java, Python, R, and Matlab; here we use the Python version.
Installing LIBSVM
Copy the contents of the LIBSVM repository into the Python package directory \lib\site-packages, or into the project directory.
Create an empty file named __init__.py in both the LIBSVM root directory and its python subdirectory; these two empty files mark the directories as packages that Python can import directly.
Let me grumble about the odd root-directory-switching workarounds found in various blogs: this and this.
Because SVM is used often, I put the LIBSVM package in the \lib\site-packages directory. The LIBSVM Python interface can then be used from the interactive interpreter or from any script via import libsvm.python.
Using LIBSVM
LIBSVM is very simple to use: only a handful of interface functions need to be called.
Example 1:
from libsvm.python.svmutil import *
from libsvm.python.svm import *

y, x = [1, -1], [{1: 1, 2: 1}, {1: -1, 2: -1}]
prob = svm_problem(y, x)
param = svm_parameter('-t 0 -c 4 -b 1')
model = svm_train(prob, param)
yt = [1]
xt = [{1: 1, 2: 1}]
p_label, p_acc, p_val = svm_predict(yt, xt, model)
print(p_label)
Output Result:
optimization finished, #iter = 1
nu = 0.062500
obj = -0.250000, rho = 0.000000
nSV = 2, nBSV = 0
Total nSV = 2
test:
Model supports probability estimates, but disabled in prediction.
Accuracy = 100% (1/1) (classification)
[1.0]
Download train1.txt and test1.txt from the LIBSVM data sets.
LIBSVM can also read training data from a file, which is convenient when working with large-scale data.
Example:
from libsvm.python.svmutil import *
from libsvm.python.svm import *

y, x = svm_read_problem('train1.txt')
yt, xt = svm_read_problem('test1.txt')
model = svm_train(y, x)
print('test:')
p_label, p_acc, p_val = svm_predict(yt[200:202], xt[200:202], model)
print(p_label)
You can see the output:
optimization finished, #iter = 5371
nu = 0.606150
obj = -1061.528918, rho = -0.495266
nSV = 3053, nBSV = 722
Total nSV = 3053
test:
Accuracy = 40.809% (907/2225) (classification)
LIBSVM Training Data Format
The training data format for LIBSVM is as follows:
<label> <index1>:<value1> <index2>:<value2> ...
Example:
1 1:2.927699e+01 2:1.072510e+02 3:1.149632e-01 4:1.077885e+02
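As a sketch of what this format encodes, a few lines of plain Python can parse one such line into the label plus the sparse {index: value} dict that the Python interface uses. The function name parse_libsvm_line is my own, not part of LIBSVM; svm_read_problem does this (and more) internally.

```python
def parse_libsvm_line(line):
    """Parse one '<label> <index>:<value> ...' line into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(':')
        features[int(index)] = float(value)
    return label, features

label, x = parse_libsvm_line('1 1:2.927699e+01 2:1.072510e+02 3:1.149632e-01 4:1.077885e+02')
print(label)  # 1.0
print(x[1])   # 29.27699
```

Note that indices are sparse: features whose value is zero can simply be omitted from the line.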
Main types:
- svm_problem: holds the training data that defines the SVM model
- svm_parameter: stores the parameters needed to train the SVM model
- svm_model: a fully trained SVM model
- svm_node: a single feature value in the model, containing only an integer index and a floating-point value
Main interface:
- svm_problem(y, x)
Creates an svm_problem object from the training data y, x.
svm_train has three overloads:
model = svm_train(y, x [, 'training_options'])
model = svm_train(prob [, 'training_options'])
model = svm_train(prob, param)
Trains and returns an svm_model.
- svm_parameter('training_options')
Creates an svm_parameter object from an option string.
Example:
param = svm_parameter('-t 0 -c 4 -b 1')
- svm_predict
Call syntax:
p_labs, p_acc, p_vals = svm_predict(y, x, model [, 'predicting_options'])
Parameters:
- y: labels of the test data
- x: input vectors of the test data
- model: a trained SVM model
Return values:
- p_labs: a list of the predicted labels
- p_acc: a tuple of the prediction accuracy, the mean squared error, and the squared correlation coefficient (for regression)
- p_vals: the decision values (a measure of confidence in each decision), returned when the option '-b 1' is specified
This function is not only a testing interface but also the interface for classification in production use. Oddly, the test labels y must be supplied to make a prediction; since y does not affect the prediction results, it can simply be replaced with a zero vector.
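The zero-vector trick can be wrapped in a small helper. This is a sketch: predict_unlabeled is my own name, not a LIBSVM function, and the predictor is passed in as an argument so the sketch stays self-contained; it is expected to follow svm_predict's signature.

```python
def predict_unlabeled(xt, model, predict):
    """Predict labels for unlabeled inputs by passing dummy zero labels.

    `predict` is expected to follow svm_predict's signature:
    predict(y, x, model) -> (p_labs, p_acc, p_vals).
    """
    dummy_y = [0] * len(xt)  # labels are ignored by the prediction itself
    p_labs, _p_acc, _p_vals = predict(dummy_y, xt, model)
    return p_labs
```

With LIBSVM loaded, this would be called as predict_unlabeled(xt, model, svm_predict); the accuracy reported in p_acc is of course meaningless for the dummy labels.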
- svm_read_problem
Reads training data in LIBSVM format:
y, x = svm_read_problem('data.txt')
- svm_save_model
Stores a trained svm_model in a file:
svm_save_model('model_file', model)
- svm_load_model
Reads an svm_model stored in a file:
model = svm_load_model('model_file')
Tuning SVM Parameters
LIBSVM takes a series of parameters that control the training and prediction process.
Parameters of svm_train:
Options that adjust parameters of the SVM or of the kernel function:
- -d: degree parameter of the kernel function, default 3
- -g: gamma parameter of the kernel function, default 1/num_features
- -r: coef0 parameter of the kernel function, default 0
- -c: cost parameter of C-SVC, epsilon-SVR, and nu-SVR, default 1
- -n: nu parameter (an error-rate bound) of nu-SVC, one-class SVM, and nu-SVR, default 0.5
- -p: epsilon parameter in the loss function of epsilon-SVR, default 0.1
- -m: size of the internal cache in megabytes, default 100
- -e: tolerance of the termination criterion, default 0.001
- -wi: weight on the cost parameter C for class i in C-SVC
Options that adjust the algorithm's behavior:
- -b: whether to train for probability estimates, 0 or 1, default 0
- -h: whether to use the shrinking heuristics, 0 or 1, default 1
- -v n: n-fold cross-validation mode
- -q: quiet mode
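These option strings can be assembled from plain Python values rather than typed by hand; a minimal sketch, where build_options is my own helper and not part of LIBSVM:

```python
def build_options(**flags):
    """Turn keyword arguments into a LIBSVM option string.

    build_options(t=0, c=4, b=1) -> '-t 0 -c 4 -b 1'
    Per-class weights work too: build_options(w1=2) -> '-w1 2'.
    """
    return ' '.join('-{} {}'.format(name, value) for name, value in flags.items())

print(build_options(t=0, c=4, b=1))  # -t 0 -c 4 -b 1
```

The resulting string would be passed to svm_parameter or svm_train exactly like the hand-written '-t 0 -c 4 -b 1' in the earlier example.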
Matlab
The usage of LIBSVM's Matlab interface is similar, and Matlab's rich standard toolboxes provide various conveniences.
The Statistics Toolbox provides the svmtrain and svmclassify functions for SVM classification.
traindata = [0 1; -1 0; 2 2; 3 3; -2 -1; -4.5 -4; 2 -1; -1 -3];
group = [1 1 -1 -1 1 1 -1 -1]';
testdata = [5 2; 3 1; -4 -3];
svm_struct = svmtrain(traindata, group);
Group = svmclassify(svm_struct, testdata);
svmtrain accepts two arguments, traindata and group: traindata holds one sample per row, and group holds the classification labels for the samples in traindata, denoted by 1 and -1.
svmtrain returns a struct, svm_struct, which stores the parameters of the trained SVM.
svmclassify accepts svm_struct and a testdata matrix with one sample per row, and returns the classification results as a column vector of 1s and -1s.