1. Required software downloads:
(1) LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/)
(2) Python
(3) Gnuplot drawing software (ftp://ftp.gnuplot.info/pub/gnuplot/)
Only the Windows environment is considered here:
1. Download the LIBSVM zip package and unzip it to any folder (e.g. D:\GJS\LIBSVM).
2. Install Python (mine is 2.7.3).
3. Download gnuplot and unzip it directly; no installation is needed (e.g. C:\gnuplot).
2. Data Format description
0 1:5.1 2:3.5 3:1.4 4:0.2
2 1:4.9 2:3 3:1.4 4:0.2
1 1:4.7 2:3.2 3:1.3 4:0.2
[label] [index1]:[value1] [index2]:[value2] [index3]:[value3] ...
[label]: the category (usually an integer); [index n]: a feature index in increasing order; [value n]: the feature value (a real number).
You may need to convert your own training and test data into this format.
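For example, converting a plain comma-separated row (label first, then feature values) into this format takes only a few lines of Python; csv_row_to_libsvm is an illustrative helper, not part of LIBSVM:

```python
def csv_row_to_libsvm(row):
    """Convert one CSV row (label first, then features) to LIBSVM format.

    Zero-valued features may be omitted because LIBSVM treats the
    format as sparse; feature indices start at 1.
    """
    fields = row.strip().split(",")
    label = fields[0]
    pairs = []
    for i, value in enumerate(fields[1:], start=1):
        if float(value) != 0.0:  # skip zeros: the format is sparse
            pairs.append("%d:%s" % (i, value))
    return " ".join([label] + pairs)

print(csv_row_to_libsvm("0,5.1,3.5,1.4,0.2"))  # -> 0 1:5.1 2:3.5 3:1.4 4:0.2
```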
3. How to use
1. From the Windows cmd command window:
The downloaded LIBSVM package comes precompiled for Windows.
Enter libsvm\windows and you will see these EXE files:
1. svm-predict: svm-predict test_file model_file output_file — uses an already-trained model to predict the class of new data and writes the predictions to an output file.
2. svm-scale: rescales the feature data when the feature values fluctuate over a wide range, e.g. to [0, 1] (the range is your choice).
3. svm-toy: a small graphical interface where you can draw points and generate data.
4. svm-train: svm-train [options] train_file [model_file] — accepts input in the format above and produces a model file.
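The core idea behind svm-scale is a per-feature linear (min-max) rescaling; a minimal sketch of that idea in plain Python (not svm-scale's actual implementation):

```python
def min_max_scale(columns, lower=0.0, upper=1.0):
    """Linearly rescale each feature column to [lower, upper],
    the same idea svm-scale applies per feature."""
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        if hi == lo:  # constant feature: map everything to lower
            scaled.append([lower] * len(col))
        else:
            scaled.append([lower + (upper - lower) * (v - lo) / (hi - lo)
                           for v in col])
    return scaled

# Example: a feature ranging over [10, 30] becomes [0, 0.5, 1].
print(min_max_scale([[10.0, 20.0, 30.0]]))  # -> [[0.0, 0.5, 1.0]]
```

Note that the same minimum and maximum found on the training data must be reused when scaling the test data, which is why svm-scale can save and restore its scaling parameters.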
Step 1: generate your own data using svm-toy.
Double-click svm-toy.exe, click Change to switch the class label, and draw points on the canvas.
Click Run; this actually runs the training process and partitions the canvas into regions.
Click Save to save the data (assume it is saved as D:\libsvm.txt).
Step 2: build a model from the training data libsvm.txt using svm-train.
In a cmd window, go to the windows subdirectory of the unzipped LIBSVM folder and run svm-train on the data file.
In the training output:
#iter is the number of iterations of the optimizer;
nu is the parameter of the equivalent nu-SVM formulation;
obj is the optimal objective value of the dual quadratic problem that SVM training is converted into;
rho is the bias term b of the decision function;
nSV is the number of support vectors;
nBSV is the number of bounded support vectors (alpha[i] = C);
Total nSV is the total number of support vectors: for a two-class problem there is only one model, so Total nSV = nSV, but for multi-class problems it is the sum of the nSV of each pairwise model.
A trained model file (libsvm.txt.model) is also generated in this directory; open it to see its contents, including the parameters and the support vectors.
Step 3: use the trained model for prediction with svm-predict.
An output file (libsvm.txt.out) is generated; each row contains the predicted category for the corresponding row of the input.
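To see how rho and the support vectors above fit together: a trained model classifies a point x by the sign of f(x) = sum_i alpha_i * y_i * K(sv_i, x) - rho. A hand-rolled sketch with the Gaussian (RBF) kernel, using made-up support vectors and coefficients rather than values read from a real .model file:

```python
import math

def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2), the Gaussian (RBF) kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, coefs, rho, gamma):
    """f(x) = sum_i coef_i * K(sv_i, x) - rho, where coef_i = alpha_i * y_i.
    The predicted class is the sign of f(x)."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coefs)) - rho

# Illustrative "model": two support vectors with opposite-sign coefficients.
svs = [[1.0, 1.0], [-1.0, -1.0]]
coefs = [1.0, -1.0]
label = 1 if decision_value([0.9, 1.1], svs, coefs, rho=0.0, gamma=0.5) > 0 else -1
print(label)  # -> 1, since [0.9, 1.1] lies near the positive support vector
```

The coefficients alpha_i * y_i and the value rho are exactly what svm-train stores in the model file.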
Parameter optimization:
Parameter optimization for an SVM is very important, and the LIBSVM package includes parameter-search tools, which essentially brute-force (grid-search) over the parameters. In general we use the Gaussian (RBF) kernel, which involves two parameters, C and g.
Parameter selection using the grid.py script:
grid.py lives in libsvm/tools. First edit the gnuplot path inside grid.py so that it points to the directory where gnuplot is stored.
Then go to the directory containing grid.py and run grid.py on D:\libsvm.txt.
The first two numbers of the output are the best values of C and g; retrain the model with these parameters (adding -c and -g to svm-train).
The accuracy then improves noticeably. In fact, all of these steps can be carried out by easy.py alone; likewise, you must first edit the gnuplot path inside easy.py to point to the gnuplot directory.
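grid.py's strategy is simply an exhaustive search over exponentially spaced (C, g) pairs, keeping the pair with the best cross-validation accuracy. A stripped-down sketch of that loop; toy_score is a stand-in for where grid.py would actually run cross-validation (svm-train -v 5):

```python
def grid_search(score, c_exps=range(-5, 16, 2), g_exps=range(-15, 4, 2)):
    """Exhaustively try (C, g) = (2**i, 2**j) and return the best pair,
    mirroring the exponential grid grid.py uses by default."""
    best = None
    for ci in c_exps:
        for gi in g_exps:
            c, g = 2.0 ** ci, 2.0 ** gi
            acc = score(c, g)  # grid.py would run 5-fold cross-validation here
            if best is None or acc > best[0]:
                best = (acc, c, g)
    return best

# Stand-in scorer that peaks at C=8, g=0.5 (a real run would cross-validate).
def toy_score(c, g):
    return -((c - 8.0) ** 2 + (g - 0.5) ** 2)

acc, best_c, best_g = grid_search(toy_score)
print(best_c, best_g)  # -> 8.0 0.5
```

The exponential spacing matters: C and g influence the model multiplicatively, so a coarse log-scale sweep covers a wide range cheaply before any finer search.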
The steps are summarized as follows:
1. Convert the training data to the appropriate format.
2. If necessary, use svm-scale to scale the data, which helps the training.
3. Use grid.py or easy.py to optimize the parameters.
4. Build the model with svm-train and make predictions with svm-predict.
2. Using the Python interface:
>>> import os
>>> os.chdir('d:/gjs/libsvm/python')
>>> from svmutil import *
>>> y, x = svm_read_problem('d:/libsvm.txt')
>>> m = svm_train(y, x, '-c 8.0 -g 8.0')
>>> p_label, p_acc, p_val = svm_predict(y, x, m)
Accuracy = 96.1538% (25/26) (classification)
>>> import os
>>> os.chdir('d:/gjs/libsvm/python')
>>> from svmutil import *
>>> data = svm_problem([1, -1], [[1, 0, 1], [-1, 0, -1]])  # first list holds the class labels
>>> param = svm_parameter('-c 8.0 -g 8.0')
>>> model = svm_train(data, param)
>>> svm_predict([1], [[1, 1, 1]], model)
>>> svm_predict([1, -1], [[1, -1, -1], [1, 1, 1]], model)
Accuracy = 0% (0/2) (classification)
([-1.0, 1.0], (0.0, 4.0, 1.0), [[0.0], [0.00033546262790251185]])
3. Using LIBSVM in Weka:
See: http://datamining.xmu.edu.cn/~gjs/project/LibD3C.html
4. Calling LIBSVM from Eclipse:
http://datamining.xmu.edu.cn/~gjs/download/LibSVM.jar
http://datamining.xmu.edu.cn/~gjs/download/libsvm.jar
Download the two LIBSVM jar packages above and add them to the Eclipse project's build path:
DataSource source = new DataSource("D:/iris.arff");
LibSVM clas = new LibSVM();
String[] optSVM = weka.core.Utils.splitOptions("-C 8.0 -G 8.0");
clas.setOptions(optSVM);
Instances data = source.getDataSet();
data.setClassIndex(data.numAttributes() - 1);
Evaluation eval = new Evaluation(data);
eval.crossValidateModel(clas, data, 10, new Random(1)); // 10 folds assumed; the fold count was lost in the original
System.out.println(eval.toClassDetailsString());
System.out.println(eval.toSummaryString());
System.out.println(eval.toMatrixString());
The output is:
5. Use LIBSVM under Linux:
Confirm that Python is installed.
1. wget http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz
2. tar -zxvf /home/gjs/libsvm.tar.gz
3. Go into the unpacked directory and run make to compile.
4. ./svm-train /home/gjs/libsvm.txt — the other tools are used the same way.
5. python grid.py /home/gjs/libsvm.txt to optimize the parameters.
Summary of LIBSVM usage methods