A statistical learning method consists of three elements: a model, a strategy, and an algorithm. Constructing a specific statistical learning method (for example, a support vector machine) amounts to determining these three elements.
1 Support Vector Machine
An SVM (support vector machine) is a kind of binary classification model.
1) Basic model
A linear classifier defined on a feature space with the largest margin.
2) Learning Strategies (strategy)
Margin maximization, which can be formalized as solving a convex quadratic programming problem.
3) Learning algorithm (algorithm)
An optimization algorithm for convex quadratic programming.
The training sample data can be divided into three categories: linearly separable, approximately linearly separable, and linearly non-separable.
The corresponding SVMs for these three kinds of sample data are: the linearly separable SVM (hard margin maximization), the linear SVM (soft margin maximization), and the nonlinear SVM (kernel trick + soft margin maximization).
This article mainly introduces the linearly separable support vector machine. For convenience, the support vector machine or SVM mentioned below refers to the linearly separable support vector machine.
2 Basic Concepts
2.1 Hyperplane (hyperplane)
In n-dimensional Euclidean space, a linear subspace of codimension 1 (that is, of dimension n-1) is called a hyperplane.
A hyperplane is a straight line in two-dimensional space and a plane in three-dimensional space, and it can be used to separate data. As shown in the figure, the hyperplane (a line) separates two different classes of data (round points and square points).
If a data point is denoted x (an n-dimensional vector), the equation of the hyperplane is $f(x) = \beta_{0} + \beta^{T} x = 0$, where $\beta$ is the weight vector (some books call it the normal vector).
Explanation: in the figure on the right, the unit normal vector of $\beta$ (the green line) is $\beta^{*} = \dfrac{\beta}{||\beta||}$, and the distance from any point x to the hyperplane is $r = \dfrac{|\beta_{0} + \beta^{T} x|}{||\beta||}$.
For comparison, in plane coordinates the distance from a point $(x_{0}, y_{0})$ to the line $Ax + By + C = 0$ is $d = \dfrac{|Ax_{0} + By_{0} + C|}{\sqrt{A^{2} + B^{2}}}$.
2.2 Support Vectors (support vector)
If the output y takes the values +1 and -1, representing the two different categories, then for an input x the corresponding f(x) has three possible cases:
1) On the hyperplane (on the straight line in the figure): $f(x) = \beta_{0} + \beta^{T} x = 0$
2) On the left side of the hyperplane: $f(x) = \beta_{0} + \beta^{T} x \leq -1$
3) On the right side of the hyperplane: $f(x) = \beta_{0} + \beta^{T} x \geq +1$
Assuming there is a hyperplane that classifies the N sample data correctly, then every sample $(x_{i}, y_{i})$ satisfies the constraint:
$\quad y_{i} (\beta^{T} x_{i} + \beta_{0}) \geq 1, \; i = 1, 2, \ldots, N$
As shown in the figure, the sample points closest to the hyperplane, for which the equality in 2) and 3) holds, are called "support vectors".
2.3 Geometric Margin (geometric margin)
Because the equality in 2) and 3) holds for the support vectors, their distance to the hyperplane is:
$\quad r = \dfrac{|\beta_{0} + \beta^{T} x|}{||\beta||} = \dfrac{1}{||\beta||}$
For two support vectors of different classes (with values +1 and -1 respectively), the sum of their distances to the hyperplane is:
$\quad r^{'} = \dfrac{2}{||\beta||}$, where $r^{'}$ is called the "geometric margin" (geometric margin)
The distance from a point to the hyperplane can be used to indicate the accuracy and confidence of the classification result.
Intuitively, the closer the hyperplane is to the exact middle of the two classes of sample data (that is, the farther both classes of data points are from the hyperplane), the higher the accuracy and certainty of the classification result.
2.4 Learning Algorithms
SVM's learning algorithm (also called the maximum margin method) finds, from the given sample data, a hyperplane with the "maximum margin" that separates the different classes of samples.
That is, maximize $r^{'}$ while the constraints are satisfied:
$\quad \max\limits_{\beta,\;\beta_{0}} \dfrac{2}{||\beta||} \quad \text{subject to} \quad y_{i} (\beta^{T} x_{i} + \beta_{0}) \geq 1, \; i = 1, 2, \ldots, N$
Equivalently, since maximizing $\dfrac{2}{||\beta||}$ is the same as minimizing $||\beta||$, and hence $||\beta||^{2}$, the problem can be rewritten as minimizing $\dfrac{1}{2}||\beta||^{2}$:
$\quad \min\limits_{\beta,\;\beta_{0}} \dfrac{1}{2} ||\beta||^{2} \quad \text{subject to} \quad y_{i} (\beta^{T} x_{i} + \beta_{0}) \geq 1, \; i = 1, 2, \ldots, N$
3 OpenCV function
The implementation of SVM in OpenCV is based on LIBSVM. The basic workflow is: create the SVM model --> set the relevant parameters --> train on the sample data --> predict.
1) Create a model
static Ptr<SVM> cv::ml::SVM::create();  // create an empty model
2) Setting parameters
virtual void cv::ml::SVM::setType(int val);  // set the type of SVM, default is SVM::C_SVC
virtual void cv::ml::SVM::setKernel(int kernelType);  // set the kernel function type; this article uses a linear kernel, set to SVM::LINEAR
virtual void cv::ml::SVM::setTermCriteria(const cv::TermCriteria & val);  // set the iteration termination criteria
cv::TermCriteria::TermCriteria(
    int type,       // criteria type
    int maxCount,   // maximum number of iterations
    double epsilon  // target accuracy
);
3) Training (train)
virtual bool cv::ml::StatModel::train(
    InputArray samples,   // training samples
    int layout,           // layout of the training samples: "row sample" ROW_SAMPLE or "column sample" COL_SAMPLE
    InputArray responses  // classification results of the corresponding sample data
);
4) Prediction (predict)
The predict function can be used to predict the response of a new sample; its parameters are as follows:
virtual float cv::ml::StatModel::predict(
    InputArray samples,               // input sample data, a floating-point matrix
    OutputArray results = noArray(),  // output matrix of results, not output by default
    int flags = 0                     // flags, default is 0
) const;
4 Code Examples
The following is based on the official routine in OpenCV 3.0, with the training sample data modified and the call that retrieves the support vectors removed.
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/ml.hpp>

using namespace cv;
using namespace cv::ml;

int main()
{
    // 512 x 512 image for visualizing the result
    int width = 512, height = 512;
    Mat image = Mat::zeros(height, width, CV_8UC3);

    // training samples (several coordinate values below are illustrative placeholders)
    float trainingData[6][2] = { {150, 150}, {245, 10}, {480, 100}, {160, 380}, {50, 400}, {420, 440} };
    int labels[6] = {-1, 1, 1, 1, -1, 1};  // the output of each sample; a binary model, so the output is +1 or -1
    Mat trainingDataMat(6, 2, CV_32FC1, trainingData);
    Mat labelsMat(6, 1, CV_32SC1, labels);

    // train the SVM
    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR);
    svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));
    svm->train(trainingDataMat, ROW_SAMPLE, labelsMat);

    // show the decision regions of the two classes
    Vec3b green(0, 255, 0), blue(255, 0, 0);
    for (int i = 0; i < image.rows; ++i)
        for (int j = 0; j < image.cols; ++j)
        {
            Mat sampleMat = (Mat_<float>(1, 2) << j, i);
            float response = svm->predict(sampleMat);
            if (response == 1)
                image.at<Vec3b>(i, j) = blue;
            else if (response == -1)
                image.at<Vec3b>(i, j) = green;
        }

    // draw the training sample data
    int thickness = -1;
    int lineType = 8;
    circle(image, Point(150, 150), 5, Scalar(0, 0, 0), thickness, lineType);
    circle(image, Point(245, 10), 5, Scalar(255, 255, 255), thickness, lineType);
    circle(image, Point(480, 100), 5, Scalar(255, 255, 255), thickness, lineType);
    circle(image, Point(160, 380), 5, Scalar(0, 0, 255), thickness, lineType);
    circle(image, Point(50, 400), 5, Scalar(255, 255, 255), thickness, lineType);
    circle(image, Point(420, 440), 5, Scalar(0, 0, 255), thickness, lineType);

    imwrite("result.png", image);  // save the result of the training
    imshow("SVM Simple Example", image);
    waitKey(0);
}
In OpenCV 3.0, the function that gets the support vectors is getSupportVectors(). However, when the kernel is set to SVM::LINEAR, this function does not actually return the support vectors; this is a flaw of the 3.0 version.
For this reason, version 3.1 added a new function to get the support vectors: getUncompressedSupportVectors().
int thickness = 2;
int lineType = 8;
Mat sv = svm->getUncompressedSupportVectors();
for (int i = 0; i < sv.rows; ++i)
{
    const float* v = sv.ptr<float>(i);
    // mark each support vector with a gray circle
    circle(image, Point((int)v[0], (int)v[1]), 6, Scalar(128, 128, 128), thickness, lineType);
}
The actual running result is shown in the figure; as described above, the three white dots nearest the hyperplane are the "support vectors".
Resources:
Machine Learning, Zhou Zhihua, Chapter 6
Statistical Learning Methods, Li Hang, Chapter 7
The Elements of Statistical Learning, 2nd ed., Ch 4.5 and Ch 12
"Support Vector Machine Series", pluskid
OpenCV 3.0.0 Tutorials
"LIBSVM -- A Library for Support Vector Machines"
OpenCV Support Vector Machine (1)