The SVM implementation in sklearn is built on two packages, liblinear and LIBSVM, and their model parameters differ slightly. In sklearn, SVM is split into SVC (classification) and SVR (regression), and there are four kernel functions, listed below, so some parameters apply to certain kernels and not to others:

linear: <x, x'>
polynomial: (gamma*<x, x'> + r)^d, where d is specified by the keyword degree and r by coef0
rbf: exp(-gamma*||x - x'||^2), where gamma is specified by the keyword gamma and must be greater than 0
sigmoid: tanh(gamma*<x, x'> + r), where r is specified by coef0
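These formulas can be checked directly against sklearn's pairwise kernel helpers; a minimal sketch (the data and parameter values here are arbitrary):

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

rng = np.random.RandomState(0)
X = rng.rand(5, 3)
gamma, degree, coef0 = 0.5, 3, 1.0

# rbf: exp(-gamma * ||x - x'||^2)
by_hand = np.exp(-gamma * ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
assert np.allclose(by_hand, rbf_kernel(X, gamma=gamma))

# polynomial: (gamma * <x, x'> + coef0) ** degree
by_hand = (gamma * X @ X.T + coef0) ** degree
assert np.allclose(by_hand, polynomial_kernel(X, degree=degree, gamma=gamma, coef0=coef0))
```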
The SVM model in sklearn has many parameters, attributes, and methods. Taking svm.SVC with kernel='linear' (referred to as linear below) as an example:

linear.C (penalty factor of the regularization term)
linear.degree (polynomial kernel only, defaults to a degree-3 polynomial)
linear.max_iter (maximum number of iterations)
linear.random_state
linear.cache_size (kernel cache size in MB)
linear.dual_coef_ (the dual coefficients alpha_i*y_i of the support vectors)
linear.n_support_ (number of support vectors per class)
linear.score
linear.class_weight
linear.epsilon
linear.nu
linear.set_params
linear.class_weight_
linear.fit (method, for model training)
linear.predict (method, for model prediction)
linear.shape_fit_
linear.classes_
linear.fit_status_
linear.predict_log_proba (predicts log probability values)
linear.shrinking (whether to use the shrinking heuristic)
linear.coef0 (used by the polynomial/sigmoid kernels)
linear.gamma
linear.predict_proba (predicts probability values)
linear.support_ (indices of the support vectors)
linear.coef_ (the weight of each feature, i.e. the predicted w)
linear.get_params
linear.probA_
linear.probB_
linear.support_vectors_
linear.decision_function (method, computes decision values)
linear.intercept_ (the intercept, i.e. b; LIBSVM reports this as rho, with b = -rho)
linear.tol (stopping tolerance for model training)
linear.decision_function_shape
linear.kernel (kernel type)
linear.probability (True/False, whether probability estimates are enabled; slightly affects speed)
linear.verbose
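As a quick orientation, a minimal sketch of fitting such a model and reading back the most useful of these attributes (the dataset here is synthetic):

```python
from sklearn import svm
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
linear = svm.SVC(kernel='linear', C=1.0, probability=True).fit(X, y)

print(linear.coef_)          # w: one weight per feature (linear kernel only)
print(linear.intercept_)     # b: the intercept
print(linear.support_)       # indices of the support vectors in X
print(linear.n_support_)     # number of support vectors per class
print(linear.dual_coef_)     # alpha_i * y_i for each support vector
print(linear.predict_proba(X[:3]))  # requires probability=True before fit
```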
Computing the decision function: linear.decision_function
For the linear kernel, the final separating hyperplane is linear.coef_ * x + linear.intercept_ = 0, and the distance from a point x to the hyperplane is |linear.coef_ * x + linear.intercept_| / ||linear.coef_||. The decision function, however, is not the distance; it is computed directly as:

decision_function = linear.coef_ * x + linear.intercept_

If it is greater than 0, the predicted label is 1; otherwise the predicted label is 0. This matches the definition of the decision function in sklearn.

The reason linear.coef_ * x + linear.intercept_ can be used directly is that linear.coef_ is the product of linear.dual_coef_ (a vector of length n, where n is the number of support vectors) and linear.support_vectors_ (a 2-dimensional array of shape [number of support vectors, number of features]); the resulting inner product linear.coef_ is a vector whose size equals the number of features.
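Both identities are easy to verify on a fitted model; a self-contained sketch on synthetic data:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
linear = svm.SVC(kernel='linear').fit(X, y)

# decision_function is exactly coef_ . x + intercept_ for the linear kernel
by_hand = X @ linear.coef_.ravel() + linear.intercept_[0]
assert np.allclose(by_hand, linear.decision_function(X))

# coef_ itself is dual_coef_ [1, n_SV] times support_vectors_ [n_SV, n_features]
assert np.allclose(linear.dual_coef_ @ linear.support_vectors_, linear.coef_)

# prediction is the sign of the decision value (labels here are 0/1)
assert np.array_equal(linear.predict(X), (linear.decision_function(X) > 0).astype(int))
```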
For LIBSVM, you can refer to the following: http://my.oschina.net/u/1461744/blog/209104

Recently I have been working with data that requires computing the distance from points to the SVM classification hyperplane, and the SVM I use is LIBSVM. Being a novice, despite having read a fair amount of material, translation errors between Chinese and English often led me astray, so my understanding of LIBSVM came in fits and starts. I took many wrong turns while working out the meaning of LIBSVM's various return values and how to use the resulting model file.
The first answer to this question is in LIBSVM's own FAQ:

Q: How do I get the distance between a point and the hyperplane?
The distance is |decision_value| / |w|. We have |w|^2 = w^T*w = alpha^T Q alpha = 2*(dual_obj + sum alpha_i). Thus in svm.cpp, find where the dual objective value is calculated (i.e., the subroutine Solve()) and add a statement to print w^T*w.
We don't know what Q is, but at least we now know:

distance = |decision_value| / |w| = |decision_value| / sqrt(2*(dual_obj + sum(alpha_i)))
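As an aside: in sklearn, where the dual coefficients are exposed directly, the same quantity can be computed without patching svm.cpp, since w^T*w = sum_ij (alpha_i*y_i)(alpha_j*y_j) K(x_i, x_j) and dual_coef_ stores exactly alpha_i*y_i. A sketch for an RBF model (synthetic data, arbitrary gamma):

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = svm.SVC(kernel='rbf', gamma=0.1).fit(X, y)

# w^T w = dual_coef_ . K . dual_coef_^T over the support vectors
K = rbf_kernel(clf.support_vectors_, gamma=clf.gamma)
w_norm = np.sqrt(clf.dual_coef_ @ K @ clf.dual_coef_.T).item()

# distance of every training point to the separating surface in feature space
dist = np.abs(clf.decision_function(X)) / w_norm
```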
But what on earth is decision_value? And where exactly do dual_obj and sum(alpha_i) come from? As a beginner, this blogger wanted to howl in frustration...
Let's take a look at what can be obtained from LIBSVM's return values and result files. First, when training the model (using the svm_train() function), you get these values at the end (I run Python programs on a terminal):
#iter: number of iterations
nu: parameter of the selected kernel function type
obj: the minimum value of the quadratic programming problem that the SVM is converted into (something strange has snuck in here; yes, hidden this deep is the obj from the distance formula above)
rho: the bias term of the decision function (decision function f(x) = w^T*x + b, with b = -rho)
nSV: number of standard support vectors (0 < alpha_i < C)
nBSV: number of bounded support vectors (alpha_i = C)
Total nSV: total number of support vectors (for two classes this equals nSV; for multi-class it counts the nSV over the multiple decision surfaces)
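For reference, a minimal sketch of producing that output with LIBSVM's Python bindings (assuming the classic svmutil module shipped with the LIBSVM distribution; in the pip-packaged version the import is `from libsvm.svmutil import ...`):

```python
from svmutil import svm_read_problem, svm_train, svm_predict

# heart_scale is the example data file that ships with LIBSVM
y, x = svm_read_problem('heart_scale')

# '-t 2' selects the RBF kernel; training prints #iter, nu, obj, rho, nSV, nBSV
model = svm_train(y, x, '-c 4 -t 2')

# p_val holds the decision values we will need for the distance
p_label, p_acc, p_val = svm_predict(y, x, model)
```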
(I'm definitely not going to tell you that the explanations of these return values come from here: http://blog.163.com/shuangchenyue_8/blog/static/399543662010328101618513/)
Next, look at the model file generated by the training module, which includes the following information:

svm_type c_svc (type of SVM; the default value, taken here)
kernel_type rbf (type of kernel function; the default value, taken here)
gamma 0.0117647 (the parameter gamma)
nr_class 2 (number of classes; here I have 2)
total_sv 1684 (total number of support vectors)
rho -0.956377
label 0 1 (labels of the two classes)
nr_sv 1338 346 (how many of all the support vectors each class accounts for)
SV (ah, here come the actual support vectors; let's see what they look like)
0.536449657223129 1:39 2:2 3:3 ...
0.3766245470441405 1:11 3:3 4:3 ...

Each line's format is: a number, then a space, then a vector.
What is this number in front? It is exactly the alpha we have been looking for (strictly speaking, the coefficient alpha_i*y_i). The vector that follows needs little explanation: it is the support vector itself. What differs from the training data is the storage format: sparse vector storage, where entries whose value is 0 are simply not recorded. For example, here

1:11 3:3 4:3 ...

is actually

1:11 2:0 3:3 4:3 ...
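A throwaway helper (hypothetical, not part of LIBSVM) that expands one such sparse SV line back into the coefficient and a feature dictionary:

```python
def parse_sv_line(line):
    """Parse one SV line of a LIBSVM model file: '<coef> <idx>:<val> ...'."""
    parts = line.split()
    coef = float(parts[0])               # the alpha_i * y_i coefficient
    feats = {}
    for item in parts[1:]:
        idx, val = item.split(':')
        feats[int(idx)] = float(val)     # indices absent here are implicitly 0
    return coef, feats

coef, feats = parse_sv_line('0.3766245470441405 1:11 3:3 4:3')
# feats == {1: 11.0, 3: 3.0, 4: 3.0}; feature 2 is the implicit zero
```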
So far our obj and alpha have appeared, and now for decision_value. In fact, at this point we can already work out the classification hyperplane:

f(x) = w^T*x + b
w = sum_i alpha_i*y_i*x_i, where the x_i are the support vectors (for the derivation, see this very good SVM write-up: http://blog.csdn.net/v_july_v/article/details/7624837)

So alpha is known and b is known (it is the rho above, with b = -rho), which gives the equation of the classification hyperplane. By the distance from a point to a plane in space, d = |f(x)| / |w|, substitute in the point x under examination, and there it is. Figured it out...
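Putting the pieces together, a sketch of the whole computation. It uses sklearn's SVC as a stand-in, since its dual_coef_ and support_vectors_ hold exactly the numbers found in the model file's SV lines, and intercept_ plays the role of b = -rho:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = svm.SVC(kernel='linear').fit(X, y)

# w = sum_i (alpha_i * y_i) * x_i over the support vectors
w = (clf.dual_coef_ @ clf.support_vectors_).ravel()
b = clf.intercept_[0]                     # b = -rho

x_point = X[0]
f_x = w @ x_point + b                     # decision value f(x)
d = abs(f_x) / np.linalg.norm(w)          # distance from x to the hyperplane

# sanity check: f(x) matches what the library reports
assert np.isclose(f_x, clf.decision_function(X[:1])[0])
```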
But sometimes you feel lazy and don't want to compute |f(x)| yourself. What then?