Calculating the decision function (distance to the hyperplane) in SVM

Source: Internet
Author: User
Tags svm

The SVM implementation in sklearn is built on two packages, liblinear and LIBSVM, and their model parameters differ slightly.

In sklearn, SVM is divided into SVC (classification) and SVR (regression), and the four kernel functions below are available; depending on the kernel, some model parameters are used and others are ignored.

linear: ⟨x, x′⟩.
polynomial: (γ⟨x, x′⟩ + r)^d, where d is specified by the keyword degree and r by coef0.
rbf: exp(−γ‖x − x′‖²), where γ is specified by the keyword gamma and must be greater than 0.
sigmoid: tanh(γ⟨x, x′⟩ + r), where r is specified by coef0.

The SVM model in sklearn has many parameters (and functions). Taking linear = svm.SVC(kernel='linear') as an example, they look like this:
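As a quick illustration, the four kernels can be evaluated directly in NumPy; a minimal sketch (the point values below are made up for the example):

```python
import numpy as np

# The four kernel functions available in sklearn's SVC, written out explicitly.
def linear_kernel(x, z):
    return x @ z

def poly_kernel(x, z, gamma=1.0, coef0=0.0, degree=3):
    return (gamma * (x @ z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    return np.tanh(gamma * (x @ z) + coef0)

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(linear_kernel(x, z))   # <x, z> = 4.0
print(poly_kernel(x, z))     # (1.0 * 4.0 + 0)^3 = 64.0
print(rbf_kernel(x, z))      # exp(-0.5 * ||x - z||^2) = exp(-3.125)
print(sigmoid_kernel(x, z))  # tanh(0.5 * 4.0 + 0) = tanh(2.0)
```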

linear.C (regularization penalty factor)
linear.degree (polynomial kernel only; default is degree 3)
linear.max_iter (maximum number of iterations)
linear.random_state
linear.cache_size (kernel cache size, in MB)
linear.dual_coef_ (coefficients of the support vectors in the dual problem; also used when computing sigmoid probabilities)
linear.n_support_ (number of support vectors per class)
linear.score
linear.class_weight
linear.epsilon
linear.nu
linear.set_params
linear.class_weight_
linear.fit (function, for model training)
linear.predict (function, for model prediction)
linear.shape_fit_
linear.classes_
linear.fit_status_
linear.predict_log_proba (log-probability prediction)
linear.shrinking
linear.coef0 (used by the polynomial/sigmoid kernels)
linear.gamma
linear.predict_proba (probability prediction)
linear.support_ (indices of the support vectors)
linear.coef_ (per-feature weights, i.e. the learned w)
linear.get_params
linear.probA_ / linear.probB_
linear.support_vectors_
linear.decision_function (function, computes the decision value)
linear.intercept_ (intercept, i.e. b; the counterpart of rho in LIBSVM prediction)
linear.tol (tolerance of the stopping criterion for model training)
linear.decision_function_shape
linear.kernel (kernel type)
linear.probability (True/False, whether probability estimates are enabled; slightly affects speed)
linear.verbose

Computing the decision function with linear.decision_function

Because the separating hyperplane finally learned with the linear kernel is y = linear.coef_ · x + linear.intercept_, the distance from a point x to the hyperplane is

|linear.coef_ · x + linear.intercept_| / ‖linear.coef_‖

The decision function, however, is not this distance; it is computed directly as:

decision_function = linear.coef_ · X + linear.intercept_

If the value is greater than 0, the label is predicted as 1; otherwise it is predicted as 0.
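This relationship is easy to verify; a minimal sketch with toy data invented for illustration (assuming scikit-learn is installed):

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data invented for illustration.
X = np.array([[0.0, 0.0], [0.5, 0.5], [3.0, 3.0], [3.5, 3.0]])
y = np.array([0, 0, 1, 1])

linear = SVC(kernel="linear").fit(X, y)

# For the linear kernel, decision_function is exactly coef_ . x + intercept_.
manual = (X @ linear.coef_.T + linear.intercept_).ravel()
print(np.allclose(manual, linear.decision_function(X)))  # True
```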

This matches the definition of the decision function in sklearn, which for SVC is decision_function(x) = Σᵢ dual_coef_ᵢ · K(support_vectorᵢ, x) + intercept_.

The reason Linear.coef_ * X + linear.intercept_ can be used directly is that linear.coef_ is the product of linear.dual_coef_ (a vector of length n, where n is the number of support vectors) and linear.support_vectors_ (a two-dimensional array of shape [number of support vectors, number of features]); this inner product collapses linear.coef_ to a vector whose size equals the number of features.
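This identity can be checked in code directly (again on toy data invented for illustration):

```python
import numpy as np
from sklearn.svm import SVC

# Toy data invented for illustration.
X = np.array([[0.0, 0.0], [1.0, 0.5], [3.0, 3.0], [3.5, 2.5]])
y = np.array([0, 0, 1, 1])
linear = SVC(kernel="linear").fit(X, y)

# dual_coef_ has shape (1, n_SV); support_vectors_ has shape (n_SV, n_features).
# Their product collapses to one weight per feature, i.e. coef_.
w = linear.dual_coef_ @ linear.support_vectors_
print(np.allclose(w, linear.coef_))  # True
```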


For LIBSVM, you can refer to the following:

http://my.oschina.net/u/1461744/blog/209104

Recently I have been working with data that requires computing the distance from a point to the SVM classification hyperplane; the SVM implementation I use is LIBSVM.

Being a novice, I had read some material, but losses in translation between Chinese and English often led me astray, so my understanding of LIBSVM progressed in fits and starts. I took many wrong turns while working out the meaning of LIBSVM's various return values and how to use the resulting model file.

The first answer to this question is in the FAQ of LIBSVM itself:

Q:How do I get the distance between a point and the hyperplane?

The distance is |decision_value| / ‖w‖. We have ‖w‖² = wᵀw = αᵀQα = 2·(dual_obj + Σ αᵢ). Thus in svm.cpp, find where the dual objective value is calculated (i.e., the subroutine Solve()) and add a statement to print wᵀw.

We may not know what Q is, but at least we now have a formula:

distance = |decision_value| / ‖w‖ = |decision_value| / sqrt(2·(dual_obj + Σ αᵢ))

But what on earth is decision_value? And where exactly are dual_obj and Σ αᵢ? This beginner blogger wanted to howl in frustration...

Let's take a look at what can be obtained from LIBSVM's return values and result files.

First of all, when training the model (i.e., when calling the svm_train() function), the following values are printed at the end (I write Python programs and run them in a terminal):

#iter: number of iterations

nu: parameter of the selected SVM formulation

obj: the minimum value of the quadratic programming problem that the SVM problem is converted into and solved (yes, hidden this deep is the obj; look back at the distance formula above)

rho: the bias term of the decision function f(x) = wᵀx + b (strictly, b = −rho)

nSV: number of standard support vectors (0 < αᵢ < C)

nBSV: number of support vectors on the boundary (αᵢ = C)

Total nSV: total number of support vectors (for two-class problems this equals nSV; for multi-class problems it sums the nSV over the multiple pairwise interfaces)

These return values are explained here: http://blog.163.com/shuangchenyue_8/blog/static/399543662010328101618513/

Next, look at the model file generated by the training module, which includes the following information:

svm_type c_svc (type of SVM; the default value here)

kernel_type rbf (type of kernel function; the default value here)

gamma 0.0117647 (the parameter gamma)

nr_class 2 (number of classes; 2 in my case)

total_sv 1684 (total number of support vectors)

rho -0.956377

label 0 1 (labels of the two classes)

nr_sv 1338 346 (how many of the support vectors belong to each of the two classes)

SV (here come the real support vectors; let's see what they look like)

0.536449657223129 1:39 2:2 3:3 ...

0.3766245470441405 1:11 3:3 4:3 ...

Each line of this data is a number, then a space, then the vector.

What is the number at the front? It is the α we are looking for (strictly, LIBSVM stores the coefficient αᵢ·yᵢ). The vector after it needs little explanation: it is the support vector itself. What differs from the training data is its storage format, which is sparse: entries whose value is 0 are simply omitted. For example, here

1:11 3:3 4:3 ...

is actually 1:11 2:0 3:3 4:3 ...
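A small sketch of expanding this sparse format back into a dense vector (the helper name is mine, invented for illustration):

```python
def parse_sv_line(line, n_features):
    """Split one support-vector line from a LIBSVM model file into
    (coefficient, dense feature vector). Indices in the file are 1-based."""
    parts = line.split()
    coef = float(parts[0])        # the alpha_i * y_i coefficient
    dense = [0.0] * n_features
    for item in parts[1:]:
        idx, val = item.split(":")
        dense[int(idx) - 1] = float(val)
    return coef, dense

coef, vec = parse_sv_line("0.3766245470441405 1:11 3:3 4:3", 4)
print(coef)  # 0.3766245470441405
print(vec)   # [11.0, 0.0, 3.0, 3.0]
```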

So far our obj and α have appeared, and decision_value is within reach. At this point we can work out the equation of the classification hyperplane:

f(x) = wᵀx + b

where w = Σ αᵢ·yᵢ·xᵢ and the xᵢ are the support vectors (for the SVM derivation, this write-up is good: http://blog.csdn.net/v_july_v/article/details/7624837). So α is known, b is known (it is the rho above), and the hyperplane equation is obtained. By the point-to-plane distance formula d = |f(x)| / ‖w‖, plug in the point x under examination, and we are done.
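For a non-linear kernel, w lives in feature space and cannot be written out explicitly, but ‖w‖² = αᵀQα can still be computed from the dual coefficients and the kernel matrix. A sketch using scikit-learn's SVC (whose dual_coef_ already stores αᵢ·yᵢ), on toy data invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

# Toy two-class data invented for illustration.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.0],
              [3.0, 3.0], [3.5, 2.5], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)

# ||w||^2 = sum_ij (alpha_i y_i)(alpha_j y_j) K(x_i, x_j),
# evaluated over the support vectors only.
K = rbf_kernel(clf.support_vectors_, clf.support_vectors_, gamma=0.5)
w_norm = np.sqrt(clf.dual_coef_ @ K @ clf.dual_coef_.T).item()

# Distance from each point to the separating hyperplane in feature space.
dist = np.abs(clf.decision_function(X)) / w_norm
print(dist)
```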

But sometimes you want to be lazy and avoid computing |f(x)| by hand. What then?
