In part (ii) of this series, there is an important line of code in the SMO algorithm: computing the predicted class of the current sample. It reads as follows:
fXi = float(multiply(alphas, labelMat).T * (dataMatrix * dataMatrix[i, :].T)) + b  # predicted class of sample i
We know that the original prediction formula is expressed in terms of the decision-surface parameters w and b, so why does the code look different?
The original prediction formula, in terms of the decision-surface parameters w and b, is:

$$f(x) = w^\top x + b$$

where w can be expressed as a weighted combination of the training points:

$$w = \sum_{i=1}^{n} \alpha_i y_i x_i$$

The classification function can then be rewritten as:

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i \langle x_i, x \rangle + b$$
July's blog (see the reference below) explains this point clearly:
What is interesting about this form is that to predict a new point x, we only need to compute its inner product with the training data points (⟨·, ·⟩ denotes the vector inner product). This is the basic precondition for using kernels to generalize to non-linear problems.
This representation is exactly consistent with the code above.
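To make the correspondence concrete, here is a minimal NumPy sketch (the toy data, alpha values, and b are made up for illustration, not taken from the book's code) showing that the inner-product form computes the same value as w^T x + b:

import numpy as np

# Toy training set: four 2-D points with labels +1 / -1 (assumed for illustration)
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alphas = np.array([0.25, 0.0, 0.25, 0.0])  # stand-in for SMO output
b = -2.0                                   # stand-in for SMO output

x_new = np.array([2.5, 2.5])

# Form 1: f(x) = sum_i alpha_i * y_i * <x_i, x> + b, what the book's one-liner computes
f_inner = np.sum(alphas * y * (X @ x_new)) + b

# Form 2: f(x) = w^T x + b, with w = sum_i alpha_i * y_i * x_i
w = (alphas * y) @ X
f_w = w @ x_new + b

print(f_inner, f_w)  # identical up to floating-point rounding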
Here is another phenomenon we can analyze from this form: which points are the support vectors?
A: The points whose alpha is not equal to 0 are the support vectors.
This also shows what "support vector" means: in fact, all the non-support vectors correspond to coefficients alpha equal to zero, so the inner-product computation for a new point actually involves only the small number of support vectors rather than all of the training data.
Why does a non-support vector correspond to alpha equal to zero? Intuitively, such a point lies "behind" the margin: as we analyzed before, it has no effect on the hyperplane, and since the classification is decided entirely by the hyperplane, these irrelevant points simply do not participate in the computation of the classification function and therefore have no impact on it.
Note that if $x_i$ is a support vector, the term $y_i(w^\top x_i + b) - 1$ in the Lagrangian (the part highlighted in red in July's original figure) equals 0, because the functional margin of a support vector equals 1. For a non-support vector the functional margin is greater than 1, so this term is greater than 0; since $\alpha_i$ is constrained to be nonnegative, maximizing the Lagrangian over $\alpha$ forces $\alpha_i = 0$. This is the constraint that applies to the non-support-vector points.
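For concreteness, here is the standard hard-margin Lagrangian that this argument refers to (reconstructed from the usual derivation; the original post showed it as an image):

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \big( y_i(w^\top x_i + b) - 1 \big), \qquad \alpha_i \ge 0$$

When $y_i(w^\top x_i + b) - 1 > 0$, any positive $\alpha_i$ only decreases $L$ in the maximization over $\alpha$, so the maximum is attained at $\alpha_i = 0$.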
So, after we run the SMO algorithm, we can find the support vectors using this property: they are exactly the points whose alpha is not equal to 0.
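A minimal sketch of that check (the alpha values and data are stand-ins, and a small tolerance replaces an exact zero test since the alphas are floats):

import numpy as np

alphas = np.array([0.25, 0.0, 0.25, 0.0])  # stand-in for the alphas returned by SMO
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

sv_idx = np.nonzero(alphas > 1e-8)[0]  # alpha != 0 (up to tolerance) => support vector
print("support vector indices:", sv_idx)
print("support vectors:\n", X[sv_idx])
print("their labels:", y[sv_idx])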
References:
July's blog: http://blog.csdn.net/v_july_v/article/details/7624837
Peter Harrington, "Machine Learning in Action", Manning Publications