Linear support vector machines.
In this classification problem we want to choose the "fattest" separating line, and the fattest line is the one with the largest margin.
Our optimization goal is to maximize this margin, where the margin is the distance from the line to its nearest point. How is this point-to-line distance calculated?
To avoid confusion, w0 is no longer folded into the combined weight vector w; it gets a new name, b. Likewise, x0 = 1 no longer has to be stuffed into the x vector.
The result: the hypothesis is h(x) = sign(w^T x + b), and we want distance(x, b, w), the distance from a point x to the separating hyperplane, whose points x' satisfy w^T x' + b = 0.
Let x' and x'' be two points on the hyperplane, so that (x'' - x') is a vector lying in the plane. Since w^T (x'' - x') = (-b) - (-b) = 0, w is perpendicular to every vector in the plane; that is, w is the normal vector of the plane.
This lets us derive the formula for the distance:
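Projecting x - x' onto the unit normal w/‖w‖, and using w^T x' = -b for any point x' on the plane:

$$
\text{distance}(x, b, w) = \left|\frac{w^{\top}}{\|w\|}\,(x - x')\right| = \frac{|w^{\top}x + b|}{\|w\|}
$$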
The line also has to classify the binary labels y correctly; that is, for every point (x_n, y_n), w^T x_n + b must have the same sign as y_n. This same-sign property lets us remove the absolute value: |w^T x_n + b| = y_n(w^T x_n + b).
Combining this with the distance formula above, our optimization goal is:
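In symbols (each piece is explained below):

$$
\max_{b,w}\ \text{margin}(b,w) \quad \text{subject to } y_n(w^{\top}x_n + b) > 0 \text{ for every } n, \qquad \text{margin}(b,w) = \min_{n=1,\dots,N} \frac{1}{\|w\|}\, y_n(w^{\top}x_n + b)
$$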
margin(b, w) says the point-to-line distance is determined by b and w, and our goal is to maximize it, i.e., to find the "fattest" line among all choices of b and w;
the condition y_n(w^T x_n + b) > 0 for every n says the line must separate all the points (x_n, y_n) correctly;
and since (b, w) can be rescaled by any positive constant without changing the line, we may scale so that min_n y_n(w^T x_n + b) = 1, which simplifies margin(b, w).
The margin then becomes margin(b, w) = 1/‖w‖.
The constraint min_n y_n(w^T x_n + b) = 1 is then relaxed to y_n(w^T x_n + b) ≥ 1 for all n; it can be shown that this relaxation has no effect on the solution: if every y_n(w^T x_n + b) were strictly greater than 1 at the optimum, we could shrink (b, w) and get an even larger margin, a contradiction.
Our ultimate optimization goal is therefore: minimize (1/2) w^T w over (b, w), subject to y_n(w^T x_n + b) ≥ 1 for all n.
Maximizing a quantity is the same as minimizing its reciprocal, so maximizing 1/‖w‖ becomes minimizing ‖w‖; for convenience of calculation we square it and multiply by 1/2. This problem is called the standard (hard-margin SVM) problem.
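The full chain of transformations:

$$
\max_{b,w} \frac{1}{\|w\|} \;\Longleftrightarrow\; \min_{b,w} \|w\| \;\Longleftrightarrow\; \min_{b,w} \frac{1}{2}\, w^{\top} w \qquad \text{subject to } y_n(w^{\top}x_n + b) \ge 1 \text{ for all } n
$$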
First, let's look at a specific example:
After plotting:
The boxed points in the figure are the support vectors. They alone decide what the line looks like; none of the other points matter.
So how should the general problem be solved?
Use quadratic programming!
Additional knowledge: https://en.wikipedia.org/wiki/Quadratic_programming
Quadratic programming (QP) is a special type of mathematical optimization problem--specifically, the problem of optimizing (either minimizing or maximizing) a quadratic function of several variables subject to linear constraints on these variables.
Problem formulation:
The quadratic programming problem with n variables and m constraints can be formulated as follows. Given:
an n-dimensional real-valued vector c,
an n×n real symmetric matrix Q,
an m×n real matrix A, and
an m-dimensional real vector b,
the objective of quadratic programming is to find an n-dimensional vector x that will
minimize (1/2) x^T Q x + c^T x, subject to Ax ≤ b.
The notation Ax ≤ b means that every entry of the vector Ax is less than or equal to the corresponding entry of the vector b.
If we rewrite the problem we want to solve into this standard quadratic-programming form, a QP solver will give us the answer.
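As a concrete sketch of this mapping (the cvxopt solver, the helper name hard_margin_svm, and the toy data are my own assumptions, not from the original post): take the variable u = [b; w], set Q = diag(0, I_d) so that only w is penalized, and turn each constraint y_n(w^T x_n + b) ≥ 1 into a row of Gu ≤ h with G_n = -y_n [1, x_n^T] and h_n = -1.

```python
# A minimal sketch of solving the hard-margin SVM primal with a generic QP solver.
# Assumes the cvxopt package; hard_margin_svm and the toy data are illustrative.
import numpy as np
from cvxopt import matrix, solvers

solvers.options['show_progress'] = False  # quieter solver output

def hard_margin_svm(X, y):
    """X: (N, d) data matrix; y: (N,) labels in {-1, +1}. Returns (b, w)."""
    N, d = X.shape
    # Objective (1/2) u^T Q u with u = [b; w]: Q = diag(0, I_d) penalizes only w.
    Q = np.zeros((d + 1, d + 1))
    Q[1:, 1:] = np.eye(d)
    p = np.zeros(d + 1)
    # Constraints y_n (w^T x_n + b) >= 1, rewritten as G u <= h for the solver.
    G = -y[:, None] * np.hstack([np.ones((N, 1)), X])
    h = -np.ones(N)
    sol = solvers.qp(matrix(Q), matrix(p), matrix(G), matrix(h))
    u = np.array(sol['x']).ravel()
    return u[0], u[1:]  # b, w

# Toy usage on four linearly separable points.
X = np.array([[0., 0.], [2., 2.], [2., 0.], [3., 0.]])
y = np.array([-1., -1., 1., 1.])
b, w = hard_margin_svm(X, y)
print("b =", b, "w =", w)
# Support vectors sit exactly on the margin boundary: y_n (w^T x_n + b) = 1.
print("support vectors:", X[np.isclose(y * (X @ w + b), 1.0, atol=1e-4)])
```

For this toy data the optimum works out to w = (1, -1) and b = -1, so (0,0), (2,2) and (2,0) are support vectors while (3,0) is not.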
To summarize the solution process:
Hard-margin means every point must be strictly and correctly separated; if no such line exists, the problem has no solution.
The same formulation carries over if we raise the line to a hyperplane in a higher-dimensional space:
SVM also admits a regularization interpretation: regularization minimizes E_in subject to a budget w^T w ≤ C, while SVM minimizes w^T w subject to E_in = 0 (the separation constraints). The two swap the roles of objective and constraint.
On the other hand, by adjusting how "fat" or "thin" the margin is, SVM can indirectly adjust the effective VC dimension.
So when linearly inseparable data is raised by a feature transform φ to a higher-dimensional space where it becomes linearly separable, we can use SVM to keep the model complexity under control.
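To make the φ idea concrete, here is a hypothetical continuation of the QP sketch above: the quadratic transform phi2 is my own illustration, and hard_margin_svm is the helper defined earlier. Data that no line can separate in 2-D becomes separable by a hyperplane in the 5-D feature space, and the same QP solves it.

```python
import numpy as np

def phi2(X):
    """Quadratic feature transform: (x1, x2) -> (x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

# Negatives clustered at the origin, positives on a ring around them:
# not linearly separable in 2-D, but separable in the transformed space.
inner = np.array([[0.1, 0.0], [-0.1, 0.1], [0.0, -0.1]])
outer = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
X = np.vstack([inner, outer])
y = np.array([-1.0] * len(inner) + [1.0] * len(outer))

b, w = hard_margin_svm(phi2(X), y)   # the same hard-margin QP, run in z-space
print("fat-region width in z-space:", 2.0 / np.linalg.norm(w))
```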
To summarize:
Machine Learning Techniques (1) -- Linear Support Vector Machines