Machine learning-Support vector machine (SVM)


I learned SVM a long time ago, and it always felt like nothing more than "find the line in the middle", yet the details stayed fuzzy. When it came to actual programming everything turned to mush: I tried parameters at random, with no discipline at all. Now that I have re-learned it for this chapter, I want to pin down the parts I never understood, so I don't have to struggle with them again. The algorithm itself is really simple; if you haven't learned it, it's only out of laziness ~ I'm writing this paragraph just to remind myself.


The following images are taken from the machine learning course slides of Prof. Yang Yang at Shanghai Jiao Tong University; the course page is: http://bcmi.sjtu.edu.cn/~yangyang/ml/


A support vector machine is a classification method. It's only the name that makes it look complex.

The line in the middle: for classification we only need its coefficient vector w.

Support vectors: the points lying on the margin hyperplanes, which you can picture as the points sitting on the two lines on either side of the middle line.

Requirement: the middle line is equidistant from the two side lines. The number of support vectors (picture them as the points lying on the two side lines) is <= m + 1, where m is the dimension of the feature vector x.

Goal: find the parameters w and b of the middle line.
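
The post itself has no code, but here is a quick sketch (my own, not from the slides) of what "finding w and b" looks like in practice: scikit-learn's linear SVM exposes exactly these parameters after fitting. The toy data and the very large C (to imitate a hard margin) are assumptions for illustration only.

```python
# A minimal sketch: fit a linear SVM on toy 2-D data and read off w and b.
import numpy as np
from sklearn.svm import SVC

X = np.array([[ 1.0,  1.0], [ 2.0,  1.5], [ 1.5,  2.0],   # class +1
              [-1.0, -1.0], [-2.0, -1.5], [-1.5, -2.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6)   # very large C ~ (almost) hard margin
clf.fit(X, y)

w = clf.coef_[0]        # coefficient vector w of the middle line
b = clf.intercept_[0]   # offset b
print("w =", w, "b =", b)
print("support vectors:", clf.support_vectors_)  # the points on the two side lines
```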


Linear SVM


I stared at this figure for a long time and couldn't figure out where y is; judging from the formula, everything is expressed directly in terms of x. Is y = a? Is y = a - b?

In fact, y is not an axis here at all: it is the class label (0, 1, 2, ... or +1/-1 and the like). All of the axes are feature dimensions x1, x2, x3, .... That's it!

Once that concept is clear, the following is easy to understand:

The distance between the two lines: just subtract w·x2 + b = -a from w·x1 + b = a (x1 is a point on the upper line, x2 is a point on the lower line). As for why 2r is exactly the perpendicular distance, it's simple: subtracting the coordinates of two points gives the vector between them, whose norm is their distance, so you only need to pick two points whose connecting segment is perpendicular to the separating line. The actual derivation of the formula goes like this:

w · (x1 - x2) = 2a

||w|| ||x1 - x2|| cos<w, x1 - x2> = 2a

||x1 - x2|| cos<w, x1 - x2> = 2a / ||w||

The distance between the two lines is exactly the left-hand side of the last equation (the projection of x1 - x2 onto the direction of w).

As for proving the earlier claim: "on the line" here means x1 and x2 both lie on the line w·x + b = 0, so subtracting the two equations gives w · (x1 - x2) = 0, i.e. w is perpendicular to the line.
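
A tiny numeric sanity check of the derivation (my own addition; the values of w, b and a are made up): pick one point on each of the two lines, project their difference onto the direction of w, and compare with 2a/||w||.

```python
# Check: for the lines w·x + b = a and w·x + b = -a, the gap is 2a / ||w||.
import numpy as np

w = np.array([3.0, 4.0])   # assumed example values
b = 1.0
a = 2.0

x1 = np.array([0.0, ( a - b) / w[1]])   # a point on w·x + b =  a
x2 = np.array([0.0, (-a - b) / w[1]])   # a point on w·x + b = -a

d = x1 - x2
# projection of x1 - x2 onto the unit normal w/||w||, i.e. ||x1-x2|| cos<w, x1-x2>
perpendicular_distance = np.dot(d, w) / np.linalg.norm(w)
print(perpendicular_distance, 2 * a / np.linalg.norm(w))   # both print 0.8
```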


Explanation:

Ideally, all positive samples (y = 1) lie above the line w·x + b = a, and all negative samples (y = -1) lie below the line w·x + b = -a. Writing two separate inequalities is clumsy, so use y as a sign in front: the two line equations merge into one, (w·x + b) y = a, which says the points sit exactly on the lines. But we require the positive and negative samples to be on the outer sides of the two lines, so "=" becomes ">=".

The max says that merely satisfying the constraint below is not enough: among all (w, b) that satisfy it, we want the one for which the distance between the two lines is largest.

In short: find the w and b such that the constraint holds for every point j and the distance between the two lines is as large as possible.
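
Since the slide isn't reproduced here, this is my reconstruction of the optimization problem described above, with j indexing the training samples:

```latex
\max_{w,\,b}\; \frac{2a}{\|w\|}
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; a \quad \text{for all } j
```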


Carrying the a around is inconvenient, so divide both sides of the equations by a; that gives a new w and a new b. It doesn't matter, they are just symbols, so nothing really changes.

Since a has now become 1, maximizing 2a/||w|| = 2/||w|| is the same as minimizing the flipped quantity ||w||/2; for convenience this is usually simplified further to minimizing w·w = ||w||² (squaring changes nothing about where the minimum is).
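
Putting it together, the usual standard form of the linear hard-margin SVM (again my reconstruction, not the slide itself) is:

```latex
\min_{w,\,b}\; \frac{1}{2}\,\|w\|^{2}
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; 1 \quad \text{for all } j
```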


Nonlinear SVM

Everything seems to be going very well. However, real data is likely to contain some less-than-friendly points!


So, we need to tolerate these mistakes ~


C: the tradeoff parameter, which is really just a coefficient weighting the error term.

#mistake: the number of errors. Each misclassified point counts as 1 and each correct point counts as 0, and they are summed up (the formula then tries to make this total error as small as possible).
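
The slide's formula isn't shown here, but based on the description it should look roughly like this (my reconstruction): the margin term plus C times the number of constraint violations.

```latex
\min_{w,\,b}\; \frac{1}{2}\,\|w\|^{2} + C \cdot \#\text{mistakes},
\qquad
\#\text{mistakes} = \big|\{\, j : y_j\,(w \cdot x_j + b) < 1 \,\}\big|
```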

C and #mistake could be folded together into a single variable, but that would be harder to understand.

#mistake is computed from the data; the coefficient C is chosen by cross-validation. (Cross-validation: I don't know it yet, I'll talk about it later.)
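
"C is chosen by cross-validation" can be made concrete with a short sketch (my own example, not from the post): GridSearchCV tries several candidate values of C and keeps the one with the best cross-validated accuracy. The dataset and the candidate values are arbitrary.

```python
# A hedged sketch: picking C by cross-validation with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

param_grid = {"C": [0.01, 0.1, 1, 10, 100]}   # candidate tradeoff values
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5)  # 5-fold CV
search.fit(X, y)

print("best C:", search.best_params_["C"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```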

The formula above has a drawback: every point that is not on the correct outer side of its margin line is simply counted as one error (all-or-nothing, a 0/1 loss), without considering how large the deviation is.


The new formula replaces #mistake so that for each erring sample the actual amount of deviation is computed, and the formula then minimizes the sum of all these deviations.

The deviation amounts are computed from the data; the coefficient C is again chosen by cross-validation. (Cross-validation: I don't know it yet, I'll talk about it later.)
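
Written out (my reconstruction of what the slide presumably shows): each sample j gets its own deviation ξ_j, and the sum of the deviations is penalized.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\,\|w\|^{2} + C \sum_j \xi_j
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; 1 - \xi_j,\;\; \xi_j \ge 0 \quad \text{for all } j
```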


Normalized loss variables


Let's just go straight to the formulas.

The hinge loss deserves a bit more explanation: when a sample is within its required range (e.g. its margin is >= 1), it incurs no loss at all, i.e. the loss is exactly 0. That is probably why this loss function fits the character of the SVM so well ~
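
As a small illustration (my own, using the standard definition hinge = max(0, 1 - y·(w·x + b))): samples that satisfy the margin contribute exactly zero loss.

```python
# Hinge loss for a linear classifier: max(0, 1 - y * (w·x + b)).
# w, b and the sample points below are made-up example values.
import numpy as np

def hinge_loss(w, b, X, y):
    margins = y * (X @ w + b)            # y_j * (w·x_j + b) for every sample
    return np.maximum(0.0, 1.0 - margins)

w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0],    # margin  2.0 with y=+1 -> loss 0
              [0.5, 0.0],    # margin  0.5 with y=+1 -> loss 0.5
              [0.0, 1.0]])   # margin  1.0 with y=-1 -> loss 0
y = np.array([1, 1, -1])

print(hinge_loss(w, b, X, y))   # [0.  0.5 0. ]
```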


Multi-class classification problem


Method One:

As shown in the figure: each time, take one class out and merge all the other classes into one big class, turning it into a binary classification problem. Repeat this n times and you're done.

Cons: the separating line tends to be biased, because the single class has far less training data than the merged rest.
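
Method One is the classic one-vs-rest strategy. A minimal sketch with scikit-learn (my own example; the iris data is used only because it conveniently has 3 classes): OneVsRestClassifier trains one binary linear SVM per class, each separating that class from all the others.

```python
# One-vs-rest multi-class SVM: one binary linear SVM per class.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)        # 3-class toy dataset

clf = OneVsRestClassifier(LinearSVC(C=1.0, max_iter=10000))
clf.fit(X, y)

print(len(clf.estimators_))   # 3 binary classifiers, one per class
print(clf.predict(X[:5]))     # predicted class labels for the first samples
```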


Method Two: solve for all classes simultaneously

Explanation of the formula:

On the left: the sample x_j with class label y_j, multiplied by the coefficients of its own class, needs to satisfy w^(y_j) · x_j + b^(y_j) >= 1.

Referring back to Method One: if this same point is plugged into the formula of any other class y', it should satisfy w^(y') · x_j + b^(y') <= 1.

So the two formulas put together give: w^(y_j) · x_j + b^(y_j) >= 1 >= w^(y') · x_j + b^(y').

As for the extra 1 that has to be added, I don't really know why it's there; does it play the same role as the 1 in the previous formula? 0.0
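
For what it's worth, the form usually seen in multi-class SVM formulations is the one below (my addition, not the slide); presumably that extra 1 plays the same margin role as the 1 in the binary formula.

```latex
w^{(y_j)} \cdot x_j + b^{(y_j)} \;\ge\; w^{(y')} \cdot x_j + b^{(y')} + 1
\qquad \text{for all } y' \ne y_j
```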


With the slack (relaxation) variable and the loss term added, it becomes this:
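
The figure isn't reproduced here; assuming it follows the common slack-variable version of the simultaneous multi-class formulation, it should look roughly like this (my reconstruction, with one slack ξ_j per sample):

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2} \sum_{y} \|w^{(y)}\|^{2} + C \sum_j \xi_j
\qquad \text{s.t.}\qquad
w^{(y_j)} \cdot x_j + b^{(y_j)} \;\ge\; w^{(y')} \cdot x_j + b^{(y')} + 1 - \xi_j,
\;\; \xi_j \ge 0,\;\; \text{for all } j \text{ and } y' \ne y_j
```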




Time to go eat; solving the constrained optimization problem will be covered next time ~ haha


