Machine learning-Support vector machine (SVM)


I learned SVM a long time ago, and it always felt like nothing more than "find the line in the middle", yet the details stayed fuzzy. When it came to actual programming everything turned to mush: I tried parameters at random, with no discipline at all. Now that I have re-learned it for this chapter, I want to pin down the parts I never understood, so I don't have to struggle with them again. The algorithm itself is really simple; if you haven't learned it, it's only out of laziness ~ I'm writing this paragraph just to remind myself.


The following images are taken from the machine learning course slides of Prof. Yang Yang at Shanghai Jiao Tong University; the course page is: http://bcmi.sjtu.edu.cn/~yangyang/ml/


A support vector machine is a classification method. It's only the name that makes it look complex.

The line in the middle: for classification we only need its coefficient vector w.

Support vectors: the points lying on the margin hyperplanes, which you can picture as the points sitting on the two lines on either side of the middle line.

Requirement: the middle line is equidistant from the two side lines. The number of support vectors (picture them as the points lying on the two side lines) is <= m + 1, where m is the dimension of the feature vector x.

Goal: find the parameters w and b of the middle line.
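
The post itself has no code, but here is a quick sketch (my own, not from the slides) of what "finding w and b" looks like in practice: scikit-learn's linear SVM exposes exactly these parameters after fitting. The toy data and the very large C (to imitate a hard margin) are assumptions for illustration only.

```python
# A minimal sketch: fit a linear SVM on toy 2-D data and read off w and b.
import numpy as np
from sklearn.svm import SVC

X = np.array([[ 1.0,  1.0], [ 2.0,  1.5], [ 1.5,  2.0],   # class +1
              [-1.0, -1.0], [-2.0, -1.5], [-1.5, -2.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6)   # very large C ~ (almost) hard margin
clf.fit(X, y)

w = clf.coef_[0]        # coefficient vector w of the middle line
b = clf.intercept_[0]   # offset b
print("w =", w, "b =", b)
print("support vectors:", clf.support_vectors_)  # the points on the two side lines
```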


Linear SVM


I stared at this figure for a long time and couldn't figure out where y is; judging from the formula, everything is expressed directly in terms of x. Is y = a? Is y = a - b?

In fact, y is not an axis here at all: it is the class label (0, 1, 2, ... or +1/-1 and the like). All of the axes are feature dimensions x1, x2, x3, .... That's it!

Once that concept is clear, the following is easy to understand:

The distance between the two lines: just subtract w·x2 + b = -a from w·x1 + b = a (x1 is a point on the upper line, x2 is a point on the lower line). As for why 2r is exactly the perpendicular distance, it's simple: subtracting the coordinates of two points gives the vector between them, whose norm is their distance, so you only need to pick two points whose connecting segment is perpendicular to the separating line. The actual derivation of the formula goes like this:

w · (x1 - x2) = 2a

||w|| ||x1 - x2|| cos<w, x1 - x2> = 2a

||x1 - x2|| cos<w, x1 - x2> = 2a / ||w||

The distance between the two lines is exactly the left-hand side of the last equation (the projection of x1 - x2 onto the direction of w).

As for proving the earlier claim: "on the line" here means x1 and x2 both lie on the line w·x + b = 0, so subtracting the two equations gives w · (x1 - x2) = 0, i.e. w is perpendicular to the line.
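
A tiny numeric sanity check of the derivation (my own addition; the values of w, b and a are made up): pick one point on each of the two lines, project their difference onto the direction of w, and compare with 2a/||w||.

```python
# Check: for the lines w·x + b = a and w·x + b = -a, the gap is 2a / ||w||.
import numpy as np

w = np.array([3.0, 4.0])   # assumed example values
b = 1.0
a = 2.0

x1 = np.array([0.0, ( a - b) / w[1]])   # a point on w·x + b =  a
x2 = np.array([0.0, (-a - b) / w[1]])   # a point on w·x + b = -a

d = x1 - x2
# projection of x1 - x2 onto the unit normal w/||w||, i.e. ||x1-x2|| cos<w, x1-x2>
perpendicular_distance = np.dot(d, w) / np.linalg.norm(w)
print(perpendicular_distance, 2 * a / np.linalg.norm(w))   # both print 0.8
```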


Explanation:

Ideally, all positive samples (y = 1) lie above the line w·x + b = a, and all negative samples (y = -1) lie below the line w·x + b = -a. Writing two separate inequalities is clumsy, so use y as a sign in front: the two line equations merge into one, (w·x + b) y = a, which says the points sit exactly on the lines. But we require the positive and negative samples to be on the outer sides of the two lines, so "=" becomes ">=".

The max says that merely satisfying the constraint below is not enough: among all (w, b) that satisfy it, we want the one for which the distance between the two lines is largest.

In short: find the w and b such that the constraint holds for every point j and the distance between the two lines is as large as possible.
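
Since the slide isn't reproduced here, this is my reconstruction of the optimization problem described above, with j indexing the training samples:

```latex
\max_{w,\,b}\; \frac{2a}{\|w\|}
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; a \quad \text{for all } j
```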


Carrying the a around is inconvenient, so divide both sides of the equations by a; that gives a new w and a new b. It doesn't matter, they are just symbols, so nothing really changes.

Since a has now become 1, maximizing 2a/||w|| = 2/||w|| is the same as minimizing the flipped quantity ||w||/2; for convenience this is usually simplified further to minimizing w·w = ||w||² (squaring changes nothing about where the minimum is).
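
Putting it together, the usual standard form of the linear hard-margin SVM (again my reconstruction, not the slide itself) is:

```latex
\min_{w,\,b}\; \frac{1}{2}\,\|w\|^{2}
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; 1 \quad \text{for all } j
```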


Nonlinear SVM

Everything seems to be going very well. However, real data is likely to contain some less-than-friendly points!


So, we need to tolerate these mistakes ~


C: the tradeoff parameter, which is really just a coefficient weighting the error term.

#mistake: the number of errors. Each misclassified point counts as 1 and each correct point counts as 0, and they are summed up (the formula then tries to make this total error as small as possible).
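
The slide's formula isn't shown here, but based on the description it should look roughly like this (my reconstruction): the margin term plus C times the number of constraint violations.

```latex
\min_{w,\,b}\; \frac{1}{2}\,\|w\|^{2} + C \cdot \#\text{mistakes},
\qquad
\#\text{mistakes} = \big|\{\, j : y_j\,(w \cdot x_j + b) < 1 \,\}\big|
```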

C and #mistake could be folded together into a single variable, but that would be harder to understand.

#mistake is computed from the data; the coefficient C is chosen by cross-validation. (Cross-validation: I don't know it yet, I'll talk about it later.)
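
"C is chosen by cross-validation" can be made concrete with a short sketch (my own example, not from the post): GridSearchCV tries several candidate values of C and keeps the one with the best cross-validated accuracy. The dataset and the candidate values are arbitrary.

```python
# A hedged sketch: picking C by cross-validation with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

param_grid = {"C": [0.01, 0.1, 1, 10, 100]}   # candidate tradeoff values
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5)  # 5-fold CV
search.fit(X, y)

print("best C:", search.best_params_["C"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```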

The formula above has a drawback: every point that is not on the correct outer side of its margin line is simply counted as one error (all-or-nothing, a 0/1 loss), without considering how large the deviation is.


The new formula replaces #mistake so that for each erring sample the actual amount of deviation is computed, and the formula then minimizes the sum of all these deviations.

The deviation amounts are computed from the data; the coefficient C is again chosen by cross-validation. (Cross-validation: I don't know it yet, I'll talk about it later.)
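
Written out (my reconstruction of what the slide presumably shows): each sample j gets its own deviation ξ_j, and the sum of the deviations is penalized.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\,\|w\|^{2} + C \sum_j \xi_j
\qquad \text{s.t.}\qquad y_j\,(w \cdot x_j + b) \;\ge\; 1 - \xi_j,\;\; \xi_j \ge 0 \quad \text{for all } j
```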


Normalized loss variables


Let's just go straight to the formulas.

The hinge loss deserves a bit more explanation: when a sample is within its required range (e.g. its margin is >= 1), it incurs no loss at all, i.e. the loss is exactly 0. That is probably why this loss function fits the character of the SVM so well ~
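
As a small illustration (my own, using the standard definition hinge = max(0, 1 - y·(w·x + b))): samples that satisfy the margin contribute exactly zero loss.

```python
# Hinge loss for a linear classifier: max(0, 1 - y * (w·x + b)).
# w, b and the sample points below are made-up example values.
import numpy as np

def hinge_loss(w, b, X, y):
    margins = y * (X @ w + b)            # y_j * (w·x_j + b) for every sample
    return np.maximum(0.0, 1.0 - margins)

w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0],    # margin  2.0 with y=+1 -> loss 0
              [0.5, 0.0],    # margin  0.5 with y=+1 -> loss 0.5
              [0.0, 1.0]])   # margin  1.0 with y=-1 -> loss 0
y = np.array([1, 1, -1])

print(hinge_loss(w, b, X, y))   # [0.  0.5 0. ]
```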


Multi-class classification problem


Method One:

As shown in the figure: each time, take one class out and merge all the other classes into one big class, turning it into a binary classification problem. Repeat this n times and you're done.

Cons: the separating line tends to be biased, because the single class has far less training data than the merged rest.
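
Method One is the classic one-vs-rest strategy. A minimal sketch with scikit-learn (my own example; the iris data is used only because it conveniently has 3 classes): OneVsRestClassifier trains one binary linear SVM per class, each separating that class from all the others.

```python
# One-vs-rest multi-class SVM: one binary linear SVM per class.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)        # 3-class toy dataset

clf = OneVsRestClassifier(LinearSVC(C=1.0, max_iter=10000))
clf.fit(X, y)

print(len(clf.estimators_))   # 3 binary classifiers, one per class
print(clf.predict(X[:5]))     # predicted class labels for the first samples
```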


Method Two: solve for all classes simultaneously

Explanation of the formula:

On the left: the sample x_j with class label y_j, multiplied by the coefficients of its own class, needs to satisfy w^(y_j) · x_j + b^(y_j) >= 1.

Referring back to Method One: if this same point is plugged into the formula of any other class y', it should satisfy w^(y') · x_j + b^(y') <= 1.

So the two formulas put together give: w^(y_j) · x_j + b^(y_j) >= 1 >= w^(y') · x_j + b^(y').

As for the extra 1 that has to be added, I don't really know why it's there; does it play the same role as the 1 in the previous formula? 0.0
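
For what it's worth, the form usually seen in multi-class SVM formulations is the one below (my addition, not the slide); presumably that extra 1 plays the same margin role as the 1 in the binary formula.

```latex
w^{(y_j)} \cdot x_j + b^{(y_j)} \;\ge\; w^{(y')} \cdot x_j + b^{(y')} + 1
\qquad \text{for all } y' \ne y_j
```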


With the slack (relaxation) variable and the loss term added, it becomes this:
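
The figure isn't reproduced here; assuming it follows the common slack-variable version of the simultaneous multi-class formulation, it should look roughly like this (my reconstruction, with one slack ξ_j per sample):

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2} \sum_{y} \|w^{(y)}\|^{2} + C \sum_j \xi_j
\qquad \text{s.t.}\qquad
w^{(y_j)} \cdot x_j + b^{(y_j)} \;\ge\; w^{(y')} \cdot x_j + b^{(y')} + 1 - \xi_j,
\;\; \xi_j \ge 0,\;\; \text{for all } j \text{ and } y' \ne y_j
```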




Time to go eat; solving the constrained optimization problem will be covered next time ~ haha


