Support Vector Machine (SVM) is an original (non-combinatorial) classification algorithm with a clear geometric interpretation and high accuracy.
The core ideas of the SVM algorithm: (1) In the simple, linearly separable case, the problem is cast as a convex optimization problem, simplified with the Lagrange multiplier method, and then solved with existing algorithms. (2) In the complex, linearly non-separable case, the samples are mapped into a high-dimensional space by a kernel function so that they become linearly separable; the kernel function avoids the explicit high-dimensional computation.
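As a quick illustration of these two situations, here is a minimal sketch using scikit-learn (the library, the toy data, and the parameter values are assumptions for illustration, not part of the original text): a linear SVM for separable data and an RBF-kernel SVM for data that is not linearly separable.

```python
import numpy as np
from sklearn.svm import SVC

# Case (1): linearly separable toy data -> linear SVM (a convex QP under the hood)
X_lin = np.array([[2, 2], [3, 3], [-1, -1], [-2, -2]])
y_lin = np.array([1, 1, -1, -1])
clf_lin = SVC(kernel="linear", C=1e6).fit(X_lin, y_lin)  # very large C ~ hard margin

# Case (2): XOR-like data, not linearly separable -> a kernel maps it to a higher dimension
X_xor = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
y_xor = np.array([1, 1, -1, -1])
clf_rbf = SVC(kernel="rbf", gamma=1.0).fit(X_xor, y_xor)

print(clf_lin.predict([[1, 1]]), clf_rbf.predict([[0.9, 0.1]]))
```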
I. Basic concepts related to SVM
Separating hyperplane
If C and D are two disjoint convex sets, there exists a hyperplane P that separates C and D.
The distance between the two sets is defined as the shortest distance between an element of one set and an element of the other.
Construct the perpendicular bisector of the shortest segment between set C and set D; this gives a candidate separating hyperplane.
(Image excerpt from July algorithm)
But how should the "optimal" separating hyperplane of the two sets be defined? Find a few points on the "boundary" of each set, use them as a "basis" to compute the direction of the hyperplane, and use the average of these boundary points to determine the "intercept" of the hyperplane. These boundary points are called support vectors, and each point is represented as a vector.
(Image taken from the July algorithm)
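A minimal NumPy sketch of this construction (the point sets C and D below are hypothetical): find the closest pair of points between the two sets, then take the perpendicular bisector of the segment joining them as a candidate separating hyperplane.

```python
import numpy as np

# Hypothetical disjoint point sets C and D
C = np.array([[2.0, 2.0], [3.0, 1.0], [3.0, 3.0]])
D = np.array([[-1.0, -1.0], [-2.0, 0.0], [-1.5, -2.5]])

# Closest pair of points between C and D (realises the set-to-set distance)
dists = np.linalg.norm(C[:, None, :] - D[None, :, :], axis=2)
i, j = np.unravel_index(dists.argmin(), dists.shape)
c, d = C[i], D[j]

# Perpendicular bisector of segment c-d: w.x + b = 0 with w = c - d
w = c - d                   # normal direction of the hyperplane
b = -w @ (c + d) / 2.0      # hyperplane passes through the midpoint of c and d
print("w =", w, "b =", b)
print("sign on C:", np.sign(C @ w + b), "sign on D:", np.sign(D @ w + b))
```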
Input data
Suppose a training dataset is given on a feature space: $T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$,
where $x_i \in \mathbb{R}^n$ is the $i$-th instance (if $n > 1$, i.e. $x$ is multidimensional and has multiple attribute features, then $x_i$ is a vector);
$y_i \in \{+1, -1\}$ is the class label of $x_i$: when $y_i = +1$, $x_i$ is called a positive example; when $y_i = -1$, a negative example.
Linearly separable support vector machine
Given a linearly separable training dataset, the separating hyperplane obtained by maximizing the margin is $w^* \cdot \Phi(x) + b^* = 0$, and the corresponding decision function $f(x) = \operatorname{sign}\bigl(w^* \cdot \Phi(x) + b^*\bigr)$ is called a linearly separable support vector machine. Here $\Phi(x)$ is some feature-space transformation whose role is to map $x$ into a (higher-dimensional) space; the simplest choice is the identity $\Phi(x) = x$. In fact, solving for the separating hyperplane is equivalent to solving a convex quadratic programming problem.
Notation
Separating hyperplane: $w^T \Phi(x) + b = 0$
Training set: $\{(x_1, y_1), \dots, (x_N, y_N)\}$
Target values: $y_1, \dots, y_N \in \{-1, +1\}$
Classification of new data: $f(x) = \operatorname{sign}\bigl(w^T \Phi(x) + b\bigr)$
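Tying the notation together, a tiny sketch of classifying new data with the sign rule; the values of w and b below are placeholders, since in practice they come out of training.

```python
import numpy as np

w = np.array([1 / 3, 1 / 3])   # placeholder weights (would come from training)
b = -1 / 3                     # placeholder intercept

def classify(x):
    """Decision rule f(x) = sign(w . x + b); Phi is the identity here."""
    return int(np.sign(w @ x + b))

print(classify(np.array([2.0, 2.0])), classify(np.array([-1.0, -1.0])))  # +1, -1
```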
II. The derivation process of SVM
Deriving the objective function
From the problem setup, let $y(x) = w^T \Phi(x) + b$.
For every correctly classified training point we then have $y_i \cdot y(x_i) > 0$.
Scaling $w$ and $b$ by the same factor scales the value of $y_i \cdot y(x_i)$ by that factor as well, while the geometric distance $\dfrac{y_i \, (w^T \Phi(x_i) + b)}{\lVert w \rVert}$ stays unchanged.
Maximum-margin separating hyperplane
Objective function: $\arg\max_{w,b} \left\{ \dfrac{1}{\lVert w \rVert} \min_i \bigl[ y_i \, (w^T \Phi(x_i) + b) \bigr] \right\}$, i.e. the distance from the nearest point to the hyperplane should be as large as possible.
(Image taken from the July algorithm)
Functional margin and geometric margin
Separating hyperplane: $w^T \Phi(x) + b = 0$; functional margin: $\hat{\gamma}_i = y_i \, (w^T \Phi(x_i) + b)$; geometric margin: $\gamma_i = \hat{\gamma}_i / \lVert w \rVert$.
By scaling $w$ (and $b$) proportionally, the functional margins of both classes of points can always be made to satisfy $y_i \, (w^T \Phi(x_i) + b) \ge 1$; the explicit relation is written out after the figure below.
(Image taken from the July algorithm)
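To make the scaling argument explicit, the standard relation between the two margins can be written out as follows (a worked equation using the usual definitions, not reproduced from the original figure).

```latex
\hat{\gamma}_i = y_i\,\bigl(w^{T}\Phi(x_i) + b\bigr), \qquad
\gamma_i = \frac{y_i\,\bigl(w^{T}\Phi(x_i) + b\bigr)}{\lVert w \rVert}
         = \frac{\hat{\gamma}_i}{\lVert w \rVert}
% Scaling (w, b) -> (kw, kb) multiplies \hat{\gamma}_i by k but leaves \gamma_i
% unchanged, so the scale can be fixed by requiring \min_i \hat{\gamma}_i = 1.
```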
Establishing the objective function
1. By proportionally scaling $w$ and $b$, the functional margins of the two classes of points can always be made to satisfy $y_i \, (w^T \Phi(x_i) + b) \ge 1$.
2. Constraints: $y_i \, (w^T \Phi(x_i) + b) \ge 1, \quad i = 1, \dots, N$
3. Original objective function: $\arg\max_{w,b} \left\{ \dfrac{1}{\lVert w \rVert} \min_i \bigl[ y_i \, (w^T \Phi(x_i) + b) \bigr] \right\}$
4. New objective function (using the constraint above): $\arg\max_{w,b} \dfrac{1}{\lVert w \rVert}$
5. The objective function is transformed to: $\arg\min_{w,b} \dfrac{1}{2} \lVert w \rVert^2$, subject to $y_i \, (w^T \Phi(x_i) + b) \ge 1$
6. Lagrange multiplier method: $L(w, b, \alpha) = \dfrac{1}{2} \lVert w \rVert^2 - \sum_{i=1}^{N} \alpha_i \bigl[ y_i \, (w^T \Phi(x_i) + b) - 1 \bigr]$, with $\alpha_i \ge 0$
7. The original problem is a min-max problem: $\min_{w,b} \max_{\alpha \ge 0} L(w, b, \alpha)$
The dual of the original problem is the max-min problem: $\max_{\alpha \ge 0} \min_{w,b} L(w, b, \alpha)$
8. Take the partial derivatives of the Lagrange function in step 6 with respect to $w$ and $b$ and set them to 0: $\dfrac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i y_i \Phi(x_i)$, $\quad \dfrac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i y_i = 0$
9. Compute the Lagrange dual function by substituting back: $L = \sum_{i=1}^{N} \alpha_i - \dfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \, \Phi(x_i)^T \Phi(x_j)$
10. Continue by maximizing over $\alpha$: $\max_{\alpha} \sum_i \alpha_i - \dfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, \Phi(x_i)^T \Phi(x_j)$, subject to $\sum_i \alpha_i y_i = 0$, $\alpha_i \ge 0$
11. Rearranging the objective (adding a minus sign turns the maximization into a minimization): $\min_{\alpha} \dfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, \Phi(x_i)^T \Phi(x_j) - \sum_i \alpha_i$, subject to $\sum_i \alpha_i y_i = 0$ and $\alpha_i \ge 0$
12. Linearly separable support vector machine learning algorithm (a numerical sketch follows this list)
The solution is: $w^* = \sum_{i=1}^{N} \alpha_i^* y_i \Phi(x_i)$, and $b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i \bigl( \Phi(x_i) \cdot \Phi(x_j) \bigr)$ for any $j$ with $\alpha_j^* > 0$
13. Classification decision function: $f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i^* y_i \bigl( \Phi(x_i) \cdot \Phi(x) \bigr) + b^* \right)$
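The following is a rough numerical sketch of steps 6 through 13, solving the dual with a general-purpose SciPy optimizer on a hypothetical toy dataset; it is only an illustration of the formulas above, not the solver used in the original course.

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (hypothetical): two points per class
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)

# Matrix of the dual objective: Q_ij = y_i y_j (x_i . x_j)
Q = (y[:, None] * X) @ (y[:, None] * X).T

# Dual problem (step 11): min 1/2 a^T Q a - sum(a)  s.t.  sum(a_i y_i) = 0, a_i >= 0
obj = lambda a: 0.5 * a @ Q @ a - a.sum()
cons = ({"type": "eq", "fun": lambda a: a @ y},)
bnds = [(0, None)] * N
alpha = minimize(obj, np.zeros(N), bounds=bnds, constraints=cons).x

# Recover w* and b* (step 12) from a support vector (alpha_j > 0)
w = ((alpha * y)[:, None] * X).sum(axis=0)
j = np.argmax(alpha)
b = y[j] - (alpha * y) @ (X @ X[j])

# Classification decision function (step 13)
f = lambda x: np.sign(w @ x + b)
print("w* =", w, "b* =", b, "f([1,1]) =", f(np.array([1.0, 1.0])))
```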
III. Linearly non-separable SVM
1. If the data are not linearly separable, introduce a slack variable $\xi_i \ge 0$ for each point so that the functional margin plus the slack variable is at least 1.
The constraint becomes: $y_i \, (w \cdot x_i + b) \ge 1 - \xi_i$
Objective function: $\min_{w,b,\xi} \dfrac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i$ (the penalty term keeps the slack variables from becoming too large)
2. The convex optimization problem is now: $\min_{w,b,\xi} \dfrac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i$, subject to $y_i \, (w \cdot x_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$, $i = 1, \dots, N$
3. Lagrange function: $L(w, b, \xi, \alpha, \mu) = \dfrac{1}{2} \lVert w \rVert^2 + C \sum_i \xi_i - \sum_i \alpha_i \bigl[ y_i \, (w \cdot x_i + b) - 1 + \xi_i \bigr] - \sum_i \mu_i \xi_i$
4. Set the partial derivatives of $L$ with respect to $w$, $b$ and $\xi_i$ to zero and substitute the three resulting equations back into $L$, which gives: $\min_{w,b,\xi} L = \sum_i \alpha_i - \dfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, (x_i \cdot x_j)$
5. Rearranging gives the dual optimization problem: $\min_{\alpha} \dfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, (x_i \cdot x_j) - \sum_i \alpha_i$, subject to $\sum_i \alpha_i y_i = 0$, $0 \le \alpha_i \le C$
Obtain the optimal solution $\alpha^* = (\alpha_1^*, \dots, \alpha_N^*)^T$
6. Compute $w^* = \sum_i \alpha_i^* y_i x_i$ and, for some $j$ with $0 < \alpha_j^* < C$, $b^* = y_j - \sum_i \alpha_i^* y_i \, (x_i \cdot x_j)$
In practice, $b$ is usually computed from all support vectors and the values are averaged to give $b^*$ (a brief sketch follows this list)
7. Obtain the separating hyperplane: $w^* \cdot x + b^* = 0$
8. The classification decision function is: $f(x) = \operatorname{sign}(w^* \cdot x + b^*)$
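A brief soft-margin sketch using scikit-learn's SVC on a hypothetical overlapping dataset; the penalty parameter C = 1.0 is an arbitrary illustrative choice. It recovers w* from the dual coefficients exactly as in step 6.

```python
import numpy as np
from sklearn.svm import SVC

# Overlapping (not linearly separable) toy data
X = np.array([[2, 2], [3, 3], [0.5, 0.5], [-1, -1], [-2, -2], [0.6, 0.6]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)  # C bounds the dual variables: 0 <= alpha_i <= C

# w* = sum_i alpha_i* y_i x_i  (dual_coef_ stores alpha_i* y_i for the support vectors)
w = clf.dual_coef_ @ clf.support_vectors_
print("w* =", w.ravel(), "b* =", clf.intercept_)  # intercept_ plays the role of b*
print("decision:", np.sign(X @ w.ravel() + clf.intercept_))
```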
Kernel function: a kernel function can be used to map the original input space into a new feature space, so that samples which are not linearly separable in the original space become separable in the kernel space.
Polynomial kernel: $K(x, z) = (x \cdot z + 1)^p$
Gaussian kernel (RBF): $K(x, z) = \exp\left( -\dfrac{\lVert x - z \rVert^2}{2 \sigma^2} \right)$
String kernel
In practical applications, one usually relies on prior domain knowledge or cross-validation to select an effective kernel function. Without further prior information, the Gaussian kernel function is used.
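As a small check on the Gaussian kernel formula above, the following sketch computes the RBF kernel matrix explicitly and compares it with scikit-learn's implementation (the data points and the bandwidth sigma are arbitrary illustrative values).

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
sigma = 1.0
gamma = 1.0 / (2.0 * sigma ** 2)

# K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), computed explicitly
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K_manual = np.exp(-gamma * sq_dists)

K_sklearn = rbf_kernel(X, X, gamma=gamma)
print(np.allclose(K_manual, K_sklearn))  # True: both give the same kernel matrix
```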
Kernel function Mapping:
(Image taken from the July algorithm)
(Image taken from the July algorithm)
Gaussian kernel
(Image taken from the July algorithm)
The thick line is the separating "hyperplane", the other lines are contours of y(x), and the green dots are the support vectors.
The Gaussian kernel corresponds to an infinite-dimensional feature space, because its expansion contains polynomial features of every order, as shown below.
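The standard Taylor-series argument, written out here for completeness rather than taken from the original figure:

```latex
K(x, z) = \exp\!\left(-\frac{\lVert x - z \rVert^{2}}{2\sigma^{2}}\right)
        = \exp\!\left(-\frac{\lVert x \rVert^{2}}{2\sigma^{2}}\right)
          \exp\!\left(-\frac{\lVert z \rVert^{2}}{2\sigma^{2}}\right)
          \exp\!\left(\frac{x^{T} z}{\sigma^{2}}\right),
\qquad
\exp\!\left(\frac{x^{T} z}{\sigma^{2}}\right)
        = \sum_{n=0}^{\infty} \frac{1}{n!}\left(\frac{x^{T} z}{\sigma^{2}}\right)^{n}
```

Every power of $x^T z$ appears in the sum, so the implicit feature map has components of every polynomial order and is therefore infinite-dimensional.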
Note: comparing SVM with logistic regression: (1) classical SVM directly outputs the class and gives no posterior probability; (2) logistic regression gives the posterior probability of each class; (3) the two can also be compared through the similarities and differences of their objective functions.
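For item (3), a common textbook way to line up the two objective functions (this regularized formulation is an assumption of the usual forms, not quoted from the original post): both minimize a margin-based loss plus a regularizer, the SVM with the hinge loss and logistic regression with the log loss.

```latex
\text{SVM:}\quad
\min_{w,b}\; \frac{1}{2}\lVert w \rVert^{2}
  + C \sum_{i=1}^{N} \max\bigl(0,\; 1 - y_i\,(w^{T} x_i + b)\bigr)
\qquad
\text{Logistic regression:}\quad
\min_{w,b}\; \lambda \lVert w \rVert^{2}
  + \sum_{i=1}^{N} \log\bigl(1 + e^{-y_i\,(w^{T} x_i + b)}\bigr)
```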