Machine Learning: Support Vector Machines (SVM) (update ...)

Source: Internet
Author: User
Tags: svm

Support Vector Machine

SVM (support vector machine) is a binary classification model. Its basic form is a linear classifier defined by the maximum margin in the feature space, which distinguishes it from the perceptron; equipped with the kernel trick, the support vector machine becomes an essentially nonlinear classifier. The learning strategy of the support vector machine is margin maximization, which can be formalized as solving a convex quadratic programming problem.

SVM learning methods form a progression from simple to complex models: the linearly separable support vector machine (the linearly separable case), the linear support vector machine, and the nonlinear support vector machine. The simple model is the foundation of the complex model and a special case of it. When the training data are linearly separable, a linear classifier is learned by hard margin maximization; this is the linearly separable support vector machine, also called the hard margin support vector machine. When the training data are approximately linearly separable, a linear classifier is learned by soft margin maximization; this is the linear support vector machine, also called the soft margin support vector machine. When the training data are not linearly separable, a nonlinear support vector machine is learned by combining the kernel trick with soft margin maximization. By using a kernel function, one learns a nonlinear support vector machine, which is equivalent to implicitly learning a linear support vector machine in a high-dimensional feature space. The kernel method is a machine learning approach more general than support vector machines.
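To make the three variants concrete, here is a minimal sketch using scikit-learn (the library, the toy dataset, and the parameter values are illustrative assumptions, not part of the original text): a hard margin is approximated by a very large penalty C, a soft margin by a moderate C, and the nonlinear case by an RBF kernel.

    # Sketch: the three SVM variants via scikit-learn (illustrative, not from the source)
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # linearly separable toy labels

    hard_margin = SVC(kernel="linear", C=1e10)  # very large C approximates a hard margin
    soft_margin = SVC(kernel="linear", C=1.0)   # moderate C tolerates margin violations
    nonlinear = SVC(kernel="rbf", C=1.0)        # kernel trick for the non-separable case

    for model in (hard_margin, soft_margin, nonlinear):
        model.fit(X, y)
        print(model.kernel, model.score(X, y))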

Linearly Separable Support Vector Machines and Hard Margin Maximization

Support vector machine learning is carried out in the feature space. Assume we are given a training data set on the feature space, $T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^n$ is the $i$-th feature vector, also called an instance, and $y_i \in \{+1, -1\}$ is its class label. When $y_i = +1$, $x_i$ is called a positive example; when $y_i = -1$, a negative example. The pair $(x_i, y_i)$ is called a sample point. Assume the training data set is linearly separable.

The goal of learning is to find a separating hyperplane in the feature space that divides the instances into different classes. The separating hyperplane corresponds to the equation $w \cdot x + b = 0$; it is determined by the normal vector $w$ and the intercept $b$, and can be denoted $(w, b)$.

In general, when the training data set is linearly separable, there are infinitely many separating hyperplanes that separate the two classes of data correctly. The linearly separable support vector machine finds the separating hyperplane by margin maximization, and in that case the solution is unique.
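As a tiny numeric sketch (the weight vector, intercept, and points below are made-up values, not from the source), classifying with a fixed hyperplane $w \cdot x + b = 0$ amounts to evaluating $\operatorname{sign}(w \cdot x + b)$:

    # Sketch: classifying points with a fixed hyperplane (hypothetical numbers)
    import numpy as np

    w = np.array([1.0, -1.0])  # assumed normal vector
    b = 0.0                    # assumed intercept

    points = np.array([[2.0, 0.5], [-1.0, 1.5]])
    print(np.sign(points @ w + b))  # [ 1. -1.]: one point on each side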

  Definition (linearly separable support vector machine): Given a linearly separable training data set, the separating hyperplane obtained by margin maximization, or equivalently by solving the corresponding convex quadratic programming problem, is

$$w^* \cdot x + b^* = 0$$

and this hyperplane, together with the corresponding classification decision function $f(x) = \operatorname{sign}(w^* \cdot x + b^*)$, is called the linearly separable support vector machine.

  Algorithm (learning algorithm for the linearly separable support vector machine: the maximum margin method)

Input: a linearly separable training data set $T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^n$, $y_i \in \{+1, -1\}$, $i = 1, 2, \dots, N$;

Output: the maximum margin separating hyperplane and the classification decision function.

(1) Construct and solve the constrained optimization problem:

$$\min_{w,b} \quad \frac{1}{2}\|w\|^2$$

$$\text{s.t.} \quad y_i(w \cdot x_i + b) - 1 \ge 0, \quad i = 1, 2, \dots, N$$

Obtain the optimal solution $w^*, b^*$.

(2) From this, obtain the separating hyperplane:

$$w^* \cdot x + b^* = 0$$

and the classification decision function:

$$f(x) = \operatorname{sign}(w^* \cdot x + b^*)$$
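A minimal sketch of this primal problem with the cvxpy modeling library (the library choice and the three toy points are assumptions for illustration):

    # Sketch: hard margin primal QP with cvxpy (toy data)
    import cvxpy as cp
    import numpy as np

    X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])  # toy instances
    y = np.array([1.0, 1.0, -1.0])                      # labels in {+1, -1}

    w = cp.Variable(2)
    b = cp.Variable()

    objective = cp.Minimize(0.5 * cp.sum_squares(w))  # (1/2) ||w||^2
    constraints = [cp.multiply(y, X @ w + b) >= 1]    # y_i (w . x_i + b) >= 1
    cp.Problem(objective, constraints).solve()

    print(w.value, b.value)  # for this toy set, roughly w* = (0.5, 0.5), b* = -2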

In the linearly separable case, the instances of the training data set that lie closest to the separating hyperplane are called support vectors. A support vector is a point at which the inequality constraint holds with equality, i.e.

$$y_i(w \cdot x_i + b) - 1 = 0$$

For a positive example point ($y_i = +1$), the support vector lies on the hyperplane $H_1: w \cdot x + b = 1$;

for a negative example point ($y_i = -1$), the support vector lies on the hyperplane $H_2: w \cdot x + b = -1$.

$H_1$ and $H_2$ are called the margin boundaries; the distance between them, $2/\|w\|$, is the margin.

Only the support vectors play a role in determining the separating hyperplane; the other instance points do not.
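Continuing the toy example from the primal sketch above (the solution values $w^* = (0.5, 0.5)$, $b^* = -2$ are what that QP yields for it), the support vectors are exactly the points where the constraint is active, up to an assumed numerical tolerance:

    # Sketch: pick out support vectors as points with an active constraint
    import numpy as np

    X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0])
    w, b = np.array([0.5, 0.5]), -2.0  # optimal solution of the toy primal QP

    margins = y * (X @ w + b)                      # y_i (w . x_i + b)
    print(X[np.isclose(margins, 1.0, atol=1e-6)])  # points with y_i(w.x_i+b) - 1 = 0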

So where do the formulas in the algorithm above come from, and how are $w^*$ and $b^*$ actually obtained?

To solve the optimization problem of the linearly separable support vector machine, we treat it as the primal optimization problem and apply Lagrange duality, obtaining the optimal solution of the primal problem by solving the dual problem. This is the dual algorithm of the linearly separable support vector machine. The approach has two advantages: first, the dual problem is often easier to solve; second, it naturally introduces the kernel function, which generalizes the method to nonlinear classification problems.

First, construct the Lagrange function. For this purpose, introduce a Lagrange multiplier $\alpha_i \ge 0$, $i = 1, 2, \dots, N$ for each inequality constraint, and define the Lagrangian:

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (w \cdot x_i + b) + \sum_{i=1}^{N} \alpha_i$$

where $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_N)^T$ is the Lagrange multiplier vector.

According to Lagrange duality, the dual of the primal problem is the maximin problem:

$$\max_{\alpha} \min_{w, b} L(w, b, \alpha)$$

Therefore, to obtain the solution of the dual problem, we first minimize $L(w, b, \alpha)$ with respect to $w$ and $b$, and then maximize the result with respect to $\alpha$.

  (1) Find $\min_{w,b} L(w, b, \alpha)$.

Take the partial derivatives of the Lagrangian with respect to $w$ and $b$ and set them equal to zero:

$$\nabla_w L(w, b, \alpha) = w - \sum_{i=1}^{N} \alpha_i y_i x_i = 0$$

$$\nabla_b L(w, b, \alpha) = -\sum_{i=1}^{N} \alpha_i y_i = 0$$

which gives

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i$$

$$\sum_{i=1}^{N} \alpha_i y_i = 0$$

Substituting these two results into the Lagrange function gives

$$\min_{w,b} L(w, b, \alpha) = \frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$

that is,

$$\min_{w,b} L(w, b, \alpha) = -\frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$

(2) Maximize $\min_{w,b} L(w, b, \alpha)$ with respect to $\alpha$; this is the dual problem:

$$\max_{\alpha} \quad -\frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$

$$\text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0$$

$$\alpha_i \ge 0, \quad i = 1, 2, \dots, N$$

Converting the objective above from maximization to minimization yields the following equivalent dual optimization problem:

$$\min_{\alpha} \quad \frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i$$

$$\text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0$$

$$\alpha_i \ge 0, \quad i = 1, 2, \dots, N$$

Suppose $\alpha^* = (\alpha_1^*, \alpha_2^*, \dots, \alpha_N^*)^T$ is the solution of this dual optimization problem. Then there exists a subscript $j$ with $\alpha_j^* > 0$, and the solution $w^*, b^*$ of the primal optimization problem can be obtained by:

$$w^* = \sum_{i=1}^{N} \alpha_i^* y_i x_i$$

$$b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x_j)$$
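A hedged sketch of this dual problem, again with cvxpy and the same toy data (both are assumptions for illustration): solve for $\alpha^*$, then recover $w^*$ and $b^*$ with the two formulas above. The quadratic term is written as $\frac{1}{2}\|\sum_i \alpha_i y_i x_i\|^2$, which equals the double sum.

    # Sketch: hard margin dual QP with cvxpy, then recovery of w* and b* (toy data)
    import cvxpy as cp
    import numpy as np

    X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0])

    alpha = cp.Variable(len(y))
    v = X.T @ cp.multiply(alpha, y)  # v = sum_i alpha_i y_i x_i
    objective = cp.Minimize(0.5 * cp.sum_squares(v) - cp.sum(alpha))
    constraints = [alpha >= 0,                        # alpha_i >= 0
                   cp.sum(cp.multiply(alpha, y)) == 0]  # sum_i alpha_i y_i = 0
    cp.Problem(objective, constraints).solve()

    a = alpha.value
    w = (a * y) @ X                  # w* = sum_i alpha_i* y_i x_i
    j = int(np.argmax(a))            # a subscript j with alpha_j* > 0
    b = y[j] - (a * y) @ (X @ X[j])  # b* = y_j - sum_i alpha_i* y_i (x_i . x_j)
    print(a, w, b)                   # roughly a = (0.25, 0, 0.25), w = (0.5, 0.5), b = -2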
