Support Vector Machine (SVM) Algorithm Principles, Note 2

Source: Internet
Author: User
Tags: svm

The last blog introduced the principles of the SVM algorithm when the sample set is linearly separable. Next we consider the case where no hyperplane can classify the samples correctly, such as the XOR problem.
For such problems, the sample space can be mapped to a higher-dimensional space in which the mapped samples become linearly separable. For example, the three labeled points {(0, +1), (1, -1), (2, +1)} on a line cannot be separated by a single threshold, but after mapping to the two-dimensional plane they become {(0, 0, +1), (1, 1, -1), (2, 0, +1)}, which are linearly separable, and so on.
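As a minimal sketch (the mapping φ is a hand-picked illustration, not taken from the original text), φ(x) = (x, x·(2 − x)) sends the three 1-D points to exactly the 2-D points listed above, where a straight line separates the classes:

```python
# A hand-picked illustrative mapping: phi(x) = (x, x*(2 - x)) sends the 1-D
# points to the 2-D points listed in the text, where a line separates them.

def phi(x):
    # 0 -> (0, 0), 1 -> (1, 1), 2 -> (2, 0)
    return (x, x * (2 - x))

samples = [(0, +1), (1, -1), (2, +1)]

# Hyperplane f(z) = w . z + b with w = (0, -1), b = 0.5:
# f > 0 on the +1 points, f < 0 on the -1 point.
w, b = (0.0, -1.0), 0.5

for x, y in samples:
    z = phi(x)
    f = w[0] * z[0] + w[1] * z[1] + b
    assert (f > 0) == (y > 0)
```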
Let φ(x) denote the feature vector obtained by mapping x; the dividing hyperplane in the feature space is then defined as f(x) = w^T φ(x) + b. The model to be solved becomes:

max_α  Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j [φ(x_i)]^T φ(x_j)
s.t.   Σ_i α_i y_i = 0,  α_i ≥ 0,  i = 1, ..., N
This formula requires computing [φ(x_i)]^T φ(x_j), the inner product of the samples after mapping into the feature space. Because the feature space may be of very high, even infinite, dimension, this inner product is often difficult to compute directly, so we introduce a kernel function k(x_i, x_j) = [φ(x_i)]^T φ(x_j) to express it.
Our model then becomes:

max_α  Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j k(x_i, x_j)
s.t.   Σ_i α_i y_i = 0,  α_i ≥ 0,  i = 1, ..., N
With the kernel function, we need neither to define the mapping φ(x) from the sample space to the feature space explicitly nor to compute inner products in the feature space, which greatly reduces the amount of computation. So what conditions must a kernel function satisfy?
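To see why the kernel saves work, here is a small sketch (kernel and mapping chosen for illustration): the degree-2 polynomial kernel k(x, z) = (x^T z)^2 on R^2 computes exactly the inner product of an explicit 3-D feature mapping, without ever constructing that mapping.

```python
import math

# k(x, z) = (x . z)^2 in R^2 equals the inner product of the explicit 3-D
# mapping phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2) -- the kernel gives the same
# number without building phi. (Illustrative choice of kernel and mapping.)

def k(x, z):
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 0.5)
assert abs(k(x, z) - dot(phi(x), phi(z))) < 1e-9  # both equal 16.0
```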

Theorem: Let k(·, ·) be a symmetric function defined on the input space χ × χ. Then k is a valid kernel function if and only if, for any set of samples (x_1, x_2, ..., x_N), the kernel (Gram) matrix K, with K_ij = k(x_i, x_j), is always positive semidefinite.
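The positive-semidefiniteness requirement can be spot-checked numerically. A minimal sketch (a necessary-condition check only, on illustrative sample points): the Gram matrix of the Gaussian kernel should be symmetric and satisfy v^T K v ≥ 0 for every vector v.

```python
import math
import random

# Spot check on illustrative points: the Gram matrix K_ij = k(x_i, x_j) of
# the Gaussian kernel is symmetric and has v^T K v >= 0 for random v
# (a necessary condition for positive semidefiniteness).

def gauss(x, z, sigma=1.0):
    return math.exp(-((x - z) ** 2) / (2 * sigma ** 2))

xs = [0.0, 0.7, 1.5, 3.0]
n = len(xs)
K = [[gauss(a, b) for b in xs] for a in xs]

# symmetry
assert all(abs(K[i][j] - K[j][i]) < 1e-12 for i in range(n) for j in range(n))

# v^T K v >= 0 for many random vectors v
random.seed(0)
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in range(n)]
    quad = sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))
    assert quad >= -1e-9
```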

So for this kind of problem, the choice of kernel function is crucial, because the kernel function implicitly defines the feature space: a poorly chosen kernel may map the samples into a feature space in which they are still not well separated. The commonly used kernel functions are as follows:

Linear kernel: k(x_i, x_j) = x_i^T x_j
Polynomial kernel: k(x_i, x_j) = (x_i^T x_j)^d, d ≥ 1
Gaussian (RBF) kernel: k(x_i, x_j) = exp(−‖x_i − x_j‖^2 / (2σ^2)), σ > 0
Laplacian kernel: k(x_i, x_j) = exp(−‖x_i − x_j‖ / σ), σ > 0
Sigmoid kernel: k(x_i, x_j) = tanh(β x_i^T x_j + θ), β > 0, θ < 0
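These common kernels can be sketched directly (parameter names and defaults here are illustrative); each takes two equal-length sequences:

```python
import math

# Minimal sketches of the commonly used kernels; defaults are illustrative.

def linear(x, z):
    return sum(a * b for a, b in zip(x, z))

def poly(x, z, d=2):
    # polynomial kernel of degree d
    return linear(x, z) ** d

def gaussian(x, z, sigma=1.0):
    # RBF kernel: exp(-||x - z||^2 / (2 sigma^2))
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq / (2 * sigma ** 2))

def laplacian(x, z, sigma=1.0):
    # exp(-||x - z|| / sigma)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))
    return math.exp(-dist / sigma)

def sigmoid(x, z, beta=1.0, theta=-1.0):
    # tanh(beta * x.z + theta)
    return math.tanh(beta * linear(x, z) + theta)
```

Any of these can be plugged into the dual problem above in place of k(x_i, x_j).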
At this point we have introduced two kinds of support vector machine, linear and nonlinear, corresponding to training samples that are linearly separable in the sample space or in the feature space, respectively. How, then, do we handle situations where a small number of samples cannot be separated correctly? We introduce the concept of the soft margin, which allows the support vector machine to make errors on some samples, transforming the constraints of the original model into:

y_i (w^T x_i + b) ≥ 1 − ε_i,  ε_i ≥ 0,  i = 1, ..., N
where ε_i is called a slack variable, and each slack variable incurs a cost. The final model becomes the following, where C > 0 is called the penalty parameter: a larger C penalizes misclassified sample points more heavily, and vice versa. Minimizing the objective function therefore has two aims: (1) the first term makes the margin as large as possible; (2) the second term makes the number of misclassified points as small as possible. This is called soft-margin maximization:

min_{w, b, ε}  (1/2)‖w‖^2 + C Σ_i ε_i
s.t.  y_i (w^T x_i + b) ≥ 1 − ε_i,  ε_i ≥ 0,  i = 1, ..., N
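The objective can be evaluated directly by using the smallest feasible slack for each sample, ε_i = max(0, 1 − y_i (w^T x_i + b)). A minimal sketch with hypothetical data:

```python
# Evaluating the soft-margin objective (1/2)||w||^2 + C * sum_i eps_i, where
# eps_i = max(0, 1 - y_i * (w . x_i + b)) is the smallest feasible slack.
# The sample data below are hypothetical, for illustration only.

def objective(w, b, C, samples):
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack = [max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
             for x, y in samples]
    return margin_term + C * sum(slack)

samples = [((2.0, 0.0), +1),    # well on its side: zero slack
           ((0.5, 0.0), +1),    # inside the margin: slack 0.5
           ((-1.0, 0.0), -1)]   # exactly on the margin boundary: zero slack
print(objective((1.0, 0.0), 0.0, 1.0, samples))
```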
Its dual problem can be expressed as:

max_α  Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j x_i^T x_j
s.t.   Σ_i α_i y_i = 0,  0 ≤ α_i ≤ C,  i = 1, ..., N
and the solution must satisfy the following KKT conditions (with μ_i the multiplier of the constraint ε_i ≥ 0):

α_i ≥ 0,  μ_i ≥ 0,
y_i f(x_i) − 1 + ε_i ≥ 0,
α_i (y_i f(x_i) − 1 + ε_i) = 0,
ε_i ≥ 0,  μ_i ε_i = 0
That is, for any sample, either α_i = 0 or y_i f(x_i) = 1 − ε_i:
1. If α_i = 0, the sample has no effect on the final dividing hyperplane.
2. If 0 < α_i < C, then y_i f(x_i) = 1 − ε_i and the sample is a support vector. Moreover, since α_i < C we have μ_i > 0 and hence ε_i = 0, so the sample lies exactly on the maximum-margin boundary.
3. If α_i = C, then μ_i = 0; if ε_i ≤ 1, the sample falls inside the maximum margin, and if ε_i > 1, the sample is misclassified.
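The three cases can be summarized in a small helper. A sketch only: the α values in the checks are hypothetical; in practice they come from solving the dual problem.

```python
# Classifying a sample into the three KKT cases, given its multiplier alpha_i,
# the penalty parameter C, its label y_i, and the decision value f(x_i).
# The alpha values used below are hypothetical, for illustration.

def kkt_case(alpha, C, y, fx, tol=1e-9):
    if alpha < tol:                      # case 1: alpha_i = 0
        return "no effect"
    if alpha < C - tol:                  # case 2: 0 < alpha_i < C, eps_i = 0
        return "on margin boundary"
    eps = max(0.0, 1.0 - y * fx)         # case 3: alpha_i = C, mu_i = 0
    return "inside margin" if eps <= 1.0 else "misclassified"

assert kkt_case(0.0, 1.0, +1, 2.0) == "no effect"
assert kkt_case(0.5, 1.0, +1, 1.0) == "on margin boundary"
assert kkt_case(1.0, 1.0, +1, 0.5) == "inside margin"
assert kkt_case(1.0, 1.0, +1, -0.5) == "misclassified"
```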
