Machine Learning Summary: SVM

Last Update:2018-12-05 Source: Internet

Author: User

Tags svm

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

The first contact with SVM was still four years ago. At that time, it was used for handwritten digital recognition. Based on some books and literature, MATLAB was used to extract the PCA + SVMCode, The recognition rate is normal, 90 is not on, sorry to say hello to people. Most importantly, when I attended an interview, I was asked that Shenma is a support vector and I couldn't answer it. After being a graduate student, I learned this classic story repeatedly in various classes related to pattern recognition and machine learning.AlgorithmEvery time, I have a new experience. Take this opportunity to make a summary.

SVM is a linear classifier. It also targets simple supervised learning problems: Given m samples (X (I), y (I), y (I) = +/-1, determine a linear classification surface. This problem can be solved in multiple ways: sensor, Fisher linear discriminant analysis, and logistic regression. The implementation and principles of different algorithms are different: the sensor algorithm uses an iterative method to sequentially process samples and adjust the coefficients of Linear Classification surface functions according to the new samples, all training samples are correctly divided to complete iteration. The Fisher linear discriminant analysis method follows the criteria of Low intra-class discretization and high inter-class divergence to obtain a linear classifier; logistic regression considers y | X Obey Bernoulli distribution, random variable with a value of 0 or 1, by maximizing the Posterior ProbabilityP (Y | X) to obtain the linear classifier. SVM follows the principle that the interval between classes (margin) is the largest. According to this criterion, the classification surface has the following characteristics: the distance between the closest projection distance to the sample points in the vertical direction of the classification surface is the minimum for different classes. The equation for this classification is: w'x + B = 0, and gamma is a volume directly proportional to margin, called function margin ), the goal of our optimization is Max Gamma. Constraints: (1) y (I) [w'x (I) + B]> = gamma; (2) Normalization of W modulus, that is, | w | = 1. Change the constraint to: Y (I) [W'/gamma * X (I) + B]> = 1, replace w/gamma with W, then the SVM solution becomes the quadratic programming (qP) Problem: min | w |; S. t y (I) [w'x (I) + B]> = 1. MATLAB comes with the quadprog command to solve the QP problem. However, this is not the case in general engineering implementation. On the one hand, it is inefficient to implement it. On the other hand, this form is not conducive to extending SVM to a high-dimensional space, that is, kernel SVM. Generally, engineering implementation usually chooses to solve dual problem. If an optimization problem meets the kkt condition, it is generally called the original problem primal problem) it can be converted into dual problem ). For the original problem of linear SVM, its dual problem is: Max W (alpha) s. t alpha (I)> = 0; y (1) * alpha (1) + Y (2) * alpha (2 )..... + Y (m) * alpha (m) = 0. After obtaining alpha (I), we can use W = alpha (1) * Y (1) * x (1) + alpha (2) * Y (2) * X (2 )..... + alpha (m) * Y (m) * X (m) to get W. Since alpha (I)> = 0, alpha (I)> 0 corresponds to X (I), y (I) [w'x (I) + B] = 1, contributes to the generation of W, which is the support vector. For Alpha (I) = 0, y (I) [w'x (I) + B]> 1, non-support vector.

The SVM described above is a linear SVM with linear differentiation of samples. In practice, samples are often not classified in a linear Euclidean space. There are two solutions to this problem: (1) map samples to a high-dimensional space, that is, kernel SVM. (2) Use soft interval (soft margin) SVM. In the dual problem of linear SVM, the objective function and decision function expression both have inner product items <X (I), x (j)>. The main idea of kernel SVM is: use a kernel function K [X (I), x (j)] = <F [X (I)], F [x (j)]> replace <X (I), x (j)>, F [X (I)] as the ing function of X (I) to the high-dimensional space, because all the operations between X (I) and X (j) in the dual problem are inner product operations, it is not necessary to explicitly find f [X (I)], you only need to provide K [X (I), x (j)]. The most used is the RBF and polynomial kernel. The basic idea of soft interval SVM is to introduce the penalty item sigma (I) and the penalty coefficient C for situations where linear differentiation is not possible. Min (| w | ^ 2)/2 + C * (SIGMA (1) + sigma (2) + ...... SIGMA (m); S. t y (I) [w'x (I) + B]> = 1-sigma (I ). Y (I) [w'x (I) + B]> = 1-sigma (I) means that a certain error score can be tolerated; Objective Function C * (SIGMA (1) + sigma (2) + ...... SIGMA (m) indicates that the wrong Branch Brings punishment, which avoids the excessive number of samples with incorrect scores and degrades the classification performance. The SMO method is generally used to solve the SVM dual problem. The specific implementation is provided by reference.

Reference: http://v.163.com/movie/2008/1/C/6/M6SGF6VB4_M6SGJVMC6.html

Http://v.163.com/movie/2008/1/9/3/M6SGF6VB4_M6SGJVA93.html

Http://blog.csdn.net/pennyliang/article/details/7103953

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More