From the previous article, we know that the support vectors are the data points closest to the separating hyperplane. The essential step in training an SVM is solving for the alpha values, which determine the hyperplane that separates the data. The general process for applying a support vector machine (SVM) is as follows:
(1) Collect data: any method can be used.
(2) Prepare data: numeric values are required.
(3) Analyze data: visualization helps in understanding the separating hyperplane.
(4) Train the algorithm: most of the time spent on SVM goes into training, which mainly involves tuning two parameters.
(5) Test the algorithm: a very simple calculation suffices.
(6) Use the algorithm: SVM can be applied to almost any classification problem. SVM itself is a binary classifier; applying it to multi-class problems requires some changes to the code.
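Point (6) notes that SVM is inherently a binary classifier; a common way to extend it to multiple classes is the one-vs-rest scheme, where one binary model is trained per class (that class vs. all others) and the class with the highest decision score wins. The sketch below illustrates the idea with a toy stand-in for the binary classifier; `CentroidBinary` and all other names are illustrative assumptions, not from the article:

```python
import numpy as np

class CentroidBinary:
    """Toy stand-in for any binary classifier with a real-valued decision score.
    A real SVM trainer could be dropped in with the same fit/decision interface."""
    def fit(self, X, y):                       # y must be in {+1, -1}
        self.c_pos = X[y == 1].mean(axis=0)    # centroid of the positive class
        self.c_neg = X[y == -1].mean(axis=0)   # centroid of the rest
        return self

    def decision(self, X):
        d_neg = np.linalg.norm(X - self.c_neg, axis=1)
        d_pos = np.linalg.norm(X - self.c_pos, axis=1)
        return d_neg - d_pos                   # larger = more confidently positive

def one_vs_rest_fit(X, y):
    """Train one binary model per class: that class vs. the rest."""
    return {k: CentroidBinary().fit(X, np.where(y == k, 1, -1))
            for k in np.unique(y)}

def one_vs_rest_predict(models, X):
    classes = sorted(models)
    scores = np.column_stack([models[k].decision(X) for k in classes])
    return np.array(classes)[scores.argmax(axis=1)]

# three well-separated toy clusters
X3 = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0],
               [5.2, 4.9], [-5.0, 5.0], [-5.1, 5.2]])
y3 = np.array([0, 0, 1, 1, 2, 2])
preds3 = one_vs_rest_predict(one_vs_rest_fit(X3, y3), X3)
```

The same wrapper would work with the SMO-trained binary SVM described later, since only a `fit` and a scalar decision function are required of the underlying model.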
To reduce training time and improve the efficiency of SVM, the sequential minimal optimization (SMO) algorithm is introduced. SMO solves a large optimization problem by decomposing it into a series of small optimization problems. These small problems are easy to solve, and solving them sequentially gives the same result as solving the whole problem at once.
SMO is based on the coordinate ascent algorithm.
1. Coordinate ascent
Assume the optimization problem is to maximize a function W(α1, …, αm) over its parameters.
In each step we select one parameter in turn and optimize W with respect to it while holding all the others fixed, so that W increases along that coordinate.
The entire process can be represented in Figure 1.
Figure 1
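The procedure sketched in Figure 1 can be made concrete on a small concave objective. The quadratic W below is an assumed example (not from the article); each coordinate update has a closed-form argmax obtained by setting the corresponding partial derivative to zero:

```python
# Coordinate ascent on W(a1, a2) = -a1^2 - 2*a2^2 - a1*a2 + 4*a1 + 6*a2
# (an assumed concave example; its unique maximizer is a1 = 10/7, a2 = 8/7).

def coordinate_ascent(steps=50):
    a1, a2 = 0.0, 0.0
    for _ in range(steps):
        a1 = (4.0 - a2) / 2.0  # argmax over a1 with a2 fixed: dW/da1 = -2*a1 - a2 + 4 = 0
        a2 = (6.0 - a1) / 4.0  # argmax over a2 with a1 fixed: dW/da2 = -4*a2 - a1 + 6 = 0
    return a1, a2

a1, a2 = coordinate_ascent()
```

Each sweep moves parallel to a coordinate axis, which is exactly the staircase-shaped path toward the maximum that Figure 1 depicts.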
2. SMO
In each loop, the SMO algorithm selects two parameters to optimize jointly, rather than the single parameter updated in coordinate ascent.
From the previous article, we know that the optimization problem is expressed as:
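The optimization problem in question is presumably the standard soft-margin SVM dual (as in Ng's CS229 notes); for reference:

```latex
\max_{\alpha}\; W(\alpha) = \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
\quad \text{s.t.}\quad 0 \le \alpha_i \le C \;\; (i = 1,\dots,m),
\qquad \sum_{i=1}^{m} \alpha_i y_i = 0.
```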
From equation (19), the constraint Σᵢ αᵢ yᵢ = 0 means that if all parameters but one are held fixed, the remaining parameter is completely determined; changing it alone would violate the constraint, so nothing is optimized. This is why the SMO algorithm chooses two parameters to optimize at a time.
Using the equality constraint, one of the two chosen parameters can be expressed in terms of the other and substituted into the objective, leaving a function of a single parameter.
The constraint of equation (20) can then be depicted in a diagram:
Figure 2
As can be seen from Figure 2, equation (20) lets us derive the lower and upper bounds within which the updated parameter must lie.
Treating the sum of the two chosen parameters (weighted by their labels) as a constant, the optimization over the remaining two parameters reduces to a clipped one-variable problem:
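The derivation summarized above corresponds to the standard SMO update (Platt's formulation). Assuming Kᵢⱼ = ⟨xᵢ, xⱼ⟩ and prediction error Eₖ = f(xₖ) − yₖ (notation assumed, since the original equations are not shown), the clipped update for the two chosen parameters α₁, α₂ is:

```latex
\alpha_2^{\text{new}} = \operatorname{clip}\!\left(\alpha_2 + \frac{y_2\,(E_1 - E_2)}{\eta},\; L,\; H\right),
\qquad \eta = K_{11} + K_{22} - 2K_{12},
\\[4pt]
(L, H) =
\begin{cases}
\bigl(\max(0,\ \alpha_2 - \alpha_1),\ \min(C,\ C + \alpha_2 - \alpha_1)\bigr) & \text{if } y_1 \neq y_2,\\[2pt]
\bigl(\max(0,\ \alpha_1 + \alpha_2 - C),\ \min(C,\ \alpha_1 + \alpha_2)\bigr) & \text{if } y_1 = y_2,
\end{cases}
\\[4pt]
\alpha_1^{\text{new}} = \alpha_1 + y_1 y_2\,(\alpha_2 - \alpha_2^{\text{new}}).
```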
Then, from equation (20), the other parameter can be recovered, and, as in the previous article, the resulting separating hyperplane can be used for classification.
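The whole procedure can be sketched as a simplified SMO trainer (random choice of the second parameter, linear kernel) in the spirit of the algorithm described above; the function names, tolerances, and toy data are illustrative assumptions:

```python
import numpy as np

def smo_train(X, y, C=1.0, tol=1e-3, max_passes=20, seed=0):
    """Simplified SMO: pick a KKT-violating alpha_i, pair it with a random
    alpha_j, and jointly optimize the pair under the box and sum constraints."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    K = X @ X.T                      # Gram matrix for the linear kernel
    alpha = np.zeros(m)
    b = 0.0
    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(m):
            Ei = (alpha * y) @ K[:, i] + b - y[i]      # prediction error on x_i
            if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                j = rng.integers(m - 1)
                if j >= i:
                    j += 1                             # uniform choice of j != i
                Ej = (alpha * y) @ K[:, j] + b - y[j]
                ai_old, aj_old = alpha[i], alpha[j]
                if y[i] != y[j]:                       # feasible segment [L, H]
                    L, H = max(0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0, ai_old + aj_old - C), min(C, ai_old + aj_old)
                if L == H:
                    continue
                eta = 2 * K[i, j] - K[i, i] - K[j, j]  # negative of the curvature
                if eta >= 0:
                    continue
                alpha[j] = np.clip(aj_old - y[j] * (Ei - Ej) / eta, L, H)
                if abs(alpha[j] - aj_old) < 1e-5:
                    continue
                alpha[i] = ai_old + y[i] * y[j] * (aj_old - alpha[j])
                # update the threshold b (Platt's rules)
                b1 = b - Ei - y[i] * (alpha[i] - ai_old) * K[i, i] \
                     - y[j] * (alpha[j] - aj_old) * K[i, j]
                b2 = b - Ej - y[i] * (alpha[i] - ai_old) * K[i, j] \
                     - y[j] * (alpha[j] - aj_old) * K[j, j]
                if 0 < alpha[i] < C:
                    b = b1
                elif 0 < alpha[j] < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0     # stop after quiet passes

    w = (alpha * y) @ X              # recover the weight vector (linear kernel only)
    return w, b

# toy linearly separable data
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = smo_train(X, y)
preds = np.sign(X @ w + b)
```

Note that the full SMO algorithm replaces the random choice of the second parameter with a heuristic that maximizes |E1 − E2|; the random version here trades speed for simplicity.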
Copyright notice: This is the blogger's original article and may not be reproduced without the blogger's permission.
Impressions of Stanford's "Machine Learning", Lesson 8 ------- 1. SMO