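The functional margin of a training example is defined as $\hat{\gamma}^{(i)} = y^{(i)}(w^T x^{(i)} + b)$.
Consider the minimum of $\hat{\gamma}^{(i)}$ over the training sample, which corresponds to the worst-case functional margin on the training set: $\hat{\gamma} = \min_{i} \hat{\gamma}^{(i)}$.
Geometric margin:
$\gamma = \hat{\gamma} / \|w\|$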
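To make the two margins concrete, here is a minimal sketch in NumPy. The toy data, the helper names `functional_margins` and `geometric_margin`, and the choice of `w` and `b` are all made up for illustration; they are not from the lecture.

```python
import numpy as np

def functional_margins(w, b, X, y):
    """Per-example functional margin: y_i * (w . x_i + b)."""
    return y * (X @ w + b)

def geometric_margin(w, b, X, y):
    """Smallest functional margin over the sample, divided by ||w||."""
    return functional_margins(w, b, X, y).min() / np.linalg.norm(w)

# Toy data (hypothetical): two points per class in 2-D.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0, 1.0])
b = 0.0

gamma_hat = functional_margins(w, b, X, y).min()   # worst-case functional margin
gamma = geometric_margin(w, b, X, y)               # gamma_hat / ||w||
# Rescaling (w, b) changes the functional margin but not the geometric one.
gamma_scaled = geometric_margin(10 * w, 10 * b, X, y)
```

The last line is the reason the geometric margin is the quantity worth maximizing: it does not change when (w, b) is rescaled.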
Definition of the optimal margin classifier:
$\min_{w,b} \frac{1}{2}\|w\|^2$ subject to $y^{(i)}(w^T x^{(i)} + b) \ge 1$, $i = 1, \dots, m$
Lagrange duality: omitted.
When $d^* = p^* = \mathcal{L}(w^*, \alpha^*, \beta^*)$, the solution $w^*, \alpha^*, \beta^*$ satisfies the KKT conditions, including the KKT dual complementarity condition:
$\alpha_i^* g_i(w^*) = 0$, $i = 1, \dots, k$
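Optimal margin classifier:
Rewrite the constraints of this classifier as:
$g_i(w) = -y^{(i)}(w^T x^{(i)} + b) + 1 \le 0$
The Lagrangian of the optimization problem is then:
$\mathcal{L}(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{m} \alpha_i \left[ y^{(i)}(w^T x^{(i)} + b) - 1 \right]$
Setting the partial derivatives with respect to $w$ and $b$ to zero gives:
$w = \sum_{i=1}^{m} \alpha_i y^{(i)} x^{(i)}$
In addition:
$\sum_{i=1}^{m} \alpha_i y^{(i)} = 0$
Substituting these back into the Lagrangian, we get:
$W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle$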
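As a sanity check on the dual, the following sketch evaluates the standard dual objective W(α) = Σ α_i − ½ Σ y_i y_j α_i α_j ⟨x_i, x_j⟩ and recovers w = Σ α_i y_i x_i on a two-point problem. The data and the values of α are assumptions chosen by hand so the optimum is known; the function names are hypothetical.

```python
import numpy as np

def dual_objective(alpha, X, y):
    """W(alpha) = sum(alpha) - 1/2 * sum_ij y_i y_j alpha_i alpha_j <x_i, x_j>."""
    K = X @ X.T                 # matrix of inner products <x_i, x_j>
    ya = alpha * y
    return alpha.sum() - 0.5 * ya @ K @ ya

def recover_w(alpha, X, y):
    """w = sum_i alpha_i * y_i * x_i (from setting dL/dw = 0)."""
    return (alpha * y) @ X

# Two points on the x-axis, one per class; the optimal separator is w = (1, 0), b = 0.
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])    # hand-picked; satisfies sum(alpha * y) = 0

w = recover_w(alpha, X, y)
dual_value = dual_objective(alpha, X, y)
primal_value = 0.5 * w @ w      # 1/2 ||w||^2 at the recovered w
```

On this toy problem the dual and primal values coincide, which is what d* = p* predicts when the KKT conditions hold.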
Kernel function: (not understood)
A kernel function replaces the inner product in the formulas above, implicitly mapping the variables into a higher-dimensional space. The benefit is that the inner product can be computed without ever loading the high-dimensional feature vectors into memory (in fact, they may not fit).
The feature mapping corresponding to this kernel is:
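As one concrete illustration of a kernel and its feature mapping, the sketch below assumes the polynomial kernel K(x, z) = (x^T z)^2, whose explicit mapping φ(x) lists all products x_i x_j. This particular kernel is an assumption picked for the example; the function names are hypothetical.

```python
import numpy as np

def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed in O(n) time."""
    return float(x @ z) ** 2

def phi(x):
    """Explicit feature map for this kernel: all n^2 products x_i * x_j."""
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
z = np.array([4.0, 5.0, 6.0])

k_direct = poly_kernel(x, z)       # never builds the n^2-dimensional vectors
k_mapped = float(phi(x) @ phi(z))  # same value, but O(n^2) time and memory
```

Both routes give the same number; the kernel just avoids building φ(x), which is the point when the mapped space is very large.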
Gaussian kernel:
$K(x, z) = \exp\left(-\|x - z\|^2 / (2\sigma^2)\right)$
How to determine whether a kernel is valid:
That is: K is a valid kernel if and only if, for any finite set of points, its corresponding kernel matrix is symmetric positive semi-definite.
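The validity condition can be checked numerically on a sample of points. The sketch below builds a Gaussian kernel matrix and tests symmetry and the sign of the smallest eigenvalue. The random data, the tolerance `tol`, and the function names are assumptions. Passing on one sample is only a necessary check, not a proof that the kernel is valid.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma=1.0):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)      # guard against tiny negative rounding errors
    return np.exp(-d2 / (2.0 * sigma ** 2))

def is_symmetric_psd(K, tol=1e-10):
    """True if K is symmetric and its smallest eigenvalue is >= -tol."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))       # hypothetical sample of 6 points in 3-D
K = gaussian_kernel_matrix(X)
gaussian_ok = is_symmetric_psd(K)

# A symmetric matrix that cannot be a kernel matrix: eigenvalues are 3 and -1.
bad = np.array([[1.0, 2.0], [2.0, 1.0]])
bad_ok = is_symmetric_psd(bad)
```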
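When the data is not linearly separable:
$\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i$ subject to $y^{(i)}(w^T x^{(i)} + b) \ge 1 - \xi_i$, $\xi_i \ge 0$, $i = 1, \dots, m$
This is called the L1-norm soft margin SVM. It is a convex optimization problem. It allows some examples to have a functional margin of less than 1, which means it tolerates misclassified examples.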
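To see what a margin below 1 costs, this sketch computes the slack each example needs for a fixed (w, b) and the resulting soft margin objective ½‖w‖² + C Σ ξ_i. The data, `C = 1`, and the function name are assumptions for illustration. For fixed (w, b), the smallest feasible slack is ξ_i = max(0, 1 − y_i(w·x_i + b)).

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slack) for the L1-norm soft margin SVM at a fixed (w, b)."""
    margins = y * (X @ w + b)
    slack = np.maximum(0.0, 1.0 - margins)   # smallest xi_i satisfying the constraints
    return 0.5 * w @ w + C * slack.sum(), slack

# Hypothetical 2-D data: the second point is inside the margin,
# the fourth is on the wrong side of the separator.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0, 0.0])
b = 0.0

objective, slack = soft_margin_objective(w, b, X, y, C=1.0)
misclassified = slack > 1.0                  # slack above 1 means a negative margin
```

A slack between 0 and 1 means the point is correctly classified but inside the margin; a slack above 1 means it is misclassified.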
SMO algorithm:
Coordinate ascent algorithm:
This algorithm needs more iterations, but the inner loop can be very fast when the optimum of $W(\alpha_1, \dots, \alpha_m)$ with respect to a single parameter is cheap to find.
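Coordinate ascent can be sketched on an unconstrained concave quadratic, where the inner-loop maximization over one coordinate has a closed form. The objective W(a) = −½ aᵀA a + bᵀa, the matrix `A`, and the vector `b` are assumptions chosen for illustration; this is not the SVM dual, which has extra constraints.

```python
import numpy as np

def coordinate_ascent(A, b, n_sweeps=100):
    """Maximize W(a) = -1/2 a^T A a + b^T a (A symmetric positive definite)
    by optimizing one coordinate at a time with the others held fixed."""
    a = np.zeros_like(b)
    for _ in range(n_sweeps):
        for i in range(len(b)):
            # dW/da_i = b_i - sum_j A_ij a_j = 0, solved for a_i alone.
            rest = A[i] @ a - A[i, i] * a[i]
            a[i] = (b[i] - rest) / A[i, i]
    return a

A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 1.0])

a_star = coordinate_ascent(A, b)
a_exact = np.linalg.solve(A, b)   # the true maximizer satisfies A a = b
```

Each inner step is a single division, so many cheap sweeps can still be fast overall.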
SMO:
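If we solved for only one α at a time in the SVM dual, the other α's would be held fixed. From equation (19) we obtain:
$\alpha_1 y^{(1)} = -\sum_{i=2}^{m} \alpha_i y^{(i)}$
that is, $\alpha_1$ is itself fixed by the others. SMO therefore solves for two α's simultaneously, which gives:
$\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = -\sum_{i=3}^{m} \alpha_i y^{(i)}$
Substituting this into $W$ gives a quadratic function of a single variable, from which it is easy to obtain $\alpha_1$.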
Andrew Ng refers to John Platt's paper for the answers to the following two questions:
SMO algorithm:
Using the equality constraint, the problem is reduced to a single α, as follows:
The solution for each parameter in this equation is as follows:
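A single SMO step can be sketched as follows, using the clipped two-variable update as I understand it from Platt's SMO formulation. The pair (i, j) is chosen by hand, there is no heuristic for picking pairs, and b is not updated. The function name `smo_pair_update`, the toy data, and `C = 10` are assumptions.

```python
import numpy as np

def smo_pair_update(alpha, i, j, K, y, C):
    """Jointly re-optimize alpha[i] and alpha[j]; return an updated copy.
    Keeps sum(alpha * y) unchanged and both values inside [0, C]."""
    alpha = alpha.copy()
    f = K @ (alpha * y)                  # decision values without b (b cancels in E_i - E_j)
    E_i, E_j = f[i] - y[i], f[j] - y[j]
    eta = K[i, i] + K[j, j] - 2.0 * K[i, j]
    if eta <= 0:
        return alpha                     # degenerate pair; simply skipped in this sketch
    # Bounds on alpha[j] implied by 0 <= alpha <= C and the equality constraint.
    if y[i] != y[j]:
        lo, hi = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
    else:
        lo, hi = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
    # Unconstrained optimum of the 1-D quadratic, then clipped to [lo, hi].
    a_j = float(np.clip(alpha[j] + y[j] * (E_i - E_j) / eta, lo, hi))
    alpha[i] += y[i] * y[j] * (alpha[j] - a_j)
    alpha[j] = a_j
    return alpha

# Two points on the x-axis, one per class; linear kernel.
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
K = X @ X.T

alpha_new = smo_pair_update(np.zeros(2), 0, 1, K, y, C=10.0)
w = (alpha_new * y) @ X                  # recovered weight vector
```

On this two-point problem one update already reaches the optimum α = (0.5, 0.5), giving w = (1, 0). Real data needs many such steps plus a rule for choosing pairs and updating b.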
PS: Without noticing, it has been a long time since I last wrote notes.
Stanford Open Course: Machine Learning, Chapter 5, SVM notes