Consider the dual SVM problem: if the original inputs go through a non-linear transform, then computing the Q matrix for the quadratic program involves two steps: first apply the transform, then take the inner product. If the transform produces many components (e.g., a 100th-order polynomial transform), this becomes expensive.
Can we merge the transform and the inner product into a single step when computing the Q matrix?
This is where the kernel trick comes in.
Simply put: for some special forms of transform, the kernel trick lets us work only with the inner product of the original input vectors (x'x); the inner product after the transform can be expressed in terms of x'x, so the computational complexity stays at O(d).
Take the 2nd-order polynomial transform as an example:
As long as the kernel trick applies, every z'z can be replaced by K(x, x').
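To make the substitution concrete, here is a minimal NumPy sketch (the names `phi2` and `k2` are my own) checking that the 2nd-order polynomial transform's inner product equals the kernel shortcut 1 + x'x + (x'x)^2, so the explicit transform is never needed:

```python
import numpy as np

def phi2(x):
    """2nd-order polynomial transform: (1, x_1..x_d, x_i*x_j for all ordered pairs i,j)."""
    pairs = np.outer(x, x).ravel()          # all d^2 ordered products x_i * x_j
    return np.concatenate(([1.0], x, pairs))

def k2(x, xp):
    """Kernel shortcut: 1 + x'x + (x'x)^2 -- only the original inner product is used."""
    s = np.dot(x, xp)
    return 1.0 + s + s ** 2

rng = np.random.default_rng(0)
x, xp = rng.standard_normal(5), rng.standard_normal(5)
# The transformed inner product and the kernel value agree to machine precision.
assert np.isclose(np.dot(phi2(x), phi2(xp)), k2(x, xp))
```

The kernel costs O(d) per evaluation, while the explicit transform builds an O(d^2)-dimensional vector first.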
Next, if the kernel trick can be used, what are the benefits for computing the dual SVM?
The places where K(x, x') can be used are:
(1) When computing q_{n,m}, it can be used directly.
(2) When computing the bias b, it can be used directly.
(3) When a test sample comes in, it can be used directly (otherwise the transform would have to be applied first, then the formula evaluated).
Therefore, the QP process of kernel SVM can be simplified by applying the kernel trick at every one of these steps.
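Points (2) and (3) can be sketched as follows, assuming the alphas have already come out of a QP solver (here they are hand-derived for a two-point toy problem, and all function names are my own). Note that both the bias formula and the prediction g(x) = sign(sum_n alpha_n y_n K(x_n, x) + b) touch the data only through K:

```python
import numpy as np

def linear_kernel(x, xp):
    return float(np.dot(x, xp))

def svm_bias(alpha, y, X, kernel):
    """b = y_s - sum_n alpha_n y_n K(x_n, x_s), using any support vector s."""
    s = int(np.argmax(alpha))  # index of one support vector (alpha_s > 0)
    return y[s] - sum(a * yn * kernel(xn, X[s]) for a, yn, xn in zip(alpha, y, X))

def svm_predict(alpha, y, X, b, kernel, x_new):
    """g(x) = sign(sum_n alpha_n y_n K(x_n, x) + b) -- no explicit transform."""
    score = sum(a * yn * kernel(xn, x_new) for a, yn, xn in zip(alpha, y, X))
    return np.sign(score + b)

# Toy 1-D data where the optimal dual solution is known in closed form.
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])  # hand-derived; normally produced by the QP solver

b = svm_bias(alpha, y, X, linear_kernel)
print(b, svm_predict(alpha, y, X, b, linear_kernel, np.array([2.0])))  # 0.0 1.0
```

Swapping `linear_kernel` for any other kernel changes nothing else in the code, which is exactly the simplification the kernel trick buys.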
The following sections describe several commonly used kernel types.
General Poly-2 Kernel
The benefits of this K2 kernel:
(1) It is very cheap to compute: just compute x'x, add 1, and square.
(2) It loses no generality, because during QP the coefficients of the constant, first-order, and second-order terms are absorbed by the optimization.
However, the coefficients in front of the terms in the K2 kernel cannot be chosen arbitrarily, because they affect the final w (that is, the margin definition).
Different coefficients for K2 lead to different support vectors being selected.
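A sketch of a general poly-2 kernel K(x, x') = (zeta + gamma * x'x)^2, assuming zeta >= 0 and gamma > 0; the check below confirms that each gamma corresponds to a differently scaled 2nd-order transform of the same feature space, which is why changing the coefficient changes the margin geometry and hence the selected support vectors (function names are my own):

```python
import numpy as np

def poly2_kernel(x, xp, zeta=1.0, gamma=1.0):
    """General poly-2 kernel: K(x, x') = (zeta + gamma * x'x)^2."""
    return (zeta + gamma * np.dot(x, xp)) ** 2

def poly2_transform(x, zeta=1.0, gamma=1.0):
    """The rescaled 2nd-order transform this kernel implicitly computes."""
    pairs = gamma * np.outer(x, x).ravel()  # gamma-weighted ordered products
    return np.concatenate(([zeta], np.sqrt(2 * zeta * gamma) * x, pairs))

rng = np.random.default_rng(1)
x, xp = rng.standard_normal(3), rng.standard_normal(3)
for g in (0.5, 2.0):  # same feature space, different scaling -> different geometry
    assert np.isclose(poly2_kernel(x, xp, 1.0, g),
                      np.dot(poly2_transform(x, 1.0, g), poly2_transform(xp, 1.0, g)))
```

Expanding (zeta + gamma*s)^2 = zeta^2 + 2*zeta*gamma*s + gamma^2*s^2 shows term-by-term why the transform above matches the kernel.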
In general, start with the simplest SVM and only then move gradually to more complex ones.
Gaussian SVM
Gaussian SVM implicitly performs an infinite-dimensional polynomial transform.
The advantage of infinite dimensions is greater learning power; the disadvantage is that a poor parameter choice can hurt badly.
Even with SVM, overfitting can still occur, so Gaussian SVM should be used with caution.
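A minimal sketch of the Gaussian (RBF) kernel K(x, x') = exp(-gamma * ||x - x'||^2); the comparison below illustrates why a large gamma makes the kernel spiky around each training point, which is exactly the overfitting risk mentioned above (function name is my own):

```python
import numpy as np

def gaussian_kernel(x, xp, gamma=1.0):
    """K(x, x') = exp(-gamma * ||x - x'||^2): an inner product in infinite dimensions."""
    return float(np.exp(-gamma * np.sum((x - xp) ** 2)))

x, xp = np.array([0.0]), np.array([1.0])
print(gaussian_kernel(x, xp, gamma=0.1))    # close to 1: smooth, broad influence
print(gaussian_kernel(x, xp, gamma=100.0))  # close to 0: spiky, risks overfitting
```

With a huge gamma, each support vector influences almost nothing but itself, so the decision boundary can memorize the training set.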
Each kernel has its own advantages, but the guiding principle is still to use them carefully: prefer the simple over the complex.
Hsuan-Tien Lin, Machine Learning Techniques, "Kernel Support Vector Machines"