Background
The previous note described the SVM problem from the perspective of duality, but it still required computing the feature transformation of the original data. This note shows that with the kernel (kernel function) trick, the feature-transformation computation can be skipped while still retaining the benefits of the feature transformation.
What is a kernel
The kernel combines the feature transformation with the inner-product operation into a single function, as below:

K(x, x') = Φ(x)ᵀΦ(x')

This is conceptually simple, but not every feature transformation function has a corresponding kernel.
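As a minimal sketch of this idea, the example below uses a 2nd-degree polynomial transform on 2-D inputs (an assumed illustration, not the course's exact transform) and checks that one kernel evaluation equals the dot product of the explicitly transformed vectors:

```python
import numpy as np

def phi(x):
    # Explicit 2nd-degree polynomial feature transform for x in R^2
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2, 1.0])

def poly_kernel(x, xp):
    # Kernel: computes phi(x) . phi(xp) without the explicit transform
    return (np.dot(x, xp) + 1.0) ** 2

x  = np.array([1.0, 2.0])
xp = np.array([3.0, -1.0])

explicit   = np.dot(phi(x), phi(xp))  # transform first, then dot product
via_kernel = poly_kernel(x, xp)       # a single kernel evaluation

print(explicit, via_kernel)  # both print 4.0
```

The kernel route never materializes the 6-dimensional feature vectors, which is exactly the computational saving the trick provides.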
The kernel in SVM
In the dual SVM solution, the kernel is used in three places:
- computing the intercept b
- computing the Q matrix in the QP problem
- predicting the classification of new points
The weight vector w is never computed explicitly, since no step uses w directly. That is why the previous note spent so much effort describing the dual SVM solution.
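The three uses can be sketched as follows, with made-up support vectors and dual multipliers (in a real run the alphas come out of the QP solver) and a polynomial kernel chosen purely for illustration:

```python
import numpy as np

def k(x, xp):
    # Polynomial kernel, chosen purely for illustration
    return (np.dot(x, xp) + 1.0) ** 2

# Made-up support vectors, labels, and dual multipliers (assumed values)
sv_x  = np.array([[1.0, 1.0], [-1.0, -1.0]])
sv_y  = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])

# Use 1: the QP's Q matrix, Q[n, m] = y_n * y_m * k(x_n, x_m)
Q = np.array([[sv_y[n] * sv_y[m] * k(sv_x[n], sv_x[m])
               for m in range(2)] for n in range(2)])

# Use 2: the intercept b from any support vector x_s:
# b = y_s - sum_n alpha_n * y_n * k(x_n, x_s)
xs, ys = sv_x[0], sv_y[0]
b = ys - sum(a * y * k(xn, xs) for a, y, xn in zip(alpha, sv_y, sv_x))

# Use 3: prediction, sign(sum_n alpha_n * y_n * k(x_n, x) + b)
def predict(x):
    s = sum(a * y * k(xn, x) for a, y, xn in zip(alpha, sv_y, sv_x)) + b
    return np.sign(s)

print(predict(np.array([1.0, 1.0])), predict(np.array([-1.0, -1.0])))
```

Note that w never appears: every step goes through k(·,·) on pairs of data points.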
Common kernels
The common kernels are the linear, polynomial, and Gaussian kernels, each with its own pros and cons.
Linear kernel
No feature transformation is done; the data is used directly, i.e. K(x, x') = xᵀx'. You do not even need the dual technique: the linear hard-margin SVM can be solved directly.
Advantages: high computational efficiency; easily interpreted results.
Disadvantages: the data must be linearly separable.
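A one-line sketch: the linear kernel is just the ordinary inner product, i.e. the feature transformation is the identity:

```python
import numpy as np

def linear_kernel(x, xp):
    # Linear kernel: Phi is the identity, so K(x, x') = x . x'
    return np.dot(x, xp)

x  = np.array([1.0, 2.0, 3.0])
xp = np.array([4.0, 5.0, 6.0])
print(linear_kernel(x, xp))  # -> 32.0
```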
Polynomial kernel
A polynomial expansion of x, generally of the form

K(x, x') = (a·xᵀx' + b)^q

where a, b, q are constants.
Advantages: less stringent requirements on the data than the linear kernel.
Disadvantages: more coefficients to choose; when q is too large, the kernel values can exceed the numerical precision of some computers, so generally q ≤ 3.
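A small numerical illustration of the precision problem (the input values here are arbitrary): in double precision, raising even a moderate kernel value to a large power q overflows, while q = 3 stays finite:

```python
import numpy as np

x  = np.array([10.0, 10.0])
xp = np.array([10.0, 10.0])
s = np.dot(x, xp) + 1.0  # 201.0, i.e. a*x'x + b with a = b = 1

# Small q is fine, but a large q overflows double precision,
# illustrating why q <= 3 is the usual choice.
with np.errstate(over='ignore'):
    small_q = s ** 3    # finite: 8120601.0
    large_q = s ** 200  # overflows to inf

print(small_q, large_q)
```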
Gaussian kernel
In some references it is also called the RBF (Radial Basis Function) kernel; its general form is

K(x, x') = exp(-a·||x - x'||²)

where a (> 0) is a constant. The remarkable property of the Gaussian kernel is that it maps the original data x into an infinite-dimensional space. Taking one-dimensional x and a = 1 as an example:

K(x, x') = exp(-(x - x')²)
         = exp(-x²) · exp(-x'²) · exp(2xx')
         = exp(-x²) · exp(-x'²) · Σ_{n=0}^{∞} (2xx')ⁿ / n!
         = Φ(x)ᵀΦ(x')

The expansion above uses the Taylor series of exp(2xx'), and the corresponding feature transformation is

Φ(x) = exp(-x²) · (1, √(2/1!)·x, √(2²/2!)·x², ...)

In this way the transformation into infinite dimensions is achieved. Isn't the RBF kernel powerful!
Advantages: fewer coefficients to tune; more powerful than the linear and polynomial kernels, fitting almost any data; less prone to numerical precision problems.
Disadvantages: the infinite-dimensional transformation cannot be interpreted; it is so powerful that it overfits easily; and it is computationally expensive.
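The Taylor-expansion argument can be checked numerically. A minimal sketch (1-D x, a = 1): truncate the infinite feature map after a finite number of terms and compare the dot product with the exact kernel value:

```python
import numpy as np
from math import exp, factorial

def gaussian_kernel(x, xp):
    # 1-D Gaussian kernel with a = 1
    return exp(-(x - xp) ** 2)

def phi(x, n_terms):
    # Truncated infinite-dimensional feature map from the Taylor expansion:
    # phi(x) = exp(-x^2) * (sqrt(2^n / n!) * x^n) for n = 0, 1, 2, ...
    return np.array([exp(-x ** 2) * np.sqrt(2.0 ** n / factorial(n)) * x ** n
                     for n in range(n_terms)])

x, xp = 0.7, -0.3
exact  = gaussian_kernel(x, xp)      # exp(-1) ~ 0.3679
approx = np.dot(phi(x, 20), phi(xp, 20))

print(exact, approx)  # the truncated dot product converges to the kernel value
```

Already at 20 terms the truncated feature map reproduces the kernel value to machine precision, because the Taylor series of exp(2xx') converges very fast for small arguments.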
Summary
Personally, I feel the kernel function is the finishing touch of SVM, and I truly admire the scientists who discovered it. In practical use of SVM, a large part of the effort may go into choosing the kernel and its coefficients. Kernels can also be customized, but certain conditions (Mercer's condition) must be satisfied; see the relevant section of the lecture notes.
Machine Learning Techniques -- Learning Notes 03 -- The Kernel Trick