1. Kernel functions map a low-dimensional space to a high-dimensional space
The figure below sits in the first and second quadrants. Pay attention to the red door and to the purple characters beneath the words "Beijing courtyard". Treat the dots on the red door as "+" data and the dots on the purple characters as "-" data; their horizontal and vertical coordinates are the two features. Clearly, in this two-dimensional space, the "+" and "-" classes are not linearly separable.
We now consider the kernel function $K(v_1, v_2) = \langle v_1, v_2 \rangle^2$, i.e. the "inner product squared". Here $v_1 = (x_1, y_1)$ and $v_2 = (x_2, y_2)$ are two points in the two-dimensional space.
This kernel function corresponds to a mapping from the two-dimensional space to a three-dimensional space, whose expression is:

$P(x, y) = (x^2, \sqrt{2}\,xy, y^2)$
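The defining property of this pair is that the plain inner product in the three-dimensional space equals the kernel evaluated in the two-dimensional space, $\langle P(v_1), P(v_2) \rangle = K(v_1, v_2)$. A minimal NumPy check (the function and variable names are my own):

```python
import numpy as np

# Explicit feature map P(x, y) = (x^2, sqrt(2)*x*y, y^2)
def P(v):
    x, y = v
    return np.array([x**2, np.sqrt(2) * x * y, y**2])

# Kernel K(v1, v2) = <v1, v2>^2, computed entirely in 2D
def K(v1, v2):
    return np.dot(v1, v2) ** 2

v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, -1.0])

# The 2D kernel value equals the 3D inner product of the mapped points
assert np.isclose(K(v1, v2), np.dot(P(v1), P(v2)))
```

This is the point of the kernel trick: we get the geometry of the three-dimensional space while only ever computing inner products in the original two dimensions.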
Under the map P, the picture from the original two-dimensional space looks like this in three-dimensional space:

(The front-back axis is the x axis, the left-right axis is the y axis, and the vertical axis is the z axis.)
Notice that the green plane perfectly separates the red and purple points, meaning the two classes of data become linearly separable in the three-dimensional space.
The three-dimensional decision boundary is then mapped back to the two-dimensional space:

This is a hyperbola, which is not linear.
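One way to see why a hyperbola appears: points on the hyperbola $xy = 1$ in the original space all land on a single plane (second coordinate $= \sqrt{2}$) after applying P, so a plane in the three-dimensional space can map back to exactly such a curve. A small sketch, with my own helper names:

```python
import numpy as np

# Feature map from the example: P(x, y) = (x^2, sqrt(2)*x*y, y^2)
def P(v):
    x, y = v
    return np.array([x**2, np.sqrt(2) * x * y, y**2])

xs = np.linspace(0.5, 3.0, 6)
pts = [np.array([x, 1.0 / x]) for x in xs]  # points on the hyperbola x*y = 1
mapped = [P(p) for p in pts]

# Every mapped point satisfies the plane equation z2 = sqrt(2) in 3D
assert all(np.isclose(m[1], np.sqrt(2)) for m in mapped)
```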
As this example shows, a kernel function implicitly defines a mapping from a low-dimensional space to a high-dimensional space, under which two classes of points that are not linearly separable in the low-dimensional space can become linearly separable. Of course, this particular example depends heavily on where the data happen to sit in the original space. The kernel functions used in practice are usually far more complex: their mappings are not necessarily written out explicitly, and the high-dimensional spaces they map to have much higher dimension than the three dimensions here, possibly even infinite dimension. The hope is the same, though: that the two classes of points become linearly separable after the mapping.

2. Common kernel functions
Commonly used kernel functions in machine learning include those implemented in LIBSVM:
1) Linear: $K(v_1, v_2) = \langle v_1, v_2 \rangle$
2) Polynomial: $K(v_1, v_2) = (\gamma \langle v_1, v_2 \rangle + c)^n$
3) Radial basis function (RBF): $K(v_1, v_2) = \exp(-\gamma \lVert v_1 - v_2 \rVert^2)$
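These three kernels can be sketched in a few lines of NumPy (the function names and parameter defaults below are illustrative, not LIBSVM's own defaults):

```python
import numpy as np

# 1) Linear kernel: plain inner product
def linear(v1, v2):
    return np.dot(v1, v2)

# 2) Polynomial kernel: (gamma * <v1, v2> + c) ** n
def polynomial(v1, v2, gamma=1.0, c=1.0, n=3):
    return (gamma * np.dot(v1, v2) + c) ** n

# 3) RBF kernel: exp(-gamma * ||v1 - v2||^2)
def rbf(v1, v2, gamma=1.0):
    return np.exp(-gamma * np.sum((v1 - v2) ** 2))
```

Note that the polynomial kernel with `gamma=1.0`, `c=0.0`, `n=2` recovers the "inner product squared" kernel from the first section, and that `rbf` of a point with itself is always 1.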