Understanding Support Vector Machines (II): Kernel Functions

Source: Internet
Author: User

Recall the definition of a kernel function (see Statistical Learning Methods, Definition 7.6):
Let χ be the input space (a subset of Euclidean space, or a discrete set) and H the feature space (a Hilbert space). If there exists a mapping

φ(x): χ → H

such that for all x, z ∈ χ the function K(x, z) = φ(x) · φ(z),
then K(x, z) is called a kernel function and φ(x) the mapping function; φ(x) · φ(z) is the inner product of the images of x and z in the feature space.
Because the mapping function may be very complicated and expensive to compute, in practice the kernel function is used to obtain the inner product directly, without increasing the computational cost; the mapping function is only a logical device that describes the relationship between the input space and the feature space. For example:
Let the input space be χ = R³,
the mapping function φ(x) = (x_1 x_1, x_1 x_2, x_1 x_3, x_2 x_1, x_2 x_2, x_2 x_3, x_3 x_1, x_3 x_2, x_3 x_3),
and the kernel function K(x, z) = (x · z)².
Now take two samples x = (1, 2, 3) and z = (4, 5, 6). Computing the inner product via the mapping function and via the kernel function proceeds as follows:
φ(x) = (1, 2, 3, 2, 4, 6, 3, 6, 9)
φ(z) = (16, 20, 24, 20, 25, 30, 24, 30, 36)
φ(x) · φ(z) = 16 + 40 + 72 + 40 + 100 + 180 + 72 + 180 + 324 = 1024
Computing directly via K(x, z): (4 + 10 + 18)² = 32² = 1024.
By comparison, the kernel function clearly requires far less computation than the explicit mapping.
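The worked example above can be checked in a few lines. A minimal sketch using NumPy (`phi` and `quad_kernel` are illustrative names):

```python
import numpy as np

# Explicit mapping phi(x): all pairwise products x_i * x_j (9 dimensions for R^3)
def phi(v):
    return np.outer(v, v).ravel()

# Kernel shortcut K(x, z) = (x . z)^2
def quad_kernel(x, z):
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0, 3.0])
z = np.array([4.0, 5.0, 6.0])

via_mapping = float(phi(x) @ phi(z))   # inner product in the 9-D feature space
via_kernel = quad_kernel(x, z)         # (4 + 10 + 18)^2

print(via_mapping, via_kernel)         # both 1024.0
```

Note that the kernel route needs one 3-dimensional dot product and a square, while the explicit route first builds two 9-dimensional vectors.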

From the above, if we know the kernel function, we can perform the nonlinear transformation without increasing the computational cost. So our next task is to determine the kernel function.
The kernel functions we refer to are positive definite kernel functions. What kind of function is a positive definite kernel?
The necessary and sufficient condition for a positive definite kernel is given here (see Statistical Learning Methods, Theorem 7.5):
Let K: χ × χ → R be a symmetric function. Then K(x, z) is a positive definite kernel function if and only if, for any m and any x_i ∈ χ, i = 1, 2, ..., m, the corresponding Gram matrix

K = [K(x_i, x_j)]_{m×m}

is positive semi-definite.
From this necessary and sufficient condition we obtain an equivalent definition of the positive definite kernel:
Let χ be the input space and K(x, z) a symmetric function defined on χ × χ. If, for any m and any x_i ∈ χ, i = 1, 2, ..., m, the corresponding Gram matrix K = [K(x_i, x_j)]_{m×m} is positive semi-definite, then K(x, z) is called a positive definite kernel.
A function satisfying this condition is called a positive definite kernel function.
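The definition suggests a direct numerical check: build the Gram matrix for a set of sample points and inspect its eigenvalues. A sketch (the sample points and the tolerance are arbitrary choices for illustration):

```python
import numpy as np

def gram_is_psd(kernel, points, tol=1e-10):
    """Build the Gram matrix K[i][j] = kernel(x_i, x_j) and test whether
    all eigenvalues are (numerically) non-negative."""
    m = len(points)
    K = np.array([[kernel(points[i], points[j]) for j in range(m)]
                  for i in range(m)])
    eigvals = np.linalg.eigvalsh(K)   # K is symmetric, so eigvalsh applies
    return bool(np.all(eigvals >= -tol))

# The quadratic kernel (x . z)^2 from the earlier example passes the test.
quad = lambda x, z: float(np.dot(x, z)) ** 2
pts = [np.array([1.0, 2.0, 3.0]),
       np.array([4.0, 5.0, 6.0]),
       np.array([0.0, 1.0, -1.0])]
print(gram_is_psd(quad, pts))  # True
```

Passing this check for one finite sample is only a necessary condition; the definition requires it for every finite set of points.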
It is worth mentioning, however, that this definition characterizes positive definite kernels; there are also kernel functions that are not positive definite, such as the multiquadric kernel function.
In practical applications, the Mercer theorem is often used to determine kernel functions. A kernel function obtained via the Mercer theorem is called a Mercer kernel. The definitions of the positive definite kernel and the Mercer kernel are as follows:

As the definitions show, the positive definite kernel is more general than the Mercer kernel: the positive definite kernel only requires the function to be symmetric on the input space, while the Mercer kernel requires a symmetric continuous function.

II. Common Kernel Functions
1. Linear kernel function
The linear kernel function is the simplest kernel function and can be viewed as a special case of the radial basis kernel function; its formula is:

K(x, z) = x · z
It is mainly used in the linearly separable case, corresponding to the linearly separable SVM and the linear SVM of the previous article. It finds the optimal linear classifier in the original input space, with the advantages of few parameters and fast computation.
2. Polynomial kernel function
The polynomial kernel is suitable for orthonormalized data (vectors that are mutually orthogonal with norm 1); its formula is:

K(x, z) = (x · z + 1)^d
The polynomial kernel is a global kernel function: data points that are far apart still influence the kernel value. The larger the parameter d, the higher the dimension of the mapping and the greater the computational cost; when d is too large, the learning complexity becomes too high and over-fitting occurs easily.
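The global behaviour and the effect of d can be seen numerically. A sketch assuming the common form K(x, z) = (x · z + 1)^d (the test points are arbitrary):

```python
import numpy as np

# Polynomial kernel K(x, z) = (x . z + 1)^d
def poly_kernel(x, z, d=2):
    return (np.dot(x, z) + 1.0) ** d

x = np.array([1.0, 0.0])
z_far = np.array([10.0, 10.0])   # a point far away from x

# Distant points still produce large kernel values, and the value
# grows quickly with the degree d -- the kernel is "global".
values = [poly_kernel(x, z_far, d=d) for d in (1, 2, 3)]
print(values)  # [11.0, 121.0, 1331.0]
```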
3. Radial basis kernel function & Gaussian kernel function
The radial basis kernel function is a local kernel function: its value decreases as the data point moves farther from the center point. Its formula is:

K(x, z) = exp(-γ ||x - z||²)

The Gaussian kernel function can be seen as a particular form of the radial basis kernel function:

K(x, z) = exp(-||x - z||² / (2σ²))
The Gaussian radial basis kernel is robust to noise in the data. Because of its strong locality, the parameter σ determines the function's range of influence, and the locality weakens as σ increases.
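The locality described above is easy to see numerically. A sketch assuming the Gaussian form K(x, z) = exp(-||x - z||² / (2σ²)) with illustrative test points:

```python
import numpy as np

# Gaussian (RBF) kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))
def rbf_kernel(x, z, sigma=1.0):
    diff = x - z
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

x = np.zeros(2)
near = np.array([0.5, 0.0])
far = np.array([5.0, 0.0])

# Local behaviour: the kernel value decays fast as z moves away from x.
print(rbf_kernel(x, near), rbf_kernel(x, far))

# A larger sigma widens the range of influence (weaker locality).
print(rbf_kernel(x, far, sigma=5.0) > rbf_kernel(x, far, sigma=1.0))  # True
```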
4. sigmoid kernel function
The sigmoid kernel function originates from neural networks and is widely used in deep learning and machine learning. Its formula is:

K(x, z) = tanh(γ x · z + r)
When the sigmoid function is used as the kernel, the support vector machine implements a kind of multilayer perceptron neural network. The theoretical basis of the SVM (a convex quadratic program) guarantees a global rather than a local optimum, and also ensures good generalization to unknown samples.
5. String kernel function
Kernel functions can be defined not only on Euclidean spaces but also on sets of discrete data. A string kernel function is a kernel defined on a set of strings; intuitively, it measures the similarity of a pair of strings, and it is applied in text categorization, information retrieval, and so on.
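As an illustrative sketch (a simple p-spectrum kernel, not a specific kernel from the text), string similarity can be measured by counting shared substrings of length p:

```python
from collections import Counter

def spectrum_kernel(s, t, p=2):
    """p-spectrum string kernel: inner product of substring-count vectors."""
    grams_s = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    grams_t = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    return sum(count * grams_t[g] for g, count in grams_s.items())

# Similar strings share many length-p substrings, dissimilar ones few.
print(spectrum_kernel("kernel", "kernels"))  # 5 shared bigrams
print(spectrum_kernel("kernel", "vector"))   # 0 shared bigrams
```

Because this is an inner product of explicit count vectors, it is positive definite by construction.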
