Original: http://blog.csdn.net/keith0812/article/details/8901113
The support vector machine (SVM) method is built on the VC dimension theory of statistical learning theory and the principle of structural risk minimization.
Structural risk
Structural risk = empirical risk + confidence risk
Empirical risk = the classifier's error on the given training samples
Confidence risk = the classifier's error when classifying unknown text (i.e., unseen data)
Factors affecting the confidence risk:
The number of samples: the larger the number of training samples, the more likely the learned result is correct, and so the smaller the confidence risk;
The VC dimension of the classification function: clearly, the larger the VC dimension, the weaker the generalization ability and the larger the confidence risk.
Increasing the number of samples and reducing the VC dimension both reduce the confidence risk, as the bound sketched below makes explicit.
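The original post states the decomposition but not the formula. The standard VC generalization bound from statistical learning theory (Vapnik) writes the two terms out explicitly; here n is the number of training samples, h the VC dimension, and 1 - eta the confidence level:

```latex
% With probability at least 1 - \eta, simultaneously for every f in a
% function class of VC dimension h trained on n samples:
R(f) \;\le\;
  \underbrace{R_{\mathrm{emp}}(f)}_{\text{empirical risk}}
  \;+\;
  \underbrace{\sqrt{\frac{h\left(\ln\frac{2n}{h}+1\right)-\ln\frac{\eta}{4}}{n}}}_{\text{confidence risk}}
```

The confidence term grows with h and shrinks with n, which is exactly the pair of factors listed above.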
Previously, the goal of machine learning was to minimize the empirical risk. But to reduce the empirical risk, the classification function must be made more complex, which raises its VC dimension; a high VC dimension means a high confidence risk, and therefore a high structural risk as well. ----This is where SVM has an advantage over other machine learning methods.
SVM can keep the VC dimension down, mainly through the introduction of the kernel function.
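As a concrete illustration (a minimal sketch of my own using scikit-learn, which the original post never names; the dataset and parameters are purely illustrative), the kernel lets an SVM separate data that no hyperplane in the input space can separate:

```python
# Sketch: comparing a linear kernel with an RBF kernel on data that is
# not linearly separable in the input space. Assumes scikit-learn.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line separates the classes.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0).fit(X_train, y_train)
    print(f"{kernel}: test accuracy = {clf.score(X_test, y_test):.2f}")
```

On this kind of data the linear kernel typically scores near chance while the RBF kernel separates the rings almost perfectly; the kernel supplies the nonlinearity without the optimizer explicitly searching a more complex function class.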
The material above comes from other people's blogs that I read while learning SVM. At the time I did not really understand the VC dimension; however many times I read about it, it stayed foggy. In later study I found that this concept appears again and again, and without it many algorithms cannot be properly understood, so today I summoned up the courage to learn the concept of the VC dimension once more. My notes are organized as follows:
Example: a linear binary classification function can shatter a set containing only three elements, so the VC dimension of linear binary classification functions is 3.
In the abstract: if a function set can shatter a set of h elements, and h is the largest number for which this is possible, the VC dimension of the function set is h.
Put that way, the notion of shattering may still be unclear, so let us again take the binary classification function as the example.
Suppose there is a set of three elements; these three elements can be assigned 2^3 = 8 distinct labelings, as follows:
(+,+,+), (+,+,-), (+,-,+), (+,-,-), (-,+,+), (-,+,-), (-,-,+), (-,-,-)
A linear binary classification function can realize all of these labelings (taking three points that are not collinear), so the VC dimension of linear binary classification functions is 3.
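To convince yourself, you can check all eight labelings mechanically. The following sketch (my own illustration, not in the original post) randomly searches for a separating line sign(w.x + b) for each labeling of three non-collinear points:

```python
# Brute-force check that linear classifiers shatter three non-collinear
# points in the plane: every one of the 2^3 labelings is realizable.
import itertools
import numpy as np

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # non-collinear

def linearly_realizable(labels, trials=20000, seed=0):
    """Random search for a line sign(w.x + b) that reproduces `labels`."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        w, b = rng.normal(size=2), rng.normal()
        if np.array_equal(np.sign(points @ w + b), labels):
            return True
    return False

for labels in itertools.product([-1.0, 1.0], repeat=3):
    print(labels, linearly_realizable(np.array(labels)))
```

All eight labelings come back realizable. With four points the same search fails for at least one labeling (the XOR pattern), which is why the VC dimension of lines in the plane is exactly 3 and not 4.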
Likewise, for a set with h elements: if a function set can realize all 2^h separations (that is, shatter the set), we say the VC dimension of that function set is h.
If the functions can shatter any number of samples, the VC dimension of the function set is infinite; that is, the function set can shatter a set containing any number of elements.
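Infinite VC dimension does not require a visibly huge function class. A classic example from statistical learning theory (sketched below in my own code, not in the original post) is the one-parameter family f(x) = sign(sin(w*x)): for the points x_i = 10^-i there is an explicit choice of w realizing any labeling, so this family shatters arbitrarily many points:

```python
# Sketch of the classic infinite-VC-dimension example: sign(sin(w*x))
# shatters the points x_i = 10^-i for any n via an explicit choice of w.
import itertools
import math

n = 4
xs = [10.0 ** -(i + 1) for i in range(n)]

def omega_for(labels):
    """Known construction: w = pi * (1 + sum_i (1 - y_i)/2 * 10^(i+1))."""
    return math.pi * (1 + sum((1 - y) // 2 * 10 ** (i + 1)
                              for i, y in enumerate(labels)))

for labels in itertools.product([-1, 1], repeat=n):
    w = omega_for(labels)
    preds = tuple(1 if math.sin(w * x) > 0 else -1 for x in xs)
    assert preds == labels
print(f"all 2^{n} = {2 ** n} labelings realized by a single parameter w")
```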
Applying the definition of the VC dimension
Researchers concluded from their analysis that the condition for the empirical risk minimization learning process to be consistent is that the VC dimension of the function set be finite; this also gives the fastest rate of convergence.
My personal understanding: if the VC dimension is infinite, the function set can shatter a set containing any number of elements. A function set must be very complex to satisfy this condition, and when a function is too complex its generalization ability declines, the confidence risk rises, and convergence slows down.
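This intuition is easy to reproduce experimentally. In the sketch below (my own illustration using scikit-learn; the dataset and gamma values are arbitrary), an RBF-kernel SVM with an extreme gamma behaves like a very high-capacity function class: it drives the training error toward zero while the test accuracy typically drops.

```python
# Complexity vs. generalization: an overly flexible RBF-SVM memorizes the
# training set (low empirical risk) but generalizes worse (high confidence risk).
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for gamma in (0.5, 1000.0):  # moderate vs. extreme flexibility
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_train, y_train)
    print(f"gamma={gamma}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```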
Tags: SVM, empirical risk minimization, VC dimension