SVM, empirical risk minimization, VC dimension

Source: Internet
Author: User
Tags: svm

Original: http://blog.csdn.net/keith0812/article/details/8901113

The support vector machine (SVM) method is built on the VC dimension theory of statistical learning theory and on the principle of structural risk minimization.

Structural risk

Structural risk = empirical risk + confidence risk

Empirical risk = the error of the classifier on the given training samples

Confidence risk = the error of the classifier on unknown samples, which cannot be measured exactly and can only be bounded

Factors that affect the confidence risk:

The number of samples: the more samples there are, the more likely the learning result is correct, and the smaller the confidence risk.
The VC dimension of the classification function: the larger the VC dimension, the worse the generalization ability, and the larger the confidence risk.

So increasing the number of samples and reducing the VC dimension both reduce the confidence risk.
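
The two factors above can be checked numerically. The sketch below assumes one common form of the confidence term, taken from Vapnik's generalization bound: with probability 1 - eta, risk <= empirical risk + sqrt((h (ln(2n/h) + 1) - ln(eta/4)) / n), where n is the number of samples and h is the VC dimension. This formula is not given in the text above, and the function name vc_confidence and the default eta = 0.05 are only illustrative.

```python
import math


def vc_confidence(n, h, eta=0.05):
    """Confidence term of Vapnik's bound (assumed form, see lead-in).

    n   -- number of training samples (should be larger than h)
    h   -- VC dimension of the function set
    eta -- the bound holds with probability 1 - eta
    """
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


# More samples with the same VC dimension: the term shrinks.
few_samples = vc_confidence(n=100, h=10)
many_samples = vc_confidence(n=1000, h=10)

# Larger VC dimension with the same number of samples: the term grows.
low_vc = vc_confidence(n=1000, h=10)
high_vc = vc_confidence(n=1000, h=50)

print(few_samples, many_samples)
print(low_vc, high_vc)
```

Under this assumed bound, the printed values move in the directions described above: the term falls as n grows and rises as h grows.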

Previously, the goal of machine learning was only to reduce the empirical risk. Reducing the empirical risk requires a more complex classification function, which raises the VC dimension. A high VC dimension means a high confidence risk, so the structural risk ends up high as well. This is where SVM has an advantage over other machine learning methods.

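SVM keeps the VC dimension under control by choosing the separating hyperplane with the largest margin, and the introduction of the kernel function lets it do so even when the samples are not linearly separable in the original input space.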

The material above comes from other people's blogs that I read while learning SVM. At the time I did not really understand the VC dimension, and it stayed foggy no matter how many times I read about it. In later study I found that this concept comes up often, and that without it many algorithms cannot be understood properly. So today I worked up the courage to study the concept of VC dimension again, and my notes are as follows:

Example: a linear binary classifier in the plane can shatter a set of three elements (three points that are not on one line), but no set of four, so the VC dimension of the linear binary classifier is 3.

In general: if a function set can shatter a set of h elements, and h is the largest size for which this is possible, the VC dimension of the function set is h.

This may still leave the idea of shattering unclear, so take the binary classifier as an example again.

Suppose there is a set of three elements. Each element can be assigned to either of two classes, so there are 2^3 = 8 different ways to split the set into two classes.

A linear binary classifier can realize all 8 of these splits, so the VC dimension of the linear binary classifier is 3.

Likewise, for a set with h elements, if a function set can realize all 2^h splits of it, and cannot do so for any larger set, we say the VC dimension of this function set is h.

If, for any number of samples, the function set contains functions that can shatter them, the VC dimension of the function set is infinite. That is, the function set can shatter a set containing any number of elements.
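
The three-element example can be checked by brute force. The following is a minimal sketch, not part of the original post: it enumerates every labeling of a point set and uses a simple perceptron to look for a separating line. The function names, the chosen points, and the epoch limit are all illustrative assumptions. A perceptron that fails to converge within the limit is treated as "not separable", which is a heuristic rather than a proof.

```python
from itertools import product


def linearly_separable(points, labels, max_epochs=1000):
    """Return True if the perceptron finds a line separating the labels.

    Hitting max_epochs without a mistake-free pass is reported as False.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            # A point is misclassified if it is on the wrong side or on the line.
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:
            return True
    return False


def is_shattered(points):
    """True if every one of the 2^n labelings of the points is separable."""
    return all(
        linearly_separable(points, labels)
        for labels in product([-1, 1], repeat=len(points))
    )


three_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
four_points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]

print(is_shattered(three_points))  # all 8 labelings are separable
print(is_shattered(four_points))   # the XOR-style labeling is not
```

The four-point run only shows that this particular set is not shattered. That no set of four points in the plane can be shattered by a line is a known result which the sketch does not prove.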

Application of the VC dimension

Researchers have shown by analysis that the necessary and sufficient condition for the empirical risk minimization learning process to be consistent is that the VC dimension of the function set is finite, and in that case the convergence is also fast.

My own understanding: if the VC dimension is infinite, the function set can shatter a set containing any number of elements. A function set must be very complex to satisfy this condition, and when the functions are too complex their generalization ability drops, the confidence risk rises, and convergence slows down.
