Hoeffding's Inequality and real machine learning


1. Hoeffding's Inequality

Imagine a jar containing many small balls of two colors, {orange, green}. Randomly grab N balls from the jar. Let the fraction of orange balls in the jar be μ (unknown), and the fraction of orange balls in the sample be ν (known). According to Hoeffding's inequality from probability theory,

P[|ν − μ| > ε] ≤ 2·exp(−2ε^2·N),

so if N is large enough, ν is very likely to be close to μ.
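
As a quick sanity check on this statement, here is a minimal simulation sketch; the jar fraction μ = 0.6, sample size N = 1000, tolerance ε = 0.05 and trial count are assumed values chosen for illustration, not taken from the text. It draws repeated samples and compares the observed frequency of large deviations with the Hoeffding bound.

```python
import math
import random

# Hypothetical setup: a jar whose true fraction of orange balls is mu = 0.6.
mu = 0.6        # unknown in practice; fixed here so we can check the bound
N = 1000        # number of balls drawn per sample
eps = 0.05      # tolerated deviation |nu - mu|
trials = 10000  # number of repeated experiments

# Empirical probability that the sample fraction nu deviates from mu by more than eps.
bad = 0
for _ in range(trials):
    nu = sum(random.random() < mu for _ in range(N)) / N
    if abs(nu - mu) > eps:
        bad += 1

hoeffding_bound = 2 * math.exp(-2 * eps**2 * N)
print(f"empirical P[|nu - mu| > eps] = {bad / trials:.4f}")
print(f"Hoeffding bound              = {hoeffding_bound:.4f}")
```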

Similarly, in machine learning: if N is large enough, the frequency of [h(x) ≠ f(x)] on the dataset D can be used to infer the frequency of [h(x) ≠ f(x)] on the whole input space χ. That is, if the sample is large enough, the error rate of a candidate function h on D is close to its error rate on χ. Write the error rate of a candidate function h on D as E_in(h), and its error rate on the entire input set as E_out(h); Hoeffding's inequality then gives

P[|E_in(h) − E_out(h)| > ε] ≤ 2·exp(−2ε^2·N)

By means of the above formula, the correctness of a candidate function h can be gauged from its performance on D, and the best h can finally be selected from the candidate set H as g, with g ≈ f.
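
A toy sketch of this selection step follows; the target f, the threshold hypotheses in H, and the dataset size are all made-up assumptions, not part of the original text. It simply computes E_in(h) for every candidate and keeps the minimizer as g.

```python
import random

def f(x):               # unknown target function (known here only for the toy example)
    return 1 if x > 0.5 else -1

# Candidate set H: threshold classifiers at a few assumed thresholds.
H = [lambda x, t=t: (1 if x > t else -1) for t in (0.2, 0.4, 0.5, 0.7)]

# Dataset D: N labelled points drawn from the input space.
N = 200
D = [(x, f(x)) for x in (random.random() for _ in range(N))]

def e_in(h):
    """E_in(h): fraction of examples in D that h gets wrong."""
    return sum(h(x) != y for x, y in D) / len(D)

g = min(H, key=e_in)    # pick the best-performing candidate as g
print("E_in of each h:", [round(e_in(h), 3) for h in H])
print("chosen g has E_in =", round(e_in(g), 3))
```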

2. Real machine learning

To give an example: if 150 people each toss a fair coin 5 times, the probability that at least one person gets heads on all 5 tosses is 1 − (31/32)^150 ≈ 99.15%. So if a small-probability event is repeated many times, the probability that it occurs at least once becomes very large.
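
The arithmetic can be checked directly:

```python
from fractions import Fraction

# Probability that a single person does NOT get 5 heads in 5 tosses: 31/32.
p_not_all_heads = Fraction(31, 32)

# Probability that at least one of 150 people gets 5 heads in a row.
p_at_least_one = 1 - p_not_all_heads ** 150
print(float(p_at_least_one))   # ≈ 0.9915
```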

Similarly, the following scenario is possible: learning algorithm A diligently searches the candidate set H (which contains many h), suddenly finds some h_i that makes no mistakes on D, or only a few, and happily shouts: "I have found g, it is this h_i!" But in fact this h_i makes many mistakes on χ, i.e. E_in(h_i) and E_out(h_i) are far apart. For this h_i, D is a bad sample. If several samples D_i, {i = 1, 2, 3, ...} are drawn, then for some h a few of these samples will be bad samples: E_out(h) is large (h is far from f), but E_in(h) is small (h is correct on most examples in the sample).
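
The coin-tossing analogy can be turned into a small simulation; the numbers M, N and the "random guesser" hypotheses below are assumptions for illustration only. Every hypothesis here has true error E_out = 0.5, yet with enough of them, some look perfect on a small D purely by luck.

```python
import random

random.seed(0)
M = 10000   # number of candidate hypotheses the algorithm can look at
N = 10      # size of the dataset D

# Labels of D (any fixed labelling works for this demonstration).
y = [random.choice([-1, 1]) for _ in range(N)]

# Each hypothesis is a pure random guesser, so its true error E_out = 0.5.
best_e_in = 1.0
lucky = 0
for _ in range(M):
    predictions = [random.choice([-1, 1]) for _ in range(N)]
    e_in = sum(p != t for p, t in zip(predictions, y)) / N
    best_e_in = min(best_e_in, e_in)
    if e_in == 0.0:
        lucky += 1

print(f"hypotheses with E_in = 0 on D: {lucky} (each has E_out = 0.5)")
print(f"smallest E_in found: {best_e_in}")
```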

For an arbitrary sample D and a given H:

D is bad data for H ⇔ no "freedom of choice" for A ⇔ there exists some h such that E_out(h) and E_in(h) are far apart.

The following four propositions are equivalent over the entire candidate set H (with M elements):

- D is a bad sample for H
- D is a bad sample for some h in H
- learning algorithm A cannot choose freely among the h in H
- there exists some h in H such that E_in(h) is far from E_out(h)

According to a table marking which samples D_1, D_2, ..., D_1126, ... are bad for which h, a training data set such as D_1126 that is a bad sample for no h in H is relatively high-quality.

Given any D, the probability that it is a bad sample for some h in H can be bounded by the union bound:

P[D is bad for H] ≤ P[D is bad for h_1] + P[D is bad for h_2] + ... + P[D is bad for h_M] ≤ 2·M·exp(−2ε^2·N)

The fewer the candidate functions in H (the smaller M) and the larger the sample size N, the smaller the probability that the sample is a bad sample. At an acceptable probability level, learning algorithm A then only needs to pick the best-performing h on D as g. Note that this argument requires the number of h in H, namely M, to be finite.
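
A small helper sketch of this bound follows; the function names and the example numbers (M = 100, ε = 0.1, δ = 0.05) are assumptions for illustration. It evaluates the union bound 2·M·exp(−2ε^2·N) and inverts it to find how large N must be for the bad-sample probability to stay below a chosen δ.

```python
import math

def bad_sample_bound(M, N, eps):
    """Union-bound probability that D is a bad sample for some h in H:
    P <= 2 * M * exp(-2 * eps**2 * N)."""
    return 2 * M * math.exp(-2 * eps**2 * N)

def samples_needed(M, eps, delta):
    """Smallest N making the bound above at most delta (solving the bound for N)."""
    return math.ceil(math.log(2 * M / delta) / (2 * eps**2))

# Assumed numbers for illustration: M = 100 hypotheses, tolerance eps = 0.1,
# and we want the bad-sample probability to be at most delta = 0.05.
print(bad_sample_bound(M=100, N=1000, eps=0.1))
print(samples_needed(M=100, eps=0.1, delta=0.05))
```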
