Machine Learning Feasibility Analysis

Source: Internet
Author: User

Machine Learning Feasibility Analysis (1)

1. No Free Lunch

Machine learning is not all-powerful. It works by learning from a sample D and inferring what happens in cases outside D. If any unknown target f is possible, learning from D is doomed to fail. My understanding is that many different functions agree with f on the sample D but differ from it outside D, so machine learning cannot guarantee that the learned hypothesis g is close to f outside the sample; in that sense learning is not possible. The reference courseware gives three examples.

English Description:

Learning from D (to infer something outside D) is doomed if any 'unknown' f can happen.
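The argument can be checked by brute force on a tiny input space. The sketch below is illustrative only: the three binary features and the split into 5 seen and 3 unseen inputs are assumptions, not values taken from the courseware. It enumerates every target f that agrees with D and shows that any prediction g outside D makes the same number of mistakes on average.

```python
from itertools import product

# All 8 possible inputs with three binary features.
inputs = list(product([0, 1], repeat=3))

# Hypothetical sample D: the first 5 inputs are seen in training.
# Their labels are the same for every consistent target, so they drop out below.
seen, unseen = inputs[:5], inputs[5:]

# Each target f consistent with D corresponds to one labeling of the unseen inputs.
consistent_targets = list(product([0, 1], repeat=len(unseen)))

def average_mistakes(g_outside):
    """Average number of mistakes of g on the unseen inputs, over all consistent f."""
    total = sum(
        sum(g != f for g, f in zip(g_outside, f_outside))
        for f_outside in consistent_targets
    )
    return total / len(consistent_targets)

# Try every possible prediction g on the unseen inputs.
results = {g: average_mistakes(g) for g in consistent_targets}
print(results)
```

Every entry of `results` has the same value, so no choice of g is better than another outside D unless something more is assumed about f.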

2. How to estimate the fraction of orange marbles in a large jar?

This is a probability and statistics problem: compute a statistic on a sample and use it to estimate the whole population.

The fraction of orange marbles in the big jar is not computed directly. Instead, we draw a random sample of marbles and compute the fraction of orange marbles in the sample.

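Let $\nu$ be the orange fraction observed in the sample, $\mu$ the true fraction in the whole jar, N the sample size, and $\epsilon$ the error tolerance. Hoeffding's inequality relates them:

$$P[|\nu - \mu| > \epsilon] \le 2\exp(-2\epsilon^2 N)$$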
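The jar experiment is easy to simulate. The following is a minimal sketch, with illustrative values for the true fraction, sample size and tolerance (none of them come from the courseware). It repeats the sampling many times and compares how often the sample fraction misses the true fraction by more than the tolerance with the Hoeffding bound 2 exp(-2 ε² N).

```python
import math
import random

def sample_fraction(mu, n, rng):
    """Draw n marbles from a jar with true orange fraction mu; return the sample fraction."""
    return sum(rng.random() < mu for _ in range(n)) / n

def bad_sample_rate(mu, n, eps, trials, seed=0):
    """Fraction of repeated experiments in which |nu - mu| > eps."""
    rng = random.Random(seed)
    bad = sum(abs(sample_fraction(mu, n, rng) - mu) > eps for _ in range(trials))
    return bad / trials

def hoeffding_bound(n, eps):
    """Hoeffding upper bound on P[|nu - mu| > eps]."""
    return 2 * math.exp(-2 * eps ** 2 * n)

mu, n, eps = 0.4, 100, 0.1  # illustrative values (assumptions)
observed = bad_sample_rate(mu, n, eps, trials=2000)
bound = hoeffding_bound(n, eps)
print(f"observed bad-sample rate: {observed:.4f}, Hoeffding bound: {bound:.4f}")
```

The bound does not depend on the true fraction, which is why it can be used even though that fraction is unknown.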

3. The relationship between the jar model and machine learning

Machine learning is very similar to estimating the fraction of orange marbles from a sample, as described above.

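For a fixed hypothesis h, the error rate on the sample D (N records), $E_{in}(h)$, and the error rate outside the sample, $E_{out}(h)$, also satisfy Hoeffding's inequality:

$$P[|E_{in}(h) - E_{out}(h)| > \epsilon] \le 2\exp(-2\epsilon^2 N)$$

In other words, when N is large enough, $E_{in}(h)$ is probably close to $E_{out}(h)$.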

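In real machine learning, the algorithm chooses among many hypotheses h, and the probability that the sample is bad for at least one of them grows with the number of hypotheses. Suppose there are M hypotheses; by the union bound:

$$P[\text{bad } D] \le 2M\exp(-2\epsilon^2 N)$$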


Therefore, if the hypothesis space is finite (size M) and the sample size N is large enough, the probability of drawing a bad sample is very small. Learning is feasible in this case.
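To see what "large enough" means, the finite-hypothesis bound 2 M exp(-2 ε² N) can be solved for N. The sketch below does this; the helper names and the values M = 100, ε = 0.1 and δ = 0.05 (the target probability of a bad sample) are assumptions chosen for illustration.

```python
import math

def union_bound(m, n, eps):
    """Upper bound 2 * M * exp(-2 * eps^2 * N) on the probability of a bad sample."""
    return 2 * m * math.exp(-2 * eps ** 2 * n)

def samples_needed(m, eps, delta):
    """Smallest N for which the union bound is at most delta."""
    return math.ceil(math.log(2 * m / delta) / (2 * eps ** 2))

# Illustrative values (assumptions): M = 100 hypotheses, eps = 0.1, delta = 0.05.
n = samples_needed(100, 0.1, 0.05)
print(n, union_bound(100, n, 0.1))
```

The required N grows only logarithmically with M, but no finite N is enough once M is infinite.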

The next part discusses the case where the hypothesis space is infinite.

Machine Learning Feasibility Analysis (2)

1. Review of important formulas

When the hypothesis space H is finite with size M and the sample size N is large enough, then for whatever g the algorithm picks, the training error rate $E_{in}(g)$ is probably close to the out-of-sample error rate $E_{out}(g)$:

$$P[|E_{in}(g) - E_{out}(g)| > \epsilon] \le 2M\exp(-2\epsilon^2 N)$$

If a suitable learning algorithm A also makes the training error rate $E_{in}(g)$ close to zero, then the g trained on the sample also has a small error rate outside the sample, and learning is feasible.

2. The size M of the hypothesis space H

According to the formula above, the bound becomes vacuous when M is infinite, so it no longer guarantees that learning works.

The main reason is that the factor M comes from the union bound, which makes the upper bound too loose. In fact, the bad events of different hypotheses overlap heavily, so M can be replaced by a much smaller quantity.

3. Growth function: describes how the effective number of hypotheses $m_H(N)$ depends on the sample size N

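Positive rays: $m_H(N) = N + 1$

Positive intervals: $m_H(N) = \binom{N+1}{2} + 1 = \frac{1}{2}N^2 + \frac{1}{2}N + 1$

Convex sets: $m_H(N) = 2^N$

2D perceptrons: $m_H(N) < 2^N$ for $N > 3$ (for example, $m_H(4) = 14$)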
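For the two one-dimensional cases, the growth function can be checked by enumerating every distinct labeling (dichotomy) of N points. The sketch below is a brute-force check with hypothetical helper names. It compares the counts against the known closed forms N + 1 for positive rays and N(N + 1)/2 + 1 for positive intervals.

```python
def cut_points(xs):
    """One cut below all points, one between each adjacent pair, one above all."""
    return [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]

def positive_ray_dichotomies(points):
    """Distinct labelings of 1-D points by h(x) = +1 if x > a else -1."""
    xs = sorted(points)
    return {tuple(1 if x > a else -1 for x in xs) for a in cut_points(xs)}

def positive_interval_dichotomies(points):
    """Distinct labelings of 1-D points by h(x) = +1 if lo < x < hi else -1."""
    xs = sorted(points)
    cuts = cut_points(xs)
    return {
        tuple(1 if lo < x < hi else -1 for x in xs)
        for i, lo in enumerate(cuts)
        for hi in cuts[i:]
    }

for n in range(1, 7):
    pts = list(range(n))
    rays = len(positive_ray_dichotomies(pts))
    intervals = len(positive_interval_dichotomies(pts))
    print(n, rays, intervals, 2 ** n)
```

Both counts fall below 2^N quickly, which is what makes them useful replacements for M.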

4. Break point: for a hypothesis space H, if $m_H(k) < 2^k$, then k is a break point of H; the smallest such k is called the minimum break point

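For a hypothesis space H that has a break point k, the growth function is bounded by a polynomial in N:

$$m_H(N) \le \sum_{i=0}^{k-1} \binom{N}{i} = O(N^{k-1})$$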
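The gap between a polynomial bound and 2^N is easy to see numerically. The sketch below evaluates the standard bound, the sum of C(N, i) for i from 0 to k - 1, for a break point k. The function name is hypothetical, and k = 4 is used because it is the known break point of 2D perceptrons.

```python
from math import comb

def break_point_bound(n, k):
    """Upper bound sum_{i=0}^{k-1} C(n, i) on the growth function for break point k."""
    return sum(comb(n, i) for i in range(k))

# Compare the polynomial bound with 2^N for a few sample sizes.
for n in (4, 10, 20):
    print(n, break_point_bound(n, 4), 2 ** n)
```

For positive rays (break point 2) the bound gives 1 + N, which matches the growth function exactly.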

5. VC bound

The break point gives a polynomial upper bound on the growth function, and we would like to use it in place of M in the earlier inequality. Direct substitution does not work, however; the correct replacement requires a more involved rigorous proof, and the result is the well-known VC bound:

$$P[\exists h \in H \text{ s.t. } |E_{in}(h) - E_{out}(h)| > \epsilon] \le 4\, m_H(2N) \exp\left(-\tfrac{1}{8}\epsilon^2 N\right)$$

At the end of the courseware, an example is given: for positive rays, with $m_H(N) = N + 1$, $\epsilon = 0.1$ and $N = 10000$, the VC bound on bad events evaluates to about 0.298.
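The value 0.298 can be reproduced from the VC bound 4 m_H(2N) exp(-ε² N / 8). The sketch below assumes the example uses positive rays, whose growth function is N + 1; that choice is an assumption made here because it reproduces the stated result.

```python
import math

def vc_bound(growth, n, eps):
    """VC bound 4 * m_H(2N) * exp(-eps^2 * N / 8) on the probability of a bad sample."""
    return 4 * growth(2 * n) * math.exp(-eps ** 2 * n / 8)

def positive_rays(n):
    """Growth function of positive rays: m_H(N) = N + 1 (assumed hypothesis set)."""
    return n + 1

value = vc_bound(positive_rays, 10000, 0.1)
print(f"{value:.3f}")
```

Even with 10000 samples and a very simple hypothesis set, the bound is still close to 0.3, which shows how loose the VC bound is in practice.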
