Bias vs. Variance (3)---Using learning curves to judge bias/variance problems


Learning curves can be used to check whether a learning algorithm is working properly and to guide improvements. We often plot learning curves to determine whether an algorithm has a bias problem, a variance problem, or both.

Learning Curves

A learning curve is a plot of J_train(θ) and J_CV(θ) against the training set size m. Assume we fit the training data with a second-order (quadratic) hypothesis.

When the training set contains only one example we can fit it perfectly, so J_train(θ) = 0; with two examples we can still fit perfectly, so J_train(θ) = 0 again. As the training set size grows, J_train(θ) increases.

When the training set is very small, the hypothesis generalizes poorly (it adapts badly to new samples), so J_CV(θ) is very large. As the training set grows, generalization improves and J_CV(θ) decreases.
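As a concrete illustration, the sketch below computes such curves for a polynomial hypothesis in Python. The helper names (poly_features, squared_error, learning_curve), the data shapes, and the unregularized least-squares fit are illustrative assumptions, not code from the original material.

```python
# Minimal sketch of computing learning curves for a polynomial hypothesis.
# All names and the degree-2 default are illustrative assumptions.
import numpy as np

def poly_features(x, degree):
    """Map a 1-D input to [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

def squared_error(theta, X, y):
    """J(theta) = (1/2m) * sum((X @ theta - y)^2), without regularization."""
    m = len(y)
    return np.sum((X @ theta - y) ** 2) / (2 * m)

def learning_curve(x_train, y_train, x_cv, y_cv, degree=2):
    """Return J_train and J_CV for training set sizes 1..m."""
    j_train, j_cv = [], []
    X_cv = poly_features(x_cv, degree)
    for m in range(1, len(y_train) + 1):
        X_m = poly_features(x_train[:m], degree)
        # Fit theta by least squares on the first m training examples only.
        theta, *_ = np.linalg.lstsq(X_m, y_train[:m], rcond=None)
        j_train.append(squared_error(theta, X_m, y_train[:m]))
        # The cross-validation error is always measured on the full CV set.
        j_cv.append(squared_error(theta, X_cv, y_cv))
    return np.array(j_train), np.array(j_cv)
```

Plotting the two returned arrays against m gives the learning curves discussed in the rest of this article.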

Learning curves with high bias--adding training data is useless

Suppose the hypothesis suffers from high bias, for example fitting the data with a straight line. With 5 sample points we get a certain line; increasing to 10 sample points gives essentially the same line. Adding more samples does not let the straight line model the data any better, so for a high-bias algorithm, adding training data is of little use.

In the high-bias case, when there are only a few sample points J_train(θ) is small; as the number of samples increases, the hypothesis cannot fit them all (it underfits), so J_train(θ) grows.

In the high-bias case, J_CV(θ) starts out very large (with so few samples the hypothesis generalizes poorly); as the number of samples increases, J_CV(θ) decreases, but it soon flattens out at a relatively high value, because more data no longer changes the hypothesis much.

In the high-bias case, J_train(θ) and J_CV(θ) therefore converge to a similar, high error value as the training set grows.
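This behaviour can be reproduced with the learning_curve helper from the earlier sketch, applied to an intentionally too-simple straight-line (degree 1) hypothesis on synthetic quadratic data; the dataset sizes and noise level are arbitrary illustrative choices.

```python
# Illustrative high-bias setup: fit a straight line (degree 1) to data that
# actually follows a quadratic, reusing learning_curve() from the sketch above.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 120)
y = x ** 2 + rng.normal(0.0, 0.5, 120)        # true relationship is quadratic
x_train, y_train = x[:80], y[:80]
x_cv, y_cv = x[80:], y[80:]

j_train, j_cv = learning_curve(x_train, y_train, x_cv, y_cv, degree=1)
# Expected pattern: for large m both errors are high and close together,
# so adding more training data would not help.
print(j_train[-1], j_cv[-1])
```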

Learning curves with high variance--adding training data is helpful

Now suppose the algorithm has high variance, for example a hypothesis using a very high-order polynomial (say, terms up to x^100) with a very small regularization parameter λ; such a hypothesis overfits.

With only 5 training examples the hypothesis can fit them all, so when the training set size is small J_train(θ) is small. As the training set grows, the hypothesis can no longer pass through every point, so J_train(θ) rises, but it remains relatively small.

With only 5 training examples the model overfits, and J_CV(θ) is very large. As the number of samples increases, generalization improves and J_CV(θ) decreases, but a noticeable gap remains between the two curves (J_CV(θ) >> J_train(θ), the signature of overfitting). If we keep enlarging the training set size m, J_train(θ) rises and J_CV(θ) keeps falling, so the curves approach each other. Therefore, adding training data is helpful.
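To see the opposite pattern, the same helper can be run with a deliberately over-flexible hypothesis, here a degree-10 polynomial with no regularization on the same synthetic data; again this is only a sketch under those assumptions.

```python
# Illustrative high-variance setup: a degree-10 polynomial with no
# regularization, again reusing learning_curve() and the data defined above.
j_train, j_cv = learning_curve(x_train, y_train, x_cv, y_cv, degree=10)
# Expected pattern: J_train stays small while J_CV is much larger, and the
# gap shrinks as m grows, so adding more training data is likely to help.
print(j_train[-1], j_cv[-1])
```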

The learning curves in both cases are idealized; real curves can look somewhat different (they may be noisy and irregular), but the basic shape is usually similar enough to tell whether the learning algorithm suffers from high bias, high variance, or both. So when we want to improve the performance of a learning algorithm, we usually plot its learning curves first, which lets us see whether the problem is bias or variance.
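If you want to automate this reading of the curves, a rough heuristic like the one below can compare the final training and cross-validation errors; the thresholds are arbitrary illustrative values and would need to be tuned to the scale of your cost function.

```python
# Rough diagnostic heuristic based on the final points of the learning curves.
# The 'high' and 'gap' thresholds are arbitrary illustrative values.
def diagnose(j_train_final, j_cv_final, high=1.0, gap=0.5):
    if j_train_final > high and (j_cv_final - j_train_final) < gap:
        return "high bias: both errors high and close; more data unlikely to help"
    if j_train_final < high and (j_cv_final - j_train_final) > gap:
        return "high variance: large gap; more data or more regularization may help"
    return "no clear bias/variance problem from the curves alone"

print(diagnose(j_train[-1], j_cv[-1]))
```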

