Coursera Machine Learning Course Notes: Hazard of Overfitting

Source: Internet
Author: User

This section is about overfitting. After listening to it, my understanding of overfitting is deeper than before.

The lecture first introduces overfitting: its consequence is that Ein is very small while Eout is very large. The causes of overfitting are then analyzed one by one.

The first discussion is about overfitting with respect to model complexity. The more complex the model, the more likely it is to overfit. The reason given for this result is that the number of samples is too small. This is the case when noise is present.

So what if there is no noise?

The lecture uses two images to show the effect of each parameter.

They show that both noise and model complexity have an effect on overfitting. Pay particular attention to the lower-left area of the right image. When the target is more complex than the hypothesis set can represent, the part that cannot be captured acts like noise. This is called deterministic noise (harder to understand), in contrast to stochastic noise.

To sum up, there are four causes of overfitting: the data size N is small, the stochastic noise is high, the deterministic noise is high, and the VC dimension is excessive.
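The four causes can be tried out numerically. The following is a minimal sketch, not the lecture's own experiment: the targets `sin(pi x)` and `sin(5 pi x)`, the noise level 0.5, N = 15, and the function name `average_errors` are all assumptions chosen for illustration. It fits polynomials of degree 2 and degree 10 by least squares and averages Ein and Eout over many random data sets, once with stochastic noise and once with a noiseless target that is too complex for either model (deterministic noise only).

```python
import numpy as np

def average_errors(target, sigma, n_train=15, degrees=(2, 10), trials=200, seed=0):
    """Average Ein and Eout of least-squares polynomial fits over random data sets.

    Eout is estimated as the mean squared error against the noiseless
    target on a dense grid over [-1, 1] (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 1001)
    y_test = target(x_test)
    totals = {d: [0.0, 0.0] for d in degrees}
    for _ in range(trials):
        # Draw a small training set, optionally with Gaussian (stochastic) noise.
        x = rng.uniform(-1.0, 1.0, n_train)
        y = target(x) + sigma * rng.normal(size=n_train)
        for d in degrees:
            fit = np.polynomial.Polynomial.fit(x, y, d)
            totals[d][0] += np.mean((fit(x) - y) ** 2) / trials            # Ein
            totals[d][1] += np.mean((fit(x_test) - y_test) ** 2) / trials  # Eout
    return {d: tuple(v) for d, v in totals.items()}

# Case 1: simple target plus stochastic noise.
noisy = average_errors(lambda x: np.sin(np.pi * x), sigma=0.5)
# Case 2: no stochastic noise, but a target too complex for both models.
noiseless = average_errors(lambda x: np.sin(5 * np.pi * x), sigma=0.0)

for name, result in (("stochastic noise", noisy), ("deterministic noise", noiseless)):
    for degree, (e_in, e_out) in result.items():
        print(f"{name:20s} degree {degree:2d}: Ein = {e_in:.3f}, Eout = {e_out:.3f}")
```

Because the degree-10 hypothesis set contains the degree-2 one, its Ein can never be higher on the same data. What matters is how far Eout moves in the other direction when N is small, with or without stochastic noise. Raising `n_train` is a simple way to see the effect of the first cause.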

Finally, the lecture proposes methods for dealing with overfitting, including data cleaning/pruning, data hinting, regularization, and validation, and uses driving as an analogy to illustrate the role of these methods. The last two methods are also the topics of the next two lessons.

Data cleaning/pruning means correcting or deleting the wrong sample points. The processing itself is simple, but such sample points are usually not easy to find.
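One possible way to look for such points, shown here only as a hedged sketch, is to flag examples whose label disagrees with the majority of their nearest neighbors. This heuristic is an assumption of this note, not a method stated in the lecture. The function name `suspicious_points`, the choice k = 5, and the toy data are hypothetical, and labels are assumed to be +1 or -1.

```python
import numpy as np

def suspicious_points(X, y, k=5):
    """Return indices whose label disagrees with the majority of their k nearest neighbors.

    Assumes labels are +1 / -1 and k is odd, so the vote is never tied.
    """
    # Pairwise squared Euclidean distances between all points.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    votes = y[neighbors].sum(axis=1)         # sign of the sum is the majority label
    return np.where(np.sign(votes) == -y)[0]

# Toy data: two well-separated clusters, with one label flipped on purpose.
cluster = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8]])
X = np.vstack([cluster, cluster + 10.0])
y = np.array([-1] * 6 + [1] * 6)
y[0] = 1                                     # simulated labeling error

flagged = suspicious_points(X, y)
print(flagged)
```

A flagged point can then be corrected (cleaning) or deleted (pruning). On real data the clusters overlap, so a flag is only a hint that a point deserves a closer look.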

Data hinting increases the number of samples by generating virtual samples.
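As an illustration, virtual samples for small images can be produced by shifting each image by one pixel and keeping its label. This is a minimal sketch under the assumption that the label does not change under such a shift. The helper names `shift_image` and `add_virtual_examples` and the toy 4x4 images are hypothetical.

```python
import numpy as np

def shift_image(image, dy, dx):
    """Shift a 2-D image by (dy, dx) pixels, filling vacated pixels with zeros."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    # Part of the original image that stays inside the frame after the shift.
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    shifted[top:top + src.shape[0], left:left + src.shape[1]] = src
    return shifted

def add_virtual_examples(images, labels, shifts=((1, 0), (-1, 0), (0, 1), (0, -1))):
    """Return the original examples followed by one shifted copy per entry in shifts."""
    batches = [images] + [
        np.stack([shift_image(im, dy, dx) for im in images]) for dy, dx in shifts
    ]
    return np.concatenate(batches), np.concatenate([labels] * len(batches))

# Two toy 4x4 "images", each with a single bright pixel.
images = np.zeros((2, 4, 4), dtype=int)
images[0, 1, 1] = 1
images[1, 2, 2] = 1
labels = np.array([3, 7])

aug_images, aug_labels = add_virtual_examples(images, labels)
print(aug_images.shape, aug_labels)
```

The virtual samples are not independent draws from the original distribution, so they should stay close to what real data could look like.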
