Machine Learning: About Overfitting

Source: Internet
Author: User

Machine learning is the process of selecting the optimal model from a model space. The optimal model is one that fits the existing data set well and also predicts unseen data correctly.

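So how do we judge whether a model is good or bad? We use a cost function, which measures how far the model's predictions are from the true values. There are many cost functions. In Ng's video, linear regression uses the squared-error cost function:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Here h_\theta(x^{(i)}) is the model's prediction for the i-th of the m training samples and y^{(i)} is its true value.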

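Logistic regression uses the log-likelihood cost function:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

Here h_\theta(x) = 1 / (1 + e^{-\theta^T x}) is the predicted probability that y = 1.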

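For a given data set with m samples, the average loss over those samples is called the empirical risk.

The smaller the empirical risk, the better the model fits the training data. As the rest of this article shows, that alone does not guarantee good predictions on unknown data.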
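As a rough illustration of empirical risk, the sketch below computes both cost functions with NumPy. The function names, the toy data, and the clipping constant are assumptions made for this example; the factor 1/2 in the squared cost follows the convention used in Ng's course.

```python
import numpy as np

def squared_cost(theta, X, y):
    """Squared-error cost: (1 / 2m) * sum((X @ theta - y) ** 2)."""
    m = len(y)
    residuals = X @ theta - y
    return float(residuals @ residuals) / (2 * m)

def log_likelihood_cost(theta, X, y):
    """Negative average log-likelihood for labels y in {0, 1}."""
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # sigmoid hypothesis
    h = np.clip(h, 1e-12, 1 - 1e-12)        # avoid log(0); an assumption of this sketch
    return float(-(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m)

# Toy data (hypothetical): first column is the constant 1 for the intercept.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_reg = np.array([1.0, 2.0, 3.0])   # regression targets, exactly y = x
y_cls = np.array([0.0, 0.0, 1.0])   # binary labels

perfect = np.array([0.0, 1.0])      # reproduces y_reg exactly
worse = np.array([0.0, 0.5])        # underestimates every target

print("squared cost, perfect fit:", squared_cost(perfect, X, y_reg))
print("squared cost, worse fit:  ", squared_cost(worse, X, y_reg))
print("log cost at theta = 0:    ", log_likelihood_cost(np.zeros(2), X, y_cls))
```

The parameter vector that reproduces the targets exactly has zero cost, and the worse vector has a larger one, which is all that "smaller is better" means on the training set.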

Let's analyze the house price prediction problem. Assume the price depends on the area A, the floor L, and the number of rooms N. Our goal is to learn a model of A, L, and N that predicts the price. But which inputs should we use: A, A squared, or A cubed? (The same question applies to L and N.) There are infinitely many choices.

In theory, the higher the polynomial degree, the better the training data can be fitted. But the higher the degree, the more complex the model, and then the problem of overfitting appears. In the fourth small figure, for example, a 9th-degree polynomial is used: the model passes through every point, yet such a model typically predicts well on the known data and poorly on unknown data.

As the complexity of the model increases, the training error keeps getting smaller, so that most of the sample points can be fitted. The test error, however, first decreases and then increases as the complexity grows, so it has a minimum.
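The behaviour of the two errors can be tried out with a small experiment. The sketch below is only an illustration under assumed conditions: the sine-shaped target curve, the noise level, the sample sizes, and the random seed are all made up for the example and do not come from the house price problem.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    """Hypothetical ground-truth curve used only for this demonstration."""
    return np.sin(2 * np.pi * x)

# 10 noisy training points and a larger noisy test set drawn from the same curve.
x_train = np.linspace(0.0, 1.0, 10)
y_train = target(x_train) + rng.normal(0.0, 0.2, size=x_train.size)
x_test = rng.uniform(0.0, 1.0, size=200)
y_test = target(x_test) + rng.normal(0.0, 0.2, size=x_test.size)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train MSE = {errors[degree][0]:.4g}, "
          f"test MSE = {errors[degree][1]:.4g}")
```

With 10 training points, the 9th-degree polynomial has enough coefficients to pass through every point, so its training error is essentially zero. Its test error is usually noticeably worse than that of a moderate degree, although the exact numbers depend on the random noise.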

To solve the problem of overfitting, we introduce a "regularization term", which trades off the empirical risk against the complexity of the model. The problem then becomes minimizing the sum of the empirical risk and the regularization term.

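The regularization term can take different forms, such as the L2 norm or the L1 norm of the parameter vector. Writing the empirical risk as J(\theta) and the regularization term as \lambda R(\theta) with \lambda \ge 0, the problem is:

$$\min_{\theta} \; J(\theta) + \lambda R(\theta), \qquad R_{L2}(\theta) = \sum_{j} \theta_j^2, \qquad R_{L1}(\theta) = \sum_{j} |\theta_j|$$

Constant factors in front of the penalty vary between textbooks and courses.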
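For linear regression with an L2 penalty, the minimum can be written in closed form, which makes the effect of the regularization term easy to observe. The sketch below is a minimal illustration, assuming the objective (1/2m) * ||X theta - y||^2 + lambda * sum of theta_j^2 over the non-intercept coefficients. The function name `ridge_fit`, the sine-shaped toy data, and the chosen lambda values are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimise (1 / 2m) * ||X @ theta - y||^2 + lam * sum(theta[1:] ** 2).

    Setting the gradient to zero gives
    (X^T X + 2 * m * lam * D) theta = X^T y,
    where D is the identity matrix with a zero in the intercept position.
    """
    m, n = X.shape
    D = np.eye(n)
    D[0, 0] = 0.0  # leave the intercept unpenalised (a choice made in this sketch)
    return np.linalg.solve(X.T @ X + 2 * m * lam * D, X.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.size)  # toy data
X = np.vander(x, 10, increasing=True)  # columns 1, x, x^2, ..., x^9

lams = (1e-6, 1e-3, 1.0)
norms = []
for lam in lams:
    theta = ridge_fit(X, y, lam)
    train_mse = float(np.mean((X @ theta - y) ** 2))
    norms.append(float(np.linalg.norm(theta[1:])))
    print(f"lambda = {lam:g}: train MSE = {train_mse:.4g}, "
          f"||theta[1:]|| = {norms[-1]:.4g}")
```

A larger lambda shrinks the polynomial coefficients and gives up some training accuracy in exchange for a simpler model. In practice lambda is usually chosen by checking the error on held-out data.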
