"Machine Learning" Study Notes 2: Model Evaluation and Selection

Source: Internet
Author: User

1. Training error: The error of the learner on the training set, also known as "empirical error"

2. Generalization error: The error of the learner on new samples

Obviously, our goal is a learner that performs well on new samples, that is, one with a small generalization error.

3. Overfitting: The learner fits the training samples too well, which degrades its generalization performance (it learns too much; it reminds me of a bookworm who memorizes books by rote, is rigid, and cannot adapt or extrapolate).

Reason: Learning ability is too strong, so uncommon peculiarities of the training samples are learned as well

Measures: Overfitting is a key obstacle in machine learning and is hard to solve; it can be mitigated but not eliminated entirely

4. Underfitting: The learner fails to learn even the training set well, let alone generalize (a bit like the blind men and the elephant: it grasps only a fragment of the whole).

Reason: Learning ability is too low

Measures: Easier to overcome, for example by growing more branches in a decision tree or by increasing the number of training epochs in neural network learning.
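The gap between training error and generalization error can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy data (true label 1 when x >= 100, with about 10% of labels flipped as noise), the hypothetical "memorizer" learner that copies the label of the nearest training point (prone to overfitting), and the hypothetical "constant" learner that always predicts 0 (underfitting). The error on held-out samples only stands in for the generalization error.

```python
import random

rng = random.Random(0)

# Hypothetical toy data: true label is 1 when x >= 100; ~10% of labels are flipped.
data = []
for x in range(200):
    y = int(x >= 100)
    if rng.random() < 0.1:
        y = 1 - y  # label noise
    data.append((x, y))

rng.shuffle(data)
train, new_samples = data[:140], data[140:]


def error_rate(predict, samples):
    """Fraction of (x, y) samples on which predict(x) differs from y."""
    return sum(1 for x, y in samples if predict(x) != y) / len(samples)


def fit_memorizer(train):
    """Overfitting-prone learner: copy the label of the nearest training point."""
    return lambda x: min(train, key=lambda s: abs(s[0] - x))[1]


def fit_constant(train):
    """Underfitting learner: ignore x and always predict 0."""
    return lambda x: 0


for name, fit in [("memorizer", fit_memorizer), ("constant", fit_constant)]:
    predict = fit(train)
    print(name,
          "training error:", error_rate(predict, train),
          "error on new samples:", error_rate(predict, new_samples))
```

The memorizer reproduces every training label, noise included, so its training error is zero even though it has learned the noise; the constant learner cannot fit even the training set.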

Second, model evaluation methods

Different learning algorithms + different parameter configurations = different models

So how do you find the model with the least generalization error (which is our ideal model)?

1. Hold-out method

Divide the data set we have into two mutually exclusive sets. Use one as the training set to train the model and the other as the test set; the error measured on the test set serves as the estimate of the generalization error.
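The split described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the 70/30 ratio, the function names, the toy data, and the toy threshold learner are all assumptions that do not come from the notes.

```python
import random


def hold_out_split(samples, test_ratio=0.3, seed=0):
    """Shuffle the samples and split them into two mutually exclusive lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def fit_threshold(train):
    """Hypothetical toy learner: threshold halfway between the two class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    t = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x >= t)


def error_rate(predict, samples):
    """Fraction of (x, y) samples on which predict(x) differs from y."""
    return sum(1 for x, y in samples if predict(x) != y) / len(samples)


# Hypothetical toy data: label is 1 when x >= 50.
data = [(x, int(x >= 50)) for x in range(100)]
train, test = hold_out_split(data)
predict = fit_threshold(train)          # train only on the training set
print("test error:", error_rate(predict, test))  # estimate of generalization error
```

Shuffling before the split keeps both sets representative of the whole data set; a fixed seed makes the split reproducible.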

2. Cross-validation method

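Divide the data set D into k mutually exclusive subsets. Each time, use the union of k-1 subsets to train the model and the remaining subset to test it, so that k rounds of training and testing are carried out. A special case is the "leave-one-out" method, where k equals the number of samples.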
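A minimal sketch of this procedure, under stated assumptions: the function names, the toy data, and the toy threshold learner are hypothetical, and averaging the k test errors is the usual way to combine the rounds rather than something the notes spell out.

```python
import random


def k_fold_splits(samples, k, seed=0):
    """Yield (train, test) pairs; each of the k folds is the test set exactly once."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k mutually exclusive subsets
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]


def cross_val_error(fit, samples, k):
    """Average test error over the k rounds."""
    errors = []
    for train, test in k_fold_splits(samples, k):
        predict = fit(train)
        wrong = sum(1 for x, y in test if predict(x) != y)
        errors.append(wrong / len(test))
    return sum(errors) / k


def fit_threshold(train):
    """Hypothetical toy learner: threshold halfway between the two class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    t = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x >= t)


# Hypothetical toy data: label is 1 when x >= 10.
data = [(x, int(x >= 10)) for x in range(20)]
print("5-fold error:", cross_val_error(fit_threshold, data, 5))
# Leave-one-out is the special case k == len(data).
print("leave-one-out error:", cross_val_error(fit_threshold, data, len(data)))
```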

3. Bootstrap method
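The notes give no details for this third method, which is usually called the bootstrap in English. As commonly described, a training set of the same size as D is drawn from D with replacement, and the samples that were never drawn (the "out-of-bag" samples) are used for testing. The sketch below follows that common description; the function name and the toy data are assumptions for illustration.

```python
import random


def bootstrap_split(samples, seed=0):
    """Draw len(samples) items with replacement; test on the items never drawn."""
    rng = random.Random(seed)
    m = len(samples)
    picked = [rng.randrange(m) for _ in range(m)]  # indices, repeats allowed
    picked_set = set(picked)
    train = [samples[i] for i in picked]
    test = [samples[i] for i in range(m) if i not in picked_set]  # out-of-bag
    return train, test


# Hypothetical toy data: 100 distinct items.
data = list(range(100))
train, test = bootstrap_split(data)
print("training set size:", len(train))
print("distinct training items:", len(set(train)))
print("out-of-bag test items:", len(test))
```

Because each draw misses a given sample with probability 1 - 1/m, the chance that a sample is never drawn in m draws is (1 - 1/m)^m, which tends to 1/e (about 0.368) as m grows, so roughly a third of D typically ends up in the test set.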

There are several concepts to be aware of:

Validation set: The data set used to evaluate candidate models during model evaluation and selection

Test data: The data encountered when the learned model is put to actual use

Training data: Split into a training set and a validation set

Third, performance measures

The evaluation criteria used to measure a model's generalization ability
