Machine learning theory knowledge--the bias-variance tradeoff

Summary:

1. Frequently Asked Questions

1.1 What are bias and variance?

1.2 What is overfitting, and how can it be prevented or overcome?

2. Model Selection

3. Feature Selection

4. Feature Engineering and Data Preprocessing

Content:

1. Frequently Asked Questions

1.1 What are bias and variance?

Generalization error can be decomposed into bias squared, plus variance, plus noise. Bias measures how far the expected prediction of the learning algorithm deviates from the true result, and characterizes the algorithm's fitting ability. Variance measures how much the learner's performance changes when the training set of a given size changes, and characterizes the effect of data perturbations. Noise expresses the lower bound on the expected generalization error that any learning algorithm can achieve on the current task, and characterizes the difficulty of the problem itself. In general, the stronger the degree of training, the smaller the bias and the larger the variance, so the generalization error usually has a minimum somewhere in between: large bias with small variance is generally called under-fitting, while small bias with large variance is called overfitting.
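
Written out, the decomposition described above takes the following standard form (a minimal sketch; the notation is assumed here: f is the true function, \hat{f}_D the learner trained on dataset D, and y = f(x) + \varepsilon with noise variance \sigma^2):

    \mathbb{E}_{D,\varepsilon}\big[(y - \hat{f}_D(x))^2\big]
      = \underbrace{\big(f(x) - \bar{f}(x)\big)^2}_{\text{bias}^2}
      + \underbrace{\mathbb{E}_D\big[(\hat{f}_D(x) - \bar{f}(x))^2\big]}_{\text{variance}}
      + \underbrace{\sigma^2}_{\text{noise}},
    \quad\text{where } \bar{f}(x) = \mathbb{E}_D[\hat{f}_D(x)].

The minimum of the generalization error mentioned above is where the sum of the first two terms is smallest; the noise term is irreducible.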

1.2 What is overfitting, and how can it be prevented or overcome?

In machine learning, the error of a learner on the training set is called the training error (or empirical error), and the error on new samples is called the generalization error. Obviously we want a learner with a small generalization error, but since we cannot know the new samples in advance, in practice we minimize the empirical error instead. However, when a learner fits the training samples too well, it is likely to mistake characteristics of the training samples themselves for general properties of the underlying distribution, which degrades generalization performance; this is called overfitting. Conversely, under-fitting means the learner has not yet captured even the general properties of the training samples, so a large error remains on the training set itself. The sketch after this paragraph illustrates both regimes.
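
A minimal sketch, assuming scikit-learn and NumPy are available (the dataset and the degrees 1/4/15 are illustrative choices, not from the original text): fitting polynomials of increasing degree to noisy data shows the training error shrinking steadily, while the error on held-out samples first falls and then rises again.

    # Train/test error as model capacity grows: under-fit -> good fit -> over-fit.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    rng = np.random.RandomState(0)
    X = rng.uniform(-3, 3, size=(60, 1))
    y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)   # noisy target
    X_train, y_train = X[:40], y[:40]
    X_test, y_test = X[40:], y[40:]     # held out, stands in for "new samples"

    for degree in (1, 4, 15):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X_train, y_train)
        train_err = mean_squared_error(y_train, model.predict(X_train))
        test_err = mean_squared_error(y_test, model.predict(X_test))
        print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
    # Typically degree 1 under-fits (both errors high), degree 15 over-fits
    # (training MSE near zero, test MSE large), and degree 4 sits in between.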

Under-fitting: under-fitting is generally easier to fix, e.g. by increasing the complexity of the model (adding branches in a decision tree, increasing the number of training epochs of a neural network, etc.), adding features (through combination, generalization, or correlation), or reducing the regularization coefficient. A sketch of the feature-addition remedy follows below.
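
A minimal sketch of the feature-addition remedy, assuming scikit-learn (the quadratic data and the added x^2 column are illustrative assumptions): a plain linear model under-fits data generated from y = x^2, while the same learner fits well once the squared feature is added.

    # Fixing under-fitting by adding a combined feature (x^2).
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.RandomState(1)
    x = rng.uniform(-2, 2, size=100)
    y = x**2 + rng.normal(scale=0.1, size=100)

    linear = LinearRegression().fit(x.reshape(-1, 1), y)
    print("x only,    R^2:", linear.score(x.reshape(-1, 1), y))   # poor fit

    X_aug = np.column_stack([x, x**2])              # add the squared feature
    augmented = LinearRegression().fit(X_aug, y)
    print("x and x^2, R^2:", augmented.score(X_aug, y))           # close to 1.0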

Overfitting: common remedies for overfitting include re-cleaning the data (when the overfitting is caused by impure data), increasing the sample size, reducing the model's complexity, using prior knowledge (L1/L2 regularization), using cross-validation, early stopping, and so on. A sketch combining two of these remedies follows below.
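
A minimal sketch of two of the listed remedies, assuming scikit-learn: L2 regularization (ridge regression), with the penalty strength chosen by cross-validation rather than by the training error. The degree-15 polynomial and the alpha grid are illustrative assumptions.

    # Taming an over-parameterized model with an L2 penalty tuned by CV.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler
    from sklearn.linear_model import RidgeCV

    rng = np.random.RandomState(2)
    X = rng.uniform(-3, 3, size=(60, 1))
    y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)

    # A degree-15 polynomial would over-fit on its own; the L2 penalty shrinks
    # its coefficients, and RidgeCV picks the penalty alpha by cross-validation.
    model = make_pipeline(
        PolynomialFeatures(degree=15),
        StandardScaler(),
        RidgeCV(alphas=np.logspace(-4, 2, 20)),
    )
    model.fit(X, y)
    print("alpha chosen by cross-validation:", model.named_steps["ridgecv"].alpha_)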
