Stanford Machine Learning Video Notes, Week 6: Advice for Applying Machine Learning


This week we learn how to systematically improve a machine learning algorithm: how to tell when an algorithm is performing poorly, and the "best practices" for debugging a learning algorithm and improving its performance. To optimize a machine learning algorithm, you first need to understand where the biggest improvements can be made. We will also discuss how to evaluate the performance of a machine learning system with multiple parts, and how to handle skewed data.

Evaluating a Hypothesis

Imagine you have trained a model, but it makes large errors when predicting on new data. What should you do next?

Running diagnostics takes time to implement, but it pays off: instead of blindly picking ways to improve the algorithm, you can see directly which changes are actually effective.

We divide the data into two sets: a training set (70%) and a test set (30%).

The evaluation procedure is as follows:

1. Learn the parameters θ from the training set (by minimizing the training error J(θ));

2. Compute the test error J_test(θ).

Here is how the error is calculated for linear regression and classification problems:
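For reference, the standard definitions from the course are the average squared error on the test set for linear regression, and the 0/1 misclassification error for classification:

    J_{test}(\theta) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2

    \mathrm{err}(h_\theta(x), y) =
      \begin{cases}
        1 & \text{if } h_\theta(x) \ge 0.5,\; y = 0 \;\text{ or }\; h_\theta(x) < 0.5,\; y = 1 \\
        0 & \text{otherwise}
      \end{cases}

    \text{Test error} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} \mathrm{err}(h_\theta(x_{test}^{(i)}),\, y_{test}^{(i)})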

Model Selection and Train/Validation/Test Sets

Choosing among multiple hypotheses by training error alone only tells you which one fits the training set best; that hypothesis may well be overfit and then predict poorly on new data.

So instead we divide the dataset into three parts: a training set (60%), a cross-validation (CV) set (20%), and a test set (20%).

Take the earlier house-price prediction example, where we select among polynomial models of increasing degree d.

The three error types are computed as follows:
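Written out, all three use the same unregularized squared-error form, differing only in which data split they are averaged over:

    J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

    J_{cv}(\theta) = \frac{1}{2m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2

    J_{test}(\theta) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2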

Model selection method:

1. For each candidate model, minimize J_train(θ) on the training set to obtain its parameters θ;

2. With each trained θ, compute the error J_cv(θ) on the cross-validation set, and take the model that minimizes J_cv(θ) as the final model;

3. Use the test set to compute the generalization error J_test(θ) of the chosen model, which tells you how good the algorithm really is.
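As an illustrative sketch of these three steps (the toy data, the 60/20/20 split, and the helper names below are my own, not code from the lecture):

    import numpy as np

    def poly_features(x, degree):
        # Map a 1-D input x to [1, x, x^2, ..., x^degree] (intercept included).
        return np.column_stack([x ** d for d in range(degree + 1)])

    def fit_linear(X, y):
        # Ordinary least squares; lstsq is a numerically stable normal equation.
        theta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return theta

    def squared_error(X, y, theta):
        # J(theta) = 1/(2m) * sum((X @ theta - y)^2), without regularization.
        r = X @ theta - y
        return (r @ r) / (2 * len(y))

    # Toy data: a noisy quadratic, split 60/20/20 into train / CV / test.
    rng = np.random.default_rng(0)
    x = rng.uniform(-3, 3, size=100)
    y = x ** 2 + rng.normal(scale=0.5, size=100)
    idx = rng.permutation(100)
    tr, cv, te = idx[:60], idx[60:80], idx[80:]

    best = None
    for degree in range(1, 11):                                       # candidate models d = 1..10
        theta = fit_linear(poly_features(x[tr], degree), y[tr])       # step 1: fit on train
        j_cv = squared_error(poly_features(x[cv], degree), y[cv], theta)  # step 2: score on CV
        if best is None or j_cv < best[1]:
            best = (degree, j_cv, theta)

    degree, j_cv, theta = best
    j_test = squared_error(poly_features(x[te], degree), y[te], theta)    # step 3: report on test
    print(f"chosen d={degree}, J_cv={j_cv:.3f}, J_test={j_test:.3f}")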

Diagnosing Bias vs. Variance

We have talked about underfitting and overfitting before. So when your model's results are not ideal, how can you tell which situation you are in?

High bias (underfit): both the training error J_train(θ) and the cross-validation error J_cv(θ) are high, with J_cv(θ) ≈ J_train(θ).

High variance (overfit): J_train(θ) is small, while J_cv(θ) is large, J_cv(θ) >> J_train(θ).
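As a rough sketch of this decision rule in code (the 2x threshold and the acceptable_error parameter are my own illustrative choices, not from the lecture):

    def diagnose(j_train, j_cv, acceptable_error):
        # Heuristic read-out of the two errors for a fitted model.
        if j_train > acceptable_error:
            # Training error itself is high, and J_cv sits close above it:
            # the model is too simple for the data.
            return "high bias (underfit)"
        if j_cv > 2 * j_train:
            # Training error is fine but CV error is far larger:
            # the model is fitting noise in the training set.
            return "high variance (overfit)"
        return "about right"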

Regularization and Bias/Variance

The following explains how to choose the regularization parameter λ so as to avoid both underfitting and overfitting.

Note that λ appears only in the objective function J(θ); J_train(θ), J_cv(θ), and J_test(θ) are computed without the regularization term, using the same formulas as above.

Selection process:

1. List the candidate λ values; Ng recommends roughly doubling the value each time (e.g. 0, 0.01, 0.02, 0.04, ..., 10.24).

2. Create a hypothesis model h_θ(x).

3. For each λ, learn the parameters θ by minimizing the regularized objective J(θ).

4. On the CV set, use each trained θ to compute the error J_cv(θ), and select the θ (and its λ) that minimizes J_cv(θ);

5. Estimate the generalization error on the test set with the selected θ.
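Continuing the toy setup from the model-selection sketch above (fit_ridge and the data split are my own illustrative helpers, not lecture code), the λ search might look like:

    def fit_ridge(X, y, lam):
        # Regularized normal equation: theta = (X^T X + lam * L)^(-1) X^T y,
        # where L is the identity with a 0 for the intercept term.
        L = np.eye(X.shape[1])
        L[0, 0] = 0.0                  # do not regularize the intercept
        return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

    lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
    degree = 8                         # a deliberately flexible model
    X_tr = poly_features(x[tr], degree)
    X_cv = poly_features(x[cv], degree)

    best = None
    for lam in lambdas:
        theta = fit_ridge(X_tr, y[tr], lam)           # lambda is used here...
        j_cv = squared_error(X_cv, y[cv], theta)      # ...but not in J_cv
        if best is None or j_cv < best[1]:
            best = (lam, j_cv, theta)

    lam, j_cv, theta = best
    j_test = squared_error(poly_features(x[te], degree), y[te], theta)
    print(f"chosen lambda={lam}, J_cv={j_cv:.3f}, J_test={j_test:.3f}")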

The following figure plots J_cv(θ) and J_train(θ) as functions of λ. Note again: λ is used only in the objective function J(θ), not in J_train(θ) or J_cv(θ).

It can be seen that as λ increases, J_train(θ) increases monotonically, while J_cv(θ) first decreases to a minimum and then increases.

Since we want the algorithm to have a small J_cv(θ), the "just right" choice is the λ at the minimum point of the J_cv curve.

Learning Curves

Plotting the errors as functions of the training-set size gives the learning curves.

The following is the case where the algorithm has high bias (underfit).

Signs that the model has high bias:

With few training examples: J_train(θ) is low, J_cv(θ) is high;

With many training examples: both J_train(θ) and J_cv(θ) are high, and J_train(θ) ≈ J_cv(θ).

If the algorithm has high bias, adding more training examples will not help much.

The following is the case where the algorithm has high variance (overfit).

Signs that the model has high variance:

With few training examples: J_train(θ) is low, J_cv(θ) is high;

With many training examples: J_train(θ) keeps rising and J_cv(θ) keeps falling, but J_train(θ) < J_cv(θ) and the gap between them remains noticeable.

If the algorithm has high variance, adding more training examples is likely to help.
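Continuing the same toy setup, a minimal sketch that prints the learning-curve values (train on the first m examples for growing m, scoring J_train on those m examples and J_cv on the full CV set):

    degree = 2
    for m in range(5, len(tr) + 1, 5):
        sub = tr[:m]                                   # first m training examples
        X_m = poly_features(x[sub], degree)
        theta = fit_linear(X_m, y[sub])
        j_train = squared_error(X_m, y[sub], theta)    # error on the m seen examples
        j_cv = squared_error(poly_features(x[cv], degree), y[cv], theta)
        print(f"m={m:2d}  J_train={j_train:.3f}  J_cv={j_cv:.3f}")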

Deciding What to Do Next Revisited

Let's revisit the question raised at the beginning of this lesson: how should you improve your algorithm? From the discussion above we can draw the following conclusions:

• Get more training examples: fixes high variance;

• Try a smaller set of features: fixes high variance;

• Try getting additional features: fixes high bias;

• Try adding polynomial features: fixes high bias;

• Try decreasing λ: fixes high bias;

• Try increasing λ: fixes high variance.

On underfitting and overfitting in neural networks: a small network (few hidden units) is computationally cheap but prone to underfitting; a large network is more powerful but prone to overfitting, which is usually addressed with regularization, choosing λ and the number of hidden layers using the cross-validation set.

The diagnostics discussed above are very important for building a good machine learning algorithm; they can save a lot of time and spare you many detours.

