(vii) Some of the techniques used in machine learning

Source: Internet
Author: User

This article looks at how to apply machine learning algorithms more effectively in practice, using empirical risk minimization as the running example.
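The formula that originally followed is missing from this copy. A minimal sketch of a common regularized form is given here, as an assumption rather than the author's exact expression. The symbols are also assumptions: h_θ is the hypothesis, m the number of training examples, n the number of features, and λ the regularization parameter.

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]
```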

Suppose the optimal parameters have been found, but the error is still very large. How can the loss be made as small as possible? There are several options, and the rest of this article explains how to choose between them.

When the variance of the model is large, it may be over-fitting, so you can try adding training samples, reducing the number of features, or increasing the regularization parameter.

When the bias of the model is large, it may be under-fitting, so you can try adding more features, adding polynomial features, or reducing the regularization parameter.

First of all, for a typical machine learning problem, we divide the data into a training set, a cross-validation set, and a test set, in the proportion 6:2:2.
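The 6:2:2 split can be sketched as follows. The function name `split_622` and the fixed seed are hypothetical choices for illustration, and shuffling before splitting is an assumption the article does not state.

```python
import random

def split_622(samples, seed=0):
    """Shuffle the samples and split them into train / cross-validation / test at 6:2:2."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n = len(shuffled)
    n_train = n * 6 // 10                  # integer arithmetic avoids float rounding
    n_cv = n * 2 // 10
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]       # whatever remains goes to the test set
    return train, cv, test

train, cv, test = split_622(range(100))
print(len(train), len(cv), len(test))
```

When the sample count is not a multiple of ten, the leftover examples end up in the test set in this sketch.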

The loss of the hypothesis function can then be computed separately on each of the three sets, using one formula per set.
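The three formulas did not survive in this copy. Assuming the usual squared-error loss without the regularization term, they would take roughly the following form. The set sizes m_train, m_cv, and m_test are notation introduced here.

```latex
\begin{aligned}
J_{train}(\theta) &= \frac{1}{2m_{train}}\sum_{i=1}^{m_{train}}\left(h_\theta(x_{train}^{(i)}) - y_{train}^{(i)}\right)^2 \\
J_{CV}(\theta) &= \frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}\left(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\right)^2 \\
J_{test}(\theta) &= \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}}\left(h_\theta(x_{test}^{(i)}) - y_{test}^{(i)}\right)^2
\end{aligned}
```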

Next, use the cross-validation set to pick the best candidate model (the one with the lowest JCV), and then evaluate that model on the test set to get the test error Jtest.
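A minimal sketch of this selection step, assuming the candidates are polynomial models of different degree fitted by least squares. The synthetic data, the degree range, and the helper names `half_mse` and `poly_features` are all illustrative assumptions, not part of the original article.

```python
import numpy as np

def half_mse(theta, X, y):
    """Squared-error loss: sum of squared residuals divided by 2m."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

# Synthetic data (assumption): a noisy sine curve, split 6:2:2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = np.sin(3 * x) + rng.normal(0, 0.1, 100)
x_tr, x_cv, x_te = x[:60], x[60:80], x[80:]
y_tr, y_cv, y_te = y[:60], y[60:80], y[80:]

cv_errors, thetas = {}, {}
for degree in range(1, 9):
    # Fit on the training set only.
    theta, *_ = np.linalg.lstsq(poly_features(x_tr, degree), y_tr, rcond=None)
    thetas[degree] = theta
    cv_errors[degree] = half_mse(theta, poly_features(x_cv, degree), y_cv)

# Select the model with the lowest cross-validation error ...
best_degree = min(cv_errors, key=cv_errors.get)
best_j_cv = cv_errors[best_degree]
# ... and only then touch the test set, once, to report Jtest.
j_test = half_mse(thetas[best_degree], poly_features(x_te, best_degree), y_te)
print("best degree:", best_degree, "JCV:", best_j_cv, "Jtest:", j_test)
```

The test set plays no part in the choice of degree, so Jtest remains an estimate of error on unseen data.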

Bias and variance

If the algorithm is not performing well, that is, JCV or Jtest is very high, you can plot the bias and variance curves to find out where the problem lies. If Jtrain and JCV are both high, the problem is bias: the model is still under-fitting. If Jtrain is much lower than JCV, the problem is variance.
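The rule above, together with the remedies listed earlier, can be sketched as a small helper. The function name `diagnose` and the single `acceptable_error` threshold are hypothetical; in practice what counts as "too high" depends on the problem.

```python
REMEDIES = {
    "high bias (under-fitting)": [
        "add more features",
        "add polynomial features",
        "reduce the regularization parameter",
    ],
    "high variance (over-fitting)": [
        "add training samples",
        "reduce the number of features",
        "increase the regularization parameter",
    ],
    "no obvious problem": [],
}

def diagnose(j_train, j_cv, acceptable_error):
    """Label the situation from Jtrain and JCV, using one illustrative threshold."""
    if j_train > acceptable_error:
        # The model cannot even fit the training set: bias.
        label = "high bias (under-fitting)"
    elif j_cv > acceptable_error:
        # Training error is fine but cross-validation error is not: variance.
        label = "high variance (over-fitting)"
    else:
        label = "no obvious problem"
    return label, REMEDIES[label]

print(diagnose(0.90, 1.00, acceptable_error=0.10))
print(diagnose(0.01, 0.80, acceptable_error=0.10))
```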

Choosing the regularization parameter: as the regularization parameter increases, the model parameters become smaller and smaller, the model moves towards under-fitting, the bias grows, and Jtrain increases.

On the cross-validation set, when the regularization parameter is very small, the model may be over-fitting: the variance is large and JCV is high. As the parameter grows, JCV falls to a minimum, the point where bias and variance are best balanced. As it continues to grow, JCV starts to rise again, because the bias keeps increasing. The best value of the regularization parameter is therefore the one at which JCV reaches its minimum.
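This sweep can be sketched as follows, assuming regularized least squares with a closed-form solution. The synthetic data, the degree-8 polynomial features, the candidate values in `lambdas`, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def half_mse(theta, X, y):
    """Unregularized squared-error loss: sum of squared residuals / (2m)."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def fit_ridge(X, y, lam):
    """Closed-form regularized least squares; the intercept (column 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Synthetic data (assumption): noisy sine curve with degree-8 polynomial features.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 80)
y = np.sin(3 * x) + rng.normal(0, 0.2, 80)
X = np.vander(x, 9, increasing=True)
X_tr, y_tr, X_cv, y_cv = X[:60], y[:60], X[60:], y[60:]

lambdas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
results = []
for lam in lambdas:
    theta = fit_ridge(X_tr, y_tr, lam)
    # Both errors are measured WITHOUT the penalty term.
    results.append((lam, half_mse(theta, X_tr, y_tr), half_mse(theta, X_cv, y_cv)))

# Keep the value of lambda at which JCV is smallest.
best_lam, _, best_j_cv = min(results, key=lambda r: r[2])
for lam, j_tr, j_cv in results:
    print(f"lambda={lam:<7} Jtrain={j_tr:.4f} JCV={j_cv:.4f}")
print("chosen lambda:", best_lam)
```

Jtrain and JCV are computed without the penalty term, so they measure fit quality only. The penalty affects the fitted parameters, not the reported errors.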

Increase training data

First of all, as the training data grows, the cross-validation set and the test set grow with it, keeping the 6:2:2 ratio.

1) With a reasonably suitable model and little data, the model fits the training data almost perfectly, so Jtrain is very low. JCV is relatively large at this point, because a model trained on so few examples generalizes poorly to the cross-validation set. As data is added, Jtrain rises and JCV falls, so adding data works well in this case.

2) With an under-fitting model, Jtrain is very large. Adding more data has no effect here: JCV does not get any smaller, and the model performs poorly on both the training set and the test set.

3) With an over-fitting model and little data, the gap between Jtrain and JCV is large. Adding data brings some improvement and narrows the gap between Jtrain and JCV. This is what is meant by adding training data to avoid over-fitting.
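The three cases above describe learning curves: Jtrain and JCV plotted against the number of training examples. A minimal sketch of computing such a curve follows, assuming a least-squares polynomial model. The synthetic data, the training sizes, and the function name `learning_curve` are illustrative assumptions.

```python
import numpy as np

def half_mse(theta, X, y):
    """Squared-error loss: sum of squared residuals divided by 2m."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def learning_curve(x_tr, y_tr, x_cv, y_cv, sizes, degree):
    """Return (m, Jtrain, JCV) for models fitted on the first m training examples."""
    X_cv = np.vander(x_cv, degree + 1, increasing=True)
    curve = []
    for m in sizes:
        X_m = np.vander(x_tr[:m], degree + 1, increasing=True)
        theta, *_ = np.linalg.lstsq(X_m, y_tr[:m], rcond=None)
        # Jtrain uses only the m examples seen so far; JCV always uses the full CV set.
        curve.append((m, half_mse(theta, X_m, y_tr[:m]), half_mse(theta, X_cv, y_cv)))
    return curve

# Synthetic data (assumption): noisy sine curve, 60 training and 20 CV examples.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 80)
y = np.sin(3 * x) + rng.normal(0, 0.2, 80)

curve = learning_curve(x[:60], y[:60], x[60:], y[60:], sizes=[4, 8, 16, 32, 60], degree=3)
for m, j_tr, j_cv in curve:
    print(f"m={m:<3} Jtrain={j_tr:.4f} JCV={j_cv:.4f}")
```

With four examples, a degree-3 polynomial passes through every training point, so Jtrain is essentially zero. That is the "perfect fit on little data" situation described in case 1.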

At this time
