Machine Learning: Advice for Applying Machine Learning


It has been a while since these machine learning notes were last updated; the previous installment covered neural networks. This time we'll talk about advice for applying machine learning.


Decide what to do next

Suppose we have trained a linear regression model to predict house prices. When we use the trained model to predict unseen data, we find that it produces large errors. What can we do next?

    1. Get more training examples. This is often effective but can be costly; the methods below may work just as well and should be considered first.
    2. Try a smaller set of features.
    3. Try getting additional features.
    4. Try adding polynomial features (e.g. products and powers of the existing features).
    5. Try decreasing the regularization parameter λ.
    6. Try increasing the regularization parameter λ.
Rather than picking one of these methods at random, we can use machine learning diagnostics to find out which of them is likely to improve our algorithm.


Evaluating a hypothesis

Testing for overfitting

To check whether the algorithm overfits, we split the data into a training set and a test set, typically using 70% of the data for training and the remaining 30% for testing. It is important that both sets contain all types of data, so we usually shuffle the data before splitting it.
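As a minimal sketch of this step (assuming scikit-learn is available; the random data and variable names are purely illustrative):

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Illustrative data: 100 examples with 3 features each.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = rng.normal(size=100)

    # shuffle=True randomizes the order before splitting, so that both
    # the 70% training set and the 30% test set contain all kinds of data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, shuffle=True, random_state=0)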

Test set evaluation

After the model has learned its parameters on the training set, we apply it to the test set. There are two ways of computing the error:

    • For a linear regression model, we compute the cost function on the test data: $J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x^{(i)}_{test}) - y^{(i)}_{test} \right)^2$.
    • For a logistic regression model, besides computing the cost function on the test data, we can also compute the misclassification rate: for each test example, $err(h_\theta(x), y) = 1$ if the prediction is wrong ($h_\theta(x) \ge 0.5$ while $y = 0$, or $h_\theta(x) < 0.5$ while $y = 1$) and $0$ otherwise; the test error is the average of $err$ over all test examples.
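Both error measures are easy to compute directly; here is a minimal NumPy sketch (the function names are illustrative assumptions, and the usual 0.5 threshold is used for classification):

    import numpy as np

    def linreg_test_error(predictions, y_test):
        # Squared-error cost on the test set:
        # J_test = 1/(2*m_test) * sum((h(x) - y)^2)
        m = len(y_test)
        return np.sum((predictions - y_test) ** 2) / (2 * m)

    def logreg_misclassification_error(probabilities, y_test):
        # err = 1 whenever the thresholded prediction disagrees with the
        # label; averaging err over the test set gives the error rate.
        predicted = (probabilities >= 0.5).astype(int)
        return np.mean(predicted != y_test)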

Model selection (cross-validation set)

Suppose we are choosing among 10 polynomial models of different degree $d$, i.e. hypotheses of the form $h_\theta(x) = \theta_0 + \theta_1 x + \dots + \theta_d x^d$ for $d = 1, 2, \dots, 10$.


Obviously, a higher-degree polynomial can fit our training set better, but fitting the training data well does not mean the model generalizes well; we should choose the model that generalizes best. We need a cross-validation set to help select the model, namely:

    • Use 60% of the data as a training set
    • Use 20% of the data as a cross-validation set
    • Use 20% of the data as a test set
The model selection procedure (sketched in code below) is:

    1. Train the 10 models on the training set.
    2. For each of the 10 models, compute the cross-validation error (the value of the cost function) on the cross-validation set.
    3. Select the model with the lowest cross-validation error.
    4. Estimate the generalization error (the value of the cost function) of the model selected in step 3 on the test set.
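A sketch of this four-step procedure for one-dimensional data, using np.polyfit to stand in for the 10 models (the data and split indices are made up for illustration):

    import numpy as np

    # Illustrative data: a nonlinear function plus noise.
    rng = np.random.default_rng(1)
    x = rng.uniform(-3, 3, size=200)
    y = np.sin(x) + 0.1 * rng.normal(size=200)

    # 60/20/20 split into training, cross-validation, and test sets.
    idx = rng.permutation(200)
    tr, cv, te = idx[:120], idx[120:160], idx[160:]

    def cost(w, xs, ys):
        # Half the mean squared error of the polynomial with coefficients w.
        return np.mean((np.polyval(w, xs) - ys) ** 2) / 2

    # 1. Train one model per degree on the training set.
    models = {d: np.polyfit(x[tr], y[tr], deg=d) for d in range(1, 11)}
    # 2.-3. Pick the degree with the lowest cross-validation error.
    best_d = min(models, key=lambda d: cost(models[d], x[cv], y[cv]))
    # 4. Report the generalization error on the test set.
    print(best_d, cost(models[best_d], x[te], y[te]))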

Diagnosing bias vs. variance

High bias is essentially the problem of underfitting, and high variance is essentially the problem of overfitting.


We usually aid the analysis by plotting the training error and the cross-validation error against the polynomial degree d on the same chart:


    • For the training set: when d is small the model underfits and the error is large; as d increases, the fit improves and the error decreases.
    • For the cross-validation set: when d is small the model underfits and the error is large; as d increases, the error first decreases and then increases, and the turning point is where our model begins to overfit the training data.
If the cross-validation error is large, how do we determine whether the problem is high bias or high variance?

According to the chart above, we know:

    • Training error and cross-validation error are both large and close to each other: high bias / underfitting.
    • Cross-validation error is much larger than training error: high variance / overfitting.
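This rule of thumb can be captured in a small helper; the numeric threshold below is an assumption, since in practice the judgment is made by looking at the plot:

    def diagnose(train_error, cv_error, ratio=2.0):
        # Both errors large and close together -> high bias (underfitting);
        # cross-validation error much larger -> high variance (overfitting).
        if cv_error > ratio * train_error:
            return "high variance / overfitting"
        return "high bias / underfitting"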

Regularization and bias/variance

While training a model, we usually apply some form of regularization to prevent overfitting, but the regularization strength may be too high or too low. That is, when choosing the value of λ we need the same care as when we chose the polynomial degree above.


We choose a series of λ values to try, typically doubling from one value to the next across the range 0 to 10, for example: 0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10 (12 values in total).

We again divide the data into a training set, a cross-validation set, and a test set.

The method for selecting λ (sketched below) is:

    1. Train 12 models on the training set, one for each value of λ.
    2. For each of the 12 models, compute the cross-validation error on the cross-validation set.
    3. Choose the model with the lowest cross-validation error.
    4. Estimate the generalization error of the model selected in step 3 on the test set.
Plotting the training and cross-validation errors against λ shows:
    • When λ is small, the training error is small (overfitting) and the cross-validation error is large.
    • As λ increases, the training error keeps growing (underfitting), while the cross-validation error first decreases and then increases.
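A sketch using scikit-learn's Ridge (L2-regularized linear regression), whose alpha parameter plays the role of λ here; the X_train/y_train/X_cv/y_cv arrays are assumed to come from the split above:

    import numpy as np
    from sklearn.linear_model import Ridge

    lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64,
               1.28, 2.56, 5.12, 10]

    def cv_error(lam, X_train, y_train, X_cv, y_cv):
        # Train an L2-regularized model for this lambda ...
        model = Ridge(alpha=lam).fit(X_train, y_train)
        # ... and score it with the *unregularized* squared error on
        # the cross-validation set.
        residuals = model.predict(X_cv) - y_cv
        return np.mean(residuals ** 2) / 2

    # Choose the lambda with the lowest cross-validation error:
    # best_lam = min(lambdas,
    #                key=lambda l: cv_error(l, X_train, y_train, X_cv, y_cv))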

Learning curves

A learning curve is a good sanity check for a learning algorithm. It plots the training error and the cross-validation error as functions of the number of training examples m. That is, if we have 100 examples, we start by training on just one and gradually use more of them. The idea is: with very few training examples, the model can fit them almost perfectly, but it will not fit the cross-validation or test data well.

How to use the learning curve to identify high bias/underfitting: as an example, suppose we try to fit the data with a straight line. The learning curve shows that no matter how many training examples we add, the error does not improve much. In other words, when the model suffers from high bias/underfitting, adding more training data does not necessarily help.

How to use the learning curve to identify high variance/overfitting: suppose we use a very high-degree polynomial model with very little regularization. The learning curve shows the cross-validation error staying much larger than the training error, and adding more data to the training set improves the model. In other words, when the model suffers from high variance/overfitting, adding more training data tends to improve the algorithm.
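A sketch of computing a learning curve by hand, fitting a straight line to curved data so that the high-bias pattern appears (the data is illustrative; scikit-learn's learning_curve utility automates the same idea):

    import numpy as np

    # Quadratic data that a straight line cannot fit well (high bias).
    rng = np.random.default_rng(2)
    x = rng.uniform(-3, 3, size=100)
    y = x ** 2 + rng.normal(size=100)
    x_tr, y_tr, x_cv, y_cv = x[:70], y[:70], x[70:], y[70:]

    def cost(w, xs, ys):
        return np.mean((np.polyval(w, xs) - ys) ** 2) / 2

    # Train on the first m examples only and watch both errors plateau
    # at a high value -- the signature of high bias.
    for m in range(2, 71, 10):
        w = np.polyfit(x_tr[:m], y_tr[:m], deg=1)
        print(m, cost(w, x_tr[:m], y_tr[:m]), cost(w, x_cv, y_cv))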
Decide what to do next (revisited)

Looking back at the six options from the beginning, we can now say when to choose each:

    1. Get more training examples: fixes high variance.
    2. Try a smaller set of features: fixes high variance.
    3. Try getting additional features: fixes high bias.
    4. Try adding polynomial features: fixes high bias.
    5. Try decreasing the regularization parameter λ: fixes high bias.
    6. Try increasing the regularization parameter λ: fixes high variance.
Bias and variance of neural networks

    • A smaller neural network (analogous to a model with few parameters) is prone to high bias and underfitting, but is computationally cheaper.
    • A larger neural network (analogous to a model with many parameters) is prone to high variance and overfitting and is computationally more expensive, but this can be addressed with regularization and more training data.
Usually, a larger neural network with regularization performs better than a smaller one. To choose the number of hidden layers, start with one layer and increase gradually: split the data into training, cross-validation, and test sets, train a network for each candidate number of hidden layers, and choose the one with the lowest cross-validation cost, as sketched below.
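A sketch of that selection loop with scikit-learn's MLPClassifier; the hidden-layer sizes, the alpha regularization value, and the synthetic data are all assumptions, and cross-validation accuracy is used here as a proxy for the cross-validation cost:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Synthetic data standing in for a real problem.
    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_cv, y_train, y_cv = train_test_split(
        X, y, test_size=0.2, random_state=0)

    best = None
    for hidden in [(25,), (25, 25), (25, 25, 25)]:  # 1, 2, 3 hidden layers
        net = MLPClassifier(hidden_layer_sizes=hidden, alpha=0.01,
                            max_iter=2000, random_state=0)
        net.fit(X_train, y_train)
        cv_score = net.score(X_cv, y_cv)  # accuracy on the CV set
        if best is None or cv_score > best[0]:
            best = (cv_score, hidden)
    print(best)  # the network with the best cross-validation performance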
