Machine Learning: Advice for Applying Machine Learning


It has been a while since these machine learning notes were last updated; the previous installment covered neural networks. This time we'll talk about advice for applying machine learning.


Decide what to do next

Suppose we have trained a linear regression model to predict house prices. When we use the trained model to predict unseen data, we find that it produces large errors. What can we do next?

    1. Get more training examples. This is often effective but can be costly; the methods below may work just as well and should be considered first.
    2. Try a smaller set of features.
    3. Try getting additional features.
    4. Try adding polynomial features (e.g. products and powers of the existing features).
    5. Try decreasing the regularization parameter λ.
    6. Try increasing the regularization parameter λ.
Rather than picking one of these methods at random, we can use machine learning diagnostics to find out which of them is likely to improve our algorithm.


Evaluating a hypothesis

Testing for overfitting

To check whether the algorithm overfits, we split the data into a training set and a test set, typically using 70% of the data for training and the remaining 30% for testing. It is important that both sets contain all types of data, so we usually shuffle the data before splitting it.
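As a minimal sketch of this step (assuming scikit-learn is available; the random data and variable names are purely illustrative):

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Illustrative data: 100 examples with 3 features each.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = rng.normal(size=100)

    # shuffle=True randomizes the order before splitting, so that both
    # the 70% training set and the 30% test set contain all kinds of data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, shuffle=True, random_state=0)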

Test set evaluation

After the model has learned its parameters on the training set, we apply it to the test set. There are two ways of computing the error:

    • For a linear regression model, we compute the cost function on the test data: $J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x^{(i)}_{test}) - y^{(i)}_{test} \right)^2$.
    • For a logistic regression model, besides computing the cost function on the test data, we can also compute the misclassification rate: for each test example, $err(h_\theta(x), y) = 1$ if the prediction is wrong ($h_\theta(x) \ge 0.5$ while $y = 0$, or $h_\theta(x) < 0.5$ while $y = 1$) and $0$ otherwise; the test error is the average of $err$ over all test examples.
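Both error measures are easy to compute directly; here is a minimal NumPy sketch (the function names are illustrative assumptions, and the usual 0.5 threshold is used for classification):

    import numpy as np

    def linreg_test_error(predictions, y_test):
        # Squared-error cost on the test set:
        # J_test = 1/(2*m_test) * sum((h(x) - y)^2)
        m = len(y_test)
        return np.sum((predictions - y_test) ** 2) / (2 * m)

    def logreg_misclassification_error(probabilities, y_test):
        # err = 1 whenever the thresholded prediction disagrees with the
        # label; averaging err over the test set gives the error rate.
        predicted = (probabilities >= 0.5).astype(int)
        return np.mean(predicted != y_test)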

Model selection (cross-validation set)

Suppose we are choosing among 10 polynomial models of different degree $d$, i.e. hypotheses of the form $h_\theta(x) = \theta_0 + \theta_1 x + \dots + \theta_d x^d$ for $d = 1, 2, \dots, 10$.


Obviously, a higher-degree polynomial can fit our training set better, but fitting the training data well does not mean the model generalizes well; we should choose the model that generalizes best. We need a cross-validation set to help select the model, namely:

    • Use 60% of the data as a training set
    • Use 20% of the data as a cross-validation set
    • Use 20% of the data as a test set
The model selection procedure (sketched in code below) is:

    1. Train the 10 models on the training set.
    2. For each of the 10 models, compute the cross-validation error (the value of the cost function) on the cross-validation set.
    3. Select the model with the lowest cross-validation error.
    4. Estimate the generalization error (the value of the cost function) of the model selected in step 3 on the test set.
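A sketch of this four-step procedure for one-dimensional data, using np.polyfit to stand in for the 10 models (the data and split indices are made up for illustration):

    import numpy as np

    # Illustrative data: a nonlinear function plus noise.
    rng = np.random.default_rng(1)
    x = rng.uniform(-3, 3, size=200)
    y = np.sin(x) + 0.1 * rng.normal(size=200)

    # 60/20/20 split into training, cross-validation, and test sets.
    idx = rng.permutation(200)
    tr, cv, te = idx[:120], idx[120:160], idx[160:]

    def cost(w, xs, ys):
        # Half the mean squared error of the polynomial with coefficients w.
        return np.mean((np.polyval(w, xs) - ys) ** 2) / 2

    # 1. Train one model per degree on the training set.
    models = {d: np.polyfit(x[tr], y[tr], deg=d) for d in range(1, 11)}
    # 2.-3. Pick the degree with the lowest cross-validation error.
    best_d = min(models, key=lambda d: cost(models[d], x[cv], y[cv]))
    # 4. Report the generalization error on the test set.
    print(best_d, cost(models[best_d], x[te], y[te]))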

Diagnosing bias vs. variance

High bias is essentially the problem of underfitting, and high variance is essentially the problem of overfitting.


We usually aid the analysis by plotting the training error and the cross-validation error against the polynomial degree d on the same chart:


    • For the training set: when d is small the model underfits and the error is large; as d increases, the fit improves and the error decreases.
    • For the cross-validation set: when d is small the model underfits and the error is large; as d increases, the error first decreases and then increases, and the turning point is where our model begins to overfit the training data.
If the cross-validation error is large, how do we determine whether the problem is high bias or high variance?

According to the chart above, we know:

    • Training error and cross-validation error are both large and close to each other: high bias / underfitting.
    • Cross-validation error is much larger than training error: high variance / overfitting.
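This rule of thumb can be captured in a small helper; the numeric threshold below is an assumption, since in practice the judgment is made by looking at the plot:

    def diagnose(train_error, cv_error, ratio=2.0):
        # Both errors large and close together -> high bias (underfitting);
        # cross-validation error much larger -> high variance (overfitting).
        if cv_error > ratio * train_error:
            return "high variance / overfitting"
        return "high bias / underfitting"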

Regularization and bias/variance

While training a model, we usually apply some form of regularization to prevent overfitting, but the regularization strength may be too high or too low. That is, when choosing the value of λ we need the same care as when we chose the polynomial degree above.


We choose a series of λ values to try, typically doubling from one value to the next across the range 0 to 10, for example: 0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10 (12 values in total).

We again divide the data into a training set, a cross-validation set, and a test set.

The method for selecting λ (sketched below) is:

    1. Train 12 models on the training set, one for each value of λ.
    2. For each of the 12 models, compute the cross-validation error on the cross-validation set.
    3. Choose the model with the lowest cross-validation error.
    4. Estimate the generalization error of the model selected in step 3 on the test set.
Plotting the training and cross-validation errors against λ shows:
    • When λ is small, the training error is small (overfitting) and the cross-validation error is large.
    • As λ increases, the training error keeps growing (underfitting), while the cross-validation error first decreases and then increases.
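A sketch using scikit-learn's Ridge (L2-regularized linear regression), whose alpha parameter plays the role of λ here; the X_train/y_train/X_cv/y_cv arrays are assumed to come from the split above:

    import numpy as np
    from sklearn.linear_model import Ridge

    lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64,
               1.28, 2.56, 5.12, 10]

    def cv_error(lam, X_train, y_train, X_cv, y_cv):
        # Train an L2-regularized model for this lambda ...
        model = Ridge(alpha=lam).fit(X_train, y_train)
        # ... and score it with the *unregularized* squared error on
        # the cross-validation set.
        residuals = model.predict(X_cv) - y_cv
        return np.mean(residuals ** 2) / 2

    # Choose the lambda with the lowest cross-validation error:
    # best_lam = min(lambdas,
    #                key=lambda l: cv_error(l, X_train, y_train, X_cv, y_cv))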

Learning curves

A learning curve is a good sanity check for a learning algorithm. It plots the training error and the cross-validation error as functions of the number of training examples m. That is, if we have 100 examples, we start by training on just one and gradually use more of them. The idea is: with very few training examples, the model can fit them almost perfectly, but it will not fit the cross-validation or test data well.

How to use the learning curve to identify high bias/underfitting: as an example, suppose we try to fit the data with a straight line. The learning curve shows that no matter how many training examples we add, the error does not improve much. In other words, when the model suffers from high bias/underfitting, adding more training data does not necessarily help.

How to use the learning curve to identify high variance/overfitting: suppose we use a very high-degree polynomial model with very little regularization. The learning curve shows the cross-validation error staying much larger than the training error, and adding more data to the training set improves the model. In other words, when the model suffers from high variance/overfitting, adding more training data tends to improve the algorithm.
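A sketch of computing a learning curve by hand, fitting a straight line to curved data so that the high-bias pattern appears (the data is illustrative; scikit-learn's learning_curve utility automates the same idea):

    import numpy as np

    # Quadratic data that a straight line cannot fit well (high bias).
    rng = np.random.default_rng(2)
    x = rng.uniform(-3, 3, size=100)
    y = x ** 2 + rng.normal(size=100)
    x_tr, y_tr, x_cv, y_cv = x[:70], y[:70], x[70:], y[70:]

    def cost(w, xs, ys):
        return np.mean((np.polyval(w, xs) - ys) ** 2) / 2

    # Train on the first m examples only and watch both errors plateau
    # at a high value -- the signature of high bias.
    for m in range(2, 71, 10):
        w = np.polyfit(x_tr[:m], y_tr[:m], deg=1)
        print(m, cost(w, x_tr[:m], y_tr[:m]), cost(w, x_cv, y_cv))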
Decide what to do next (revisited)

Looking back at the six options from the beginning, we can now say when to choose each:

    1. Get more training examples: fixes high variance.
    2. Try a smaller set of features: fixes high variance.
    3. Try getting additional features: fixes high bias.
    4. Try adding polynomial features: fixes high bias.
    5. Try decreasing the regularization parameter λ: fixes high bias.
    6. Try increasing the regularization parameter λ: fixes high variance.
Bias and variance of neural networks

    • A smaller neural network (analogous to a model with few parameters) is prone to high bias and underfitting, but is computationally cheaper.
    • A larger neural network (analogous to a model with many parameters) is prone to high variance and overfitting and is computationally more expensive, but this can be addressed with regularization and more training data.
Usually, a larger neural network with regularization performs better than a smaller one. To choose the number of hidden layers, start with one layer and increase gradually: split the data into training, cross-validation, and test sets, train a network for each candidate number of hidden layers, and choose the one with the lowest cross-validation cost, as sketched below.
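A sketch of that selection loop with scikit-learn's MLPClassifier; the hidden-layer sizes, the alpha regularization value, and the synthetic data are all assumptions, and cross-validation accuracy is used here as a proxy for the cross-validation cost:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Synthetic data standing in for a real problem.
    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_cv, y_train, y_cv = train_test_split(
        X, y, test_size=0.2, random_state=0)

    best = None
    for hidden in [(25,), (25, 25), (25, 25, 25)]:  # 1, 2, 3 hidden layers
        net = MLPClassifier(hidden_layer_sizes=hidden, alpha=0.01,
                            max_iter=2000, random_state=0)
        net.fit(X_train, y_train)
        cv_score = net.score(X_cv, y_cv)  # accuracy on the CV set
        if best is None or cv_score > best[0]:
            best = (cv_score, hidden)
    print(best)  # the network with the best cross-validation performance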
