Assessing Model Performance
1. Training Error
The first step is to train the model: set aside part of the data as training data and fit the model to it.
The loss function can be computed with methods such as absolute error or squared error; here squared error is used, i.e. (y - f(x))².
Compute this error for every training point, then take the average over all points; that average is the training error.
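The averaging step above can be sketched in a few lines. This is a minimal illustration, not code from the course; the function names and the toy house data are made up:

```python
def squared_error(y, y_hat):
    """Squared error for one prediction: (y - f(x))^2."""
    return (y - y_hat) ** 2

def training_error(xs, ys, model):
    """Average squared error of `model` over all training points."""
    return sum(squared_error(y, model(x)) for x, y in zip(xs, ys)) / len(xs)

# Toy example: house sizes vs. prices, with a hypothetical model f(x) = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.5, 5.5]
model = lambda x: 2.0 * x
print(training_error(xs, ys, model))  # ~0.1667 = (0 + 0.25 + 0.25) / 3
```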
Does a smaller training error always mean a better model? No. Consider the relationship between training error and model complexity.
As the graph shows, training error keeps shrinking as the model grows more complex, and eventually overfitting appears.
The training data very likely contains some anomalous points; if the model fits every point exactly, it overfits those anomalies and predicts house prices poorly.
A small training error therefore does not imply good predictions.
2. Generalization (True) Error
The first thing to note is that the true error cannot be computed directly.
Computing it would require knowing the true values, and the training data does not necessarily reflect them; the true value can only be estimated, for example by averaging over the training data.
For example, to estimate house A's true price, find all houses similar to A and average their prices.
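The similar-houses averaging idea above can be sketched as a nearest-neighbor average. Everything here (function name, k, the toy (size, price) pairs) is illustrative, not from the course:

```python
def estimate_true_price(target_size, houses, k=3):
    """Approximate the 'true' price for a house of `target_size` by
    averaging the prices of the k most similar (closest-size) houses.

    `houses` is a list of (size, price) pairs.
    """
    nearest = sorted(houses, key=lambda h: abs(h[0] - target_size))[:k]
    return sum(price for _, price in nearest) / len(nearest)

houses = [(90, 300), (100, 320), (110, 340), (200, 600)]
print(estimate_true_price(105, houses, k=3))  # averages 320, 340, 300 -> 320.0
```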
Now consider the relationship between true error and model complexity:
[Figure: true error vs. model complexity; the lowest true error falls in the lighter-colored middle region.]
It can be seen that neither a model that is too simple nor one that is too complex predicts the data well.
3. Test Error
Test error approximates the true error; it is computed on a held-out test data set.
The calculation on the test set is the same as for training error, just over different data points.
Comparing training, true, and test error against model complexity:
Test error fluctuates around the true error, staying close to its value.
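The test-error procedure is: hold out part of the data before fitting, then evaluate on it afterwards. A minimal sketch with synthetic data (the split sizes, noise level, and the stand-in model are all assumptions for illustration):

```python
import random

def mean_squared_error(xs, ys, model):
    """Average squared error over a set of points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data from y = 2x plus small noise; shuffle, then split
# into a training set and a held-out test set.
random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(20)]
random.shuffle(data)
train, test = data[:15], data[15:]

# Stand-in for a model fitted on `train` (here we just use the true slope).
model = lambda x: 2.0 * x

test_xs, test_ys = zip(*test)
test_error = mean_squared_error(test_xs, test_ys, model)
print(test_error)  # small, since the model matches the generating process
```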
To summarize, the criterion for judging overfitting:
Estimate two sets of parameters w and w' for the model. If
1. training error(w) < training error(w'), and
2. true error(w) > true error(w'),
then w overfits.
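The two-condition criterion above translates directly into code. A tiny illustrative helper (the function name and the numbers in the example are made up):

```python
def overfits(train_err_w, true_err_w, train_err_w2, true_err_w2):
    """w overfits relative to w' when it does better on training data
    but worse on the true (generalization) error."""
    return train_err_w < train_err_w2 and true_err_w > true_err_w2

# w has lower training error but higher true error than w' -> w overfits.
print(overfits(0.1, 0.9, 0.3, 0.4))  # True
```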
Three Sources of Error
Noise, Bias, Variance
1. Noise
Noise is inherent in the data and irreducible; no model can remove it.
2. Bias
The simpler the model, the larger the bias.
The more complex the model, the smaller the bias.
3. Variance
A simple model has low variance.
A complex model has high variance.
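The simple-vs-complex contrast above can be demonstrated by refitting two models on many resampled datasets and measuring how much their prediction at one fixed point varies. This is a sketch under assumed settings (the true function x², the noise level, a mean-predictor as the "simple" model, and 1-nearest-neighbor as the "complex" model are all choices for illustration):

```python
import random

random.seed(1)

def true_f(x):
    return x * x

def sample_dataset(n=20, noise=0.5):
    """Draw a fresh noisy dataset from the true function."""
    out = []
    for _ in range(n):
        x = random.uniform(0, 1)
        out.append((x, true_f(x) + random.gauss(0, noise)))
    return out

def fit_constant(data):
    """Simple model: predict the mean y everywhere (high bias, low variance)."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_1nn(data):
    """Flexible model: predict the y of the nearest training x
    (low bias, high variance)."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

# Refit both models on 200 resampled datasets; record each prediction at x0.
x0 = 0.5
const_preds, nn_preds = [], []
for _ in range(200):
    data = sample_dataset()
    const_preds.append(fit_constant(data)(x0))
    nn_preds.append(fit_1nn(data)(x0))

print(variance(const_preds), variance(nn_preds))  # flexible model varies far more
```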
Bias and variance trade off against each other; neither can be computed exactly, since the true function is unknown.
Training error versus the amount of training data, at fixed model complexity: with little data the model fits well and the training error is small; as the amount of data increases, the training error rises, approaching a limit where it equals the true error.
True error versus the amount of data, at fixed model complexity: with little data the true error is large; as the amount of data increases, the true error falls, converging to the same limit as the training error.
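These learning curves can be reproduced with a small simulation: fit a fixed-complexity model (a line) on growing training sets and track training error alongside error on a large held-out set as a proxy for the true error. All settings here (the generating line 3x + 1, noise level, sample sizes) are illustrative assumptions:

```python
import random

random.seed(0)

def sample(n):
    """Draw n noisy points from an assumed true line y = 3x + 1."""
    out = []
    for _ in range(n):
        x = random.uniform(0, 10)
        out.append((x, 3 * x + 1 + random.gauss(0, 1)))
    return out

def fit_line(data):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(data)
    sx = sum(x for x, _ in data)
    sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

def mse(data, model):
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)

# Large held-out set as a proxy for the true error.
test_set = sample(1000)

results = {}
for n in (2, 10, 100, 1000):
    train = sample(n)
    model = fit_line(train)
    results[n] = (mse(train, model), mse(test_set, model))
    print(n, results[n])
# With n = 2 the line fits the training points exactly (training error ~0)
# while the true error is larger; as n grows, the two curves converge
# toward the noise level.
```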
Source: Coursera, Machine Learning: Regression, "Assessing Performance".