Linear regression with regularization
When λ is very large, the penalty drives θ1, ..., θn toward zero, so hθ(x) ≈ θ0, a flat straight line, and the model underfits (high bias). When λ is very small (λ = 0 at the extreme), this is equivalent to no regularization at all, and the model overfits (high variance). Only an intermediate value of λ gives a fit that is just right. So how do we choose this λ value automatically?
Expressions for Jtrain(θ), JCV(θ) and Jtest(θ) under regularization
Jtrain(θ), JCV(θ) and Jtest(θ) are defined without the regularization term; only the cost function J(θ) that is minimized to fit θ includes it.
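Written out, the distinction looks like this. The original figure is missing, so the symbols are an assumption based on the usual notation for this material: m, m_cv and m_test are the sizes of the training, validation and test sets, and n is the number of features.

```latex
\begin{align*}
J(\theta) &= \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2
             + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2 \\
J_{train}(\theta) &= \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2 \\
J_{CV}(\theta) &= \frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}\bigl(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\bigr)^2 \\
J_{test}(\theta) &= \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}}\bigl(h_\theta(x_{test}^{(i)}) - y_{test}^{(i)}\bigr)^2
\end{align*}
```

The penalty sum starts at j = 1, so θ0 is not regularized.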
How to choose the regularization parameter λ (model selection for λ)
Try a range of λ values, from 0 up to 10.24 (rounded to 10 for ease of writing), where each nonzero value is twice the previous one: 0, 0.01, 0.02, 0.04, ..., 10.24. For each λ, minimize the regularized cost function J(θ) to obtain the parameters θ, then compute JCV(θ) with those parameters. Select the model with the smallest JCV(θ), for example θ(5) and its corresponding λ, and finally compute the test error Jtest(θ) for that model to see how well it predicts.
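The selection procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the synthetic data, the degree-8 polynomial features, the 50/20/20 split and all function names (poly_features, fit_regularized, squared_error, select_lambda) are made up for the example, and the closed-form solve stands in for whatever optimizer is actually used to minimize J(θ).

```python
import numpy as np

def poly_features(x, degree):
    """Map a 1-D input to the columns [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

def fit_regularized(X, y, lam):
    """Minimize the regularized cost J(theta) in closed form.

    Setting the gradient to zero gives (X^T X + lam * L) theta = X^T y,
    where L is the identity with L[0, 0] = 0 so theta_0 is not penalized.
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.lstsq(X.T @ X + lam * L, X.T @ y, rcond=None)[0]

def squared_error(theta, X, y):
    """Error WITHOUT the regularization term: 1/(2m) * sum((X theta - y)^2)."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def select_lambda(X_train, y_train, X_cv, y_cv, lambdas):
    """Fit theta for each lambda and keep the one with the smallest J_cv."""
    best = None
    for lam in lambdas:
        theta = fit_regularized(X_train, y_train, lam)
        j_cv = squared_error(theta, X_cv, y_cv)
        if best is None or j_cv < best[1]:
            best = (lam, j_cv, theta)
    return best

# Candidate values: 0, then 0.01 doubled ten times up to 10.24 (12 in total).
LAMBDAS = [0.0] + [0.01 * 2 ** k for k in range(11)]

# Hypothetical data, only so the sketch runs end to end.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 90)
y = np.sin(3 * x) + 0.2 * rng.standard_normal(90)
X = poly_features(x, 8)
X_train, y_train = X[:50], y[:50]
X_cv, y_cv = X[50:70], y[50:70]
X_test, y_test = X[70:], y[70:]

best_lam, best_j_cv, best_theta = select_lambda(X_train, y_train, X_cv, y_cv, LAMBDAS)
# The test set is touched only once, after lambda has been chosen.
j_test = squared_error(best_theta, X_test, y_test)
print(f"chosen lambda={best_lam}, J_cv={best_j_cv:.4f}, J_test={j_test:.4f}")
```

Note that λ is chosen on the validation set, so JCV(θ) of the winner is an optimistic estimate; the test error is reported separately for that reason.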
Jtrain(θ) (how well my hypothesis does on the training set), JCV(θ) (how well my hypothesis does on the validation set) and their relationship with λ
When λ is very small, there is effectively no regularization and overfitting (high variance) appears: Jtrain(θ) is very small but JCV(θ) is very large. When λ is very large, underfitting (high bias) appears: Jtrain(θ) and JCV(θ) are both very large.
The diagram above is somewhat idealized. Curves drawn from real data may be noisy and have twists and turns, but their general shape is the same. So we can plot such a graph to check whether the λ we chose is appropriate, or select the most appropriate λ value automatically in code.
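To look at the two curves for your own data, something like the following sketch can tabulate Jtrain(θ) and JCV(θ) against λ. The data, the polynomial degree and the helper names (fit, cost, error_curves) are assumptions made for illustration; the printed table can be handed to any plotting tool.

```python
import numpy as np

def fit(X, y, lam):
    """Closed-form minimizer of the regularized cost; theta_0 is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.lstsq(X.T @ X + penalty, X.T @ y, rcond=None)[0]

def cost(theta, X, y):
    """Unregularized error 1/(2m) * sum((X theta - y)^2)."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def error_curves(X_train, y_train, X_cv, y_cv, lambdas):
    """Return the lists J_train(lambda) and J_cv(lambda)."""
    j_train, j_cv = [], []
    for lam in lambdas:
        theta = fit(X_train, y_train, lam)
        j_train.append(cost(theta, X_train, y_train))
        j_cv.append(cost(theta, X_cv, y_cv))
    return j_train, j_cv

# Hypothetical noisy data and degree-8 polynomial features.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 80)
y = np.sin(3 * x) + 0.2 * rng.standard_normal(80)
X = np.vander(x, 9, increasing=True)

lambdas = [0.0] + [0.01 * 2 ** k for k in range(11)]
j_train, j_cv = error_curves(X[:50], y[:50], X[50:], y[50:], lambdas)

for lam, jt, jc in zip(lambdas, j_train, j_cv):
    print(f"lambda={lam:6.2f}  J_train={jt:.4f}  J_cv={jc:.4f}")
```

On the training set, Jtrain(θ) can only stay the same or grow as λ increases, because a larger penalty pulls θ away from the pure least-squares fit. JCV(θ) has no such guarantee, and on noisy data its minimum may be shallow or sit at one end of the range.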
Bias vs. Variance (2): regularization and bias/variance, how to choose the right regularization parameter λ (model selection)