Git:https://github.com/linyi0604/machinelearning

Regularization:

Improves the generalization ability of the model on unseen data

Avoids overfitting the parameters

Commonly used regularization method:

Add a penalty on a parameter to the objective function

This reduces the impact of that parameter on the result

L1 regularization: Lasso

The L1-norm penalty on the weight vector is added to the objective function of the linear regression:

f = w * x^n + b + k * ||w||1

x is the input sample feature

w is the learned parameter for each feature

n is the polynomial degree

b is the offset (intercept)

||w||1 is the L1 norm of the feature parameters, used as the penalty term

k is the strength of the penalty
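A characteristic effect of the L1 penalty, not shown in the original post, is that it can drive useless weights to exactly zero. A minimal sketch with made-up toy data (the second feature is pure noise):

```python
import numpy as np
from sklearn.linear_model import Lasso

# toy data: the target depends only on the first feature; the second is noise
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = 3.0 * X[:, 0] + 0.01 * rng.randn(100)

# the L1 penalty (strength alpha, the "k" above) zeroes out the useless weight
lasso = Lasso(alpha=0.1)
lasso.fit(X, y)
print(lasso.coef_)  # second coefficient is (essentially) 0
```

This sparsity is why Lasso is also used for feature selection.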

L2-norm regularization: Ridge

The L2-norm penalty on the weight vector is added to the objective function of the linear regression:

f = w * x^n + b + k * ||w||2

x is the input sample feature

w is the learned parameter for each feature

n is the polynomial degree

b is the offset (intercept)

||w||2 is the L2 norm of the feature parameters, used as the penalty term

k is the strength of the penalty
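Unlike the L1 penalty, the L2 penalty shrinks weights without zeroing them; the norm of the penalized weight vector never exceeds the unpenalized one. A small sketch with made-up toy data (two nearly identical, collinear features):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# toy data with two nearly identical (collinear) features
rng = np.random.RandomState(0)
x1 = rng.randn(100)
X = np.c_[x1, x1 + 0.01 * rng.randn(100)]
y = x1 + 0.01 * rng.randn(100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# the L2 penalty shrinks the weight vector toward smaller, more stable values
print("OLS   coefficients:", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```

This stabilizing effect on correlated features is a main reason Ridge improves generalization.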

The following simulation predicts cake price from cake size.

A degree-4 polynomial linear model is used, which overfits the training data.

The two regularization methods are then fitted and their predictions compared.

```python
from sklearn.linear_model import LinearRegression, Lasso, Ridge
# import the polynomial feature generator
from sklearn.preprocessing import PolynomialFeatures

# training data: sample features and target values
x_train = [[6], [8], [10], [14], [18]]
y_train = [[7], [9], [13], [17.5], [18]]

# prepare test data
x_test = [[6], [8], [11], [16]]
y_test = [[8], [12], [15], [18]]

# degree-4 polynomial feature generator
poly4 = PolynomialFeatures(degree=4)
x_train_poly4 = poly4.fit_transform(x_train)
x_test_poly4 = poly4.transform(x_test)

# plain degree-4 linear regression model (overfits)
regressor_poly4 = LinearRegression()
regressor_poly4.fit(x_train_poly4, y_train)
print("degree-4 linear model score:",
      regressor_poly4.score(x_test_poly4, y_test))  # 0.8095880795746723

# fit and predict with the L1-norm regularized linear model
lasso_poly4 = Lasso()
lasso_poly4.fit(x_train_poly4, y_train)
print("L1 regularization score:",
      lasso_poly4.score(x_test_poly4, y_test))  # 0.8388926873604382

# fit and predict with the L2-norm regularized linear model
ridge_poly4 = Ridge()
ridge_poly4.fit(x_train_poly4, y_train)
print("L2 regularization score:",
      ridge_poly4.score(x_test_poly4, y_test))  # 0.8374201759366456
```
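One way to see why the penalized models score better (a sketch, not part of the original script) is to compare the size of the learned weight vectors: the plain degree-4 fit needs large coefficients to interpolate the training points, while Lasso and Ridge keep them much smaller.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.preprocessing import PolynomialFeatures

# same cake data and degree-4 features as above
x_train = [[6], [8], [10], [14], [18]]
y_train = [7, 9, 13, 17.5, 18]
X = PolynomialFeatures(degree=4).fit_transform(x_train)

models = {"plain": LinearRegression(), "lasso": Lasso(), "ridge": Ridge()}
for name, model in models.items():
    model.fit(X, y_train)
    # norm of the learned weight vector: smaller means a smoother curve
    print(name, "||w||:", np.linalg.norm(model.coef_))
```

The regularized weight norms come out much smaller, which is exactly the effect the k * ||w|| penalty is designed to produce.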

Comparing the scores shows that the regularized models have better generalization ability.

Machine learning path: Python linear regression overfitting L1 and L2 regularization