1. Ridge regression and the lasso are used to solve the regression problem in exercise 6.10 (p. 279) of Xue Yi's textbook.
The statement of exercise 6.10 is as follows:
[Figure: statement of exercise 6.10 — http://www.dataguru.cn/kindeditor/attached/image/20140501/20140501171754_87741.jpg]
Enter the data from the exercise, build the data set, and fit an ordinary linear regression to examine the results.
cement <- data.frame(X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
                     X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
                     X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
                     X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12),
                     Y = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1,
                           115.9, 83.8, 113.3, 109.4))
cement
#    X1 X2 X3 X4     Y
# 1   7 26  6 60  78.5
# 2   1 29 15 52  74.3
# 3  11 56  8 20 104.3
# 4  11 31  8 47  87.6
# 5   7 52  6 33  95.9
# 6  11 55  9 22 109.2
# 7   3 71 17  6 102.7
# 8   1 31 22 44  72.5
# 9   2 54 18 22  93.1
# 10 21 47  4 26 115.9
# 11  1 40 23 34  83.8
# 12 11 66  9 12 113.3
# 13 10 68  8 12 109.4
lm.sol <- lm(Y ~ ., data = cement)
summary(lm.sol)
##
# Call:
# lm(formula = Y ~ ., data = cement)
##
# Residuals:
#    Min     1Q Median     3Q    Max
# -3.175 -1.671  0.251  1.378  3.925
##
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   62.405     70.071    0.89    0.399
# X1             1.551      0.745    2.08    0.071 .
# X2             0.510      0.724    0.70    0.501
# X3             0.102      0.755    0.14    0.896
# X4            -0.144      0.709   -0.20    0.844
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
# Residual standard error: 2.45 on 8 degrees of freedom
# Multiple R-squared: 0.982, Adjusted R-squared: 0.974
# F-statistic: 111 on 4 and 8 DF, p-value: 4.76e-07
# From the output, neither the intercept nor any of the regression coefficients is significant at the 5% level, even though the overall fit is very good (R-squared = 0.982).
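To see where this problem comes from, one supplementary check (not part of the original solution) is to inspect the pairwise correlations among the predictors; strongly correlated columns of the design matrix are exactly what inflate the standard errors above.
# Supplementary check (not in the original): pairwise correlations among the predictors.
# Absolute correlations close to 1 signal collinearity.
round(cor(cement[, c("X1", "X2", "X3", "X4")]), 3)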
# Use the vif() function in the car package to check the collinearity of each variable
library(car)
vif(lm.sol)
#     X1     X2     X3     X4
#  38.50 254.42  46.87 282.51
# All four VIF values exceed 10, so there is clear multicollinearity; the VIFs of X2 and X4 both exceed 200.
plot(X2 ~ X4, col = "red", data = cement)
[Figure: scatter plot of X2 against X4 — http://www.dataguru.cn/kindeditor/attached/image/20140501/20140501171832_99070.jpg]
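For reference, the VIF of a predictor is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing that predictor on all the other predictors. The short sketch below is an illustrative addition (not from the original text) that reproduces this definition for X2; its result should match the value reported by vif() above.
# Illustrative sketch: VIF of X2 from its definition,
# 1 / (1 - R^2) of the regression of X2 on the remaining predictors.
r2.x2 <- summary(lm(X2 ~ X1 + X3 + X4, data = cement))$r.squared
1 / (1 - r2.x2)  # should agree with the vif() value for X2 (about 254)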
Next, use the lm.ridge() function in the MASS package to fit a ridge regression. In the calculation below, 151 values of lambda are tried, and the one that minimizes the generalized cross-validation (GCV) criterion is selected.
library(MASS)
##
# Attaching package: 'MASS'
##
# The following object is masked _by_ '.GlobalEnv':
##
#     cement
ridge.sol <- lm.ridge(Y ~ ., lambda = seq(0, 150, length = 151), data = cement,
                      model = TRUE)
names(ridge.sol)  # component names
# [1] "coef"   "scales" "Inter"  "lambda" "ym"     "xm"     "GCV"    "kHKB"
# [9] "kLW"
ridge.sol$lambda[which.min(ridge.sol$GCV)]  # the lambda at which the GCV is smallest
# [1] 1
ridge.sol$coef[which.min(ridge.sol$GCV)]  # the coefficient at the minimum-GCV position
# [1] 7.627
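As an optional cross-check (not in the original), MASS also provides select(), which for a ridgelm object prints the HKB and L-W estimates of the ridge constant together with the GCV-minimizing lambda.
# Optional cross-check: HKB and L-W estimates of the ridge constant and
# the lambda with the smallest GCV, as reported by MASS's select().
select(ridge.sol)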
par(mfrow = c(1, 2))
# Plot the coefficient paths and draw a vertical line at the lambda that minimizes the GCV
matplot(ridge.sol$lambda, t(ridge.sol$coef), xlab = expression(lambda), ylab = "Coefficients",
        type = "l", lty = 1:20)
abline(v = ridge.sol$lambda[which.min(ridge.sol$GCV)])
# The following statements plot the relationship between lambda and GCV
plot(ridge.sol$lambda, ridge.sol$GCV, type = "l", xlab = expression(lambda),
     ylab = "GCV")
abline(v = ridge.sol$lambda[which.min(ridge.sol$GCV)])
[Figure: ridge coefficient paths (left) and GCV against lambda (right) — http://www.dataguru.cn/kindeditor/attached/image/20140501/20140501171900_16714.jpg]
par(mfrow = c(1, 1))
# From the plots, the choice of lambda is not critical: as long as it is not too close to 0, the results differ little.
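To make this remark concrete, a supplementary check (not in the original) is to compare the coefficients that lm.ridge() returns for a handful of lambda values; once lambda moves away from 0, the estimates change only gradually.
# Supplementary check: coefficients (on the original scale) for several lambda values.
round(coef(lm.ridge(Y ~ ., data = cement, lambda = c(0, 1, 5, 10, 50))), 3)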
# The following uses the linearRidge() function in the ridge package, which selects the ridge parameter automatically.
library(ridge)
mod <- linearRidge(Y ~ ., data = cement)
summary(mod)
##
# Call:
# linearRidge(formula = Y ~ ., data = cement)
##
##
# Coefficients:
#             Estimate Scaled estimate Std. Error (scaled) t value (scaled)
# (Intercept)   83.704              NA
# X1             1.292          26.332               3.672             7.17
# X2             0.298          16.046               3.988             4.02
# X3            -0.148          -3.279               3.598             0.91
# X4            -0.351         -20.329               3.996             5.09
#             Pr(>|t|)
# (Intercept)       NA
# X1           7.5e-13 ***
# X2           5.7e-05 ***
# X3              0.36
# X4           3.6e-07 ***
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
# Ridge parameter: 0.01473, chosen automatically, computed using 2 PCs
##
# Degrees of freedom: model 3.01, variance 2.84, residual 3.18
# From the model output, the automatically chosen ridge parameter is 0.0147, and the significance of each coefficient improves considerably (only X3 remains non-significant).
Finally, lasso regression is used to solve the problem of collinearity.
library(lars)
# Loaded lars 1.2
x <- as.matrix(cement[, 1:4])
y <- as.matrix(cement[, 5])
(laa <- lars(x, y, type = "lar"))  # lars() works on matrix data
##
# Call:
# lars(x = x, y = y, type = "lar")
# R-squared: 0.982
# Sequence of LAR moves:
#      X4 X1 X2 X3
# Var   4  1  2  3
# Step  1  2  3  4
# The LAR path adds the variables in the order X4, X1, X2, X3.
plot(laa)  # plot the coefficient paths
[Figure: LAR coefficient paths — http://www.dataguru.cn/kindeditor/attached/image/20140501/20140501171925_25088.jpg]
summary(laa)  # gives the Cp value of each step
# LARS/LAR
# Call: lars(x = x, y = y, type = "lar")
#   Df  RSS     Cp
# 0  1 2716 442.92
# 1  2 2219 361.95
# 2  3 1918 313.50
# 3  4   48   3.02
# 4  5   48   5.00
# Following the explanation of Cp given in class (the smaller, the better), the Cp value is minimized at step 3, i.e. the selected variables are X4, X1, and X2.
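As a final supplementary step (not shown in the original), the coefficients of that minimum-Cp model can be read off the lars object; assuming laa$beta stores one row of coefficients per LAR step (row 1 being step 0 with all coefficients zero), the row for the best step should show X3 with a coefficient of exactly 0.
# Supplementary sketch: coefficients at the step with the smallest Cp.
best <- which.min(summary(laa)$Cp)  # index of the minimum-Cp row (step 3 here)
laa$beta[best, ]                    # X4, X1, X2 are nonzero; X3 stays at 0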