Machine Learning 3, after-class exercise: using ridge regression and the lasso to select variables


      • Topic
      • Preparation
        • 1 Preparing to install and load packages
        • 2 Read-in data
      • Multi-collinearity Check
        • 1 All variables participate in linear regression
        • 2 Condition number (kappa) check
      • Ridge regression
        • 1 All variables do ridge regression
        • 2 Remove X3 and do ridge regression
      • Use linearRidge to do ridge regression automatically
      • Lasso
        • 1 Variable selection order
        • 2 Which variables to select

1. Topic

Use ridge regression and the lasso to solve the regression problem in Example 6.10 of the Shiry book, page 279 (PDF p. 331).

2. Preparation 2.1. Preparing to install and load packages

The R functions used and their corresponding packages:

Function      Purpose                                                       Package
lm.ridge      Ridge regression                                              MASS
linearRidge   Ridge regression with automatic ridge parameter selection,    ridge
              Cule (2012)
lars          Least angle regression and the lasso                          lars

Note: to find which package provides a given function, go to http://cran.rstudio.com/ -> Task Views -> Machine Learning and search for a keyword such as "lars".

The execution code is as follows

install.packages("lars"#http://cran.rstudio.com/ ->TASK Views->Machine Learning-> search larslibrary#library#lm.ridge函数在ridge包library##linearRidge函数在MASS包
2.2. Read-in data
# Hald cement data used in Example 6.10
cement <- data.frame(
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12),
  Y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4)
)
3. Multi-collinearity Check 3.1. All variables participate in linear regression

summary(lm(Y~., data=cement))

From the output, neither the intercept nor the individual variables pass the coefficient significance tests, and the residual standard error is relatively large.

3.2. Condition number (kappa) check

Compute the condition number (kappa value) of the sample correlation matrix; from it we can judge that there is serious multicollinearity among the variables. (k < 100: the degree of collinearity is small; 100 < k < 1000: fairly strong multicollinearity; k > 1000: severe multicollinearity.)

> kappa(cor(cement), exact=TRUE)
[1] 2009.756
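As a side note (a minimal sketch, not part of the original example), the same diagnostic can also be applied to the correlation matrix of the predictors X1-X4 alone, excluding the response:

kappa(cor(cement[, 1:4]), exact = TRUE)   # condition number of the predictor correlation matrix only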
4. Ridge regression 4.1. All variables do ridge regression

Take k in [0, 0.1] with step 0.001 and compute the family of ridge regression estimates for the different values of k.

# lm.ridge(Y~., cement, lambda=seq(0, 0.1, 0.001))
plot(lm.ridge(Y~., cement, lambda=seq(0, 0.1, 0.001)))

In the ridge trace plot, one line's coefficient stays very small in absolute value and crosses from positive to negative as k grows; that variable should be discarded.
Since the plot does not label which line belongs to which variable, look at the list of coefficient values β for each variable over k in [0, 0.1]; only X3 goes from positive to negative, so that line must be X3.
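A minimal sketch of how that β value list can be produced (the object name fit.ridge is my own; coef() on an lm.ridge fit returns one row of coefficients per value of k):

fit.ridge <- lm.ridge(Y ~ ., cement, lambda = seq(0, 0.1, 0.001))
coef(fit.ridge)   # one row per k; the X3 column is the one that moves from positive to negative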

Linear regression after removing X3:
summary(lm(Y~.-X3, data=cement))

From the output, the coefficient tests of X2 and X4 are still poor and the residual standard error is still relatively large, so the result is not yet satisfactory; further ridge regression is needed.

4.2. Remove X3 and do ridge regression

Variable selection rules based on ridge regression:

    • After centring and standardizing, variables whose ridge coefficients remain stable but very small in absolute value can be eliminated.
    • As k increases, variables whose coefficients are unstable (shuttling between positive and negative) and shrink toward zero can be eliminated.

Remove X3 and do ridge regression:
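A minimal sketch of this step, reusing the lambda grid from section 4.1 (the original does not show the call, so the grid is an assumption):

plot(lm.ridge(Y ~ . - X3, cement, lambda = seq(0, 0.1, 0.001)))   # ridge trace for X1, X2, X4
coef(lm.ridge(Y ~ . - X3, cement, lambda = seq(0, 0.1, 0.001)))   # corresponding β value list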



Combining the ridge trace plot with the β value list, the coefficient of X4 is found to be small, so the variable X4 can be deleted.

Linear regression after removing X3 and X4:
summary(lm(Y~.-X3-X4, data=cement))

The coefficient tests of the intercept and of both variables now reach the three-star significance level.
The final model is: y^ = 52.577 + 1.4683x1 + 0.6622x2.
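As a quick check (a sketch, not part of the original write-up), these coefficients can be reproduced from the reduced ordinary least squares fit:

coef(lm(Y ~ X1 + X2, data = cement))   # roughly 52.577, 1.468, 0.662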

5. Use linearRidge to do ridge regression automatically

The ridge parameter is chosen automatically (Ridge parameter: 0.01473162), and X3 is discarded.

> summary(linearRidge(Y~., data=cement))   # ridge parameter chosen automatically: 0.01473162; X3 can be discarded

Call:
linearRidge(formula = Y ~ ., data = cement)

Coefficients:
            Estimate Scaled estimate Std. Error (scaled) t value (scaled) Pr(>|t|)
(Intercept)  83.7040              NA                  NA               NA       NA
X1            1.2922         26.3321              3.6721            7.171 7.45e-13 ***
X2            0.2977         16.0463              3.9883            4.023 5.74e-05 ***
X3           -0.1478         -3.2790              3.5979            0.911    0.362
X4           -0.3506        -20.3290              3.9963            5.087 3.64e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Ridge parameter: 0.01473162, chosen automatically, computed using 2 PCs

Degrees of freedom: model 3.01 , variance 2.837 , residual 3.183

After removing X3, the coefficient of X4 only gets a '.' significance code.

> summary(linearRidge(Y~.-X3, data=cement))   # X4 only gets a '.' significance code

Call:
linearRidge(formula = Y ~ . - X3, data = cement)

Coefficients:
            Estimate Scaled estimate Std. Error (scaled) t value (scaled) Pr(>|t|)
(Intercept)  72.8790              NA                  NA               NA       NA
X1            1.4436         29.4160              2.2433           13.113  < 2e-16 ***
X2            0.4003         21.5778              7.8141            2.761
X4           -0.2501        -14.5015              7.8464            1.848  0.06458 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

After removing X3 and X4:

> summary(linearRidge(Y~.-X3-X4, data=cement))

Call:
linearRidge(formula = Y ~ . - X3 - X4, data = cement)

Coefficients:
            Estimate Scaled estimate Std. Error (scaled) t value (scaled) Pr(>|t|)
(Intercept)  52.7494              NA                  NA               NA       NA
X1            1.4629         29.8090              2.3450            12.71   <2e-16 ***
X2            0.6595         35.5511              2.3450            15.16   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: the coefficient test of each variable reaches the three-star level. The final regression model is:
y^ = 52.7494 + 1.4629x1 + 0.6595x2.

6. Lasso 6.1. Variable selection order

The execution code is as follows:

> w = as.matrix(cement)   # lars() expects a matrix, so convert the data frame
> library(lars)
> lars(w[,1:4], w[,5])

Call:
lars(x = w[, 1:4], y = w[, 5])
R-squared: 0.982
Sequence of LASSO moves:
     X4 X1 X2 X3
Var   4  1  2  3
Step  1  2  3  4
6.2. Which variables to select

plot(lars(w[,1:4], w[,5]))
The plot shows that X3's coefficient stays at zero along the path; combined with the summary below, X3 can be removed.

> summary(lars(w[,1:4], w[,5]))
LARS/LASSO
Call: lars(x = w[, 1:4], y = w[, 5])
  Df     Rss       Cp
0  1 2715.76 442.9167
1  2 2219.35 361.9455
2  3 1917.55 313.5020
3  4   47.97   3.0184
4  5   47.86   5.0000

At step 3 the Cp statistic (Mallows's Cp) takes its lowest value, and the RSS is already relatively small.
The variables that have entered the model by step 3 are X4, X1 and X2, so those are the variables finally selected.
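A small sketch of how to inspect the coefficients along the lasso path (the object name fit.lasso is my own); the row reached after step 3 should be non-zero only for X4, X1 and X2:

fit.lasso <- lars(w[, 1:4], w[, 5])
coef(fit.lasso)   # coefficient matrix, one row per step; zero entries mark excluded variables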

Linear regression after removing X3:
summary(lm(Y~.-X3, data=cement))

The coefficient test for X4 turns out to be very poor, so X4 is removed as well (removing X4 is not done by the lasso itself: after the lasso screening, the judgment is made manually from the coefficient tests of the linear regression), which gives the final formula.

