4. Lasso Regression and Ridge Regression (Machine Learning)

PDF version download address: https://pan.baidu.com/s/1i5JtT9j
HTML version download address: Https://pan.baidu.com/s/1kV0YVqv

LASSO, short for least absolute shrinkage and selection operator, was first proposed by Robert Tibshirani in 1996. Ridge regression, also known as Tikhonov regularization, is the most commonly used regularization method for the regression analysis of ill-posed problems.

The earlier discussion of model selection noted that the more parameters a model has, the more complex it is. For example, when the dimension of the data feature x is very high, even ordinary linear regression has many parameters to train, which can cause a certain degree of overfitting, and the resulting model is hard to interpret. This is where lasso regression comes in.
Ridge regression is a biased-estimation regression method suited to analyzing collinear data. It is essentially an improved least-squares estimation: it gives up the unbiasedness of ordinary least squares and, at the cost of some loss of information and precision, obtains more reliable regression coefficients. Just as a person should keep some bottom line, a regression should not go to extremes.

1 Basic Form

  The defining feature of lasso regression is that it performs variable selection while fitting the training data. What mechanism does it use to select variables? The answer: regularization. You can also simply think of the regularizer as a penalty term.

A brief review of the loss function of linear regression:

L(w) = \frac{1}{n}\sum_{i=1}^{n} (y_i - f(x_i))^2 = \frac{1}{n}\|y - Xw\|^2

Its analytic solution is:

w^* = (X^T X)^{-1} X^T y
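As an illustration (not from the original post), here is a minimal NumPy sketch of this closed-form solution; the data and variable names are invented for the demo.

```python
import numpy as np

# Synthetic data: n samples, d features (invented for this demo)
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=n)

# Closed-form solution w* = (X^T X)^{-1} X^T y;
# np.linalg.solve is used instead of explicitly inverting X^T X
w_star = np.linalg.solve(X.T @ X, X.T @ y)
print(w_star)  # close to true_w
```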

So when the dimension of the input space x is very large, overfitting may occur. Here we introduce a regularization term, that is, we place a constraint on w. The optimization problem then changes from the original

\min_w \frac{1}{2}\|y - Xw\|^2

to:

\min_w \frac{1}{2}\|y - Xw\|^2, \quad \text{s.t. } \|w\|_1 < \theta
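In practice this constrained problem is usually solved in its equivalent penalized (Lagrangian) form, \min_w \frac{1}{2n}\|y - Xw\|^2 + \alpha\|w\|_1, for example with scikit-learn's Lasso. A minimal sketch on synthetic data (the alpha value is an arbitrary choice for the demo):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data where only 2 of 10 features matter (invented for this demo)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# scikit-learn's Lasso minimizes (1/(2n))||y - Xw||^2 + alpha * ||w||_1
model = Lasso(alpha=0.1)  # alpha chosen arbitrarily for the demo
model.fit(X, y)
print(model.coef_)  # most entries are exactly zero
```

Unlike ridge, the L1 penalty drives many coefficients exactly to zero, which is the variable-selection behavior described above.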

  Ridge regression also controls the model coefficients by adding a regularization term. Its optimization problem is expressed as:

\min_w \frac{1}{2}\|y - Xw\|^2, \quad \text{s.t. } \|w\|_2^2 < \theta
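The corresponding penalized form, \min_w \frac{1}{2}\|y - Xw\|^2 + \frac{\lambda}{2}\|w\|^2, has the well-known closed-form solution w^* = (X^T X + \lambda I)^{-1} X^T y. A minimal sketch, with the value of \lambda chosen arbitrarily:

```python
import numpy as np

# Synthetic data (invented for this demo)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

# Ridge closed form: w* = (X^T X + lam * I)^{-1} X^T y
lam = 1.0  # regularization strength, arbitrary for the demo
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(w_ridge)  # shrunk toward zero relative to the OLS solution
```

Note that adding \lambda I makes X^T X + \lambda I invertible even when the features are collinear, which is exactly the ill-posed case mentioned above.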
