Research on statistical analysis techniques in R: principles and applications of ridge regression

Principles and applications of ridge regression

Author: Ma Wenmin

Ridge regression is a biased-estimation regression method designed for analyzing collinear data. It is essentially a modified least squares method: by giving up the unbiasedness of least squares, it obtains the regression coefficients at the cost of losing some information and some precision, and in return yields a regression that is more realistic and more reliable. It tolerates ill-conditioned data far better than ordinary least squares.

Regression analysis: Regression analysis is a statistical method for establishing the quantitative relationship between two or more variables, and it is very widely used. By the number of variables involved, it is divided into simple (one-variable) regression and multivariate regression. By the number of dependent variables, it is divided into simple regression analysis and multiple regression analysis. By the type of relationship between the independent and dependent variables, it is divided into linear regression analysis and nonlinear regression analysis. If a regression analysis includes only one independent variable and one dependent variable, and their relationship can be approximated by a straight line, it is called simple linear regression. If it includes two or more independent variables, and the relationship between the dependent variable and the independent variables is linear, it is called multiple linear regression.
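The one-variable straight-line case described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library; the function name `fit_line` and the five data points are made up for the example and do not come from the original article.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the ratio of the x-y co-variation to the x variation.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: one independent variable, one dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```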

The principle of ridge regression: The principle of ridge regression is somewhat involved. According to the Gauss-Markov theorem, multicollinearity does not affect the unbiasedness or the minimum-variance property of the least squares estimator. However, although the least squares estimator has the smallest variance among all linear unbiased estimators, that variance is not necessarily small. In fact, a biased estimator can be found that, despite a small bias, is much more precise than the unbiased estimator. Ridge regression is based on this principle: it obtains the regression estimates by introducing a bias constant into the normal equations. The details can be found in the literature.
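In the usual formulation, the bias constant k is added to the diagonal of X'X, so the normal equations become (X'X + kI) b = X'y, and k = 0 gives back ordinary least squares. The following Python sketch, standard library only, shows the effect on two nearly collinear predictors. The data, the value k = 0.1, the omission of an intercept, and the helper names `solve` and `ridge_fit` are all assumptions made for illustration, not taken from the article.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, k):
    """Solve (X'X + k*I) b = X'y. With k = 0 this is ordinary least squares."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += k  # the bias constant on the diagonal
    return solve(xtx, xty)


# Hypothetical data: x2 is almost a copy of x1, and y is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.98, 3.02, 3.99, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
X = [[a, b] for a, b in zip(x1, x2)]

beta_ols = ridge_fit(X, y, 0.0)    # unbiased but unstable under collinearity
beta_ridge = ridge_fit(X, y, 0.1)  # slightly biased, much more stable

print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
```

On data like this the least squares coefficients tend to come out large and of opposite sign, while the ridge coefficients are shrunk toward moderate values. How to choose k is a separate question that this sketch does not address.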

For some matrices, a small change in a single element can cause a large error in the final result of a calculation. Such a matrix is called ill-conditioned. Sometimes an unsuitable computational method can also make a well-conditioned matrix behave as if it were ill-conditioned. In Gaussian elimination, for example, very small pivot elements produce ill-conditioned behavior during the calculation.
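Both effects can be reproduced with 2x2 systems. The Python sketch below, standard library only, first perturbs one element of a nearly singular matrix, then solves a well-behaved system with and without row exchanges. The matrices and the function name `gauss_solve` are assumptions chosen for illustration.

```python
def gauss_solve(a, b, pivoting=True):
    """Gaussian elimination; partial pivoting can be switched off for comparison."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        if pivoting:
            best = max(range(col, n), key=lambda r: abs(m[r][col]))
            m[col], m[best] = m[best], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# 1) Ill-conditioned matrix: changing one element by 0.0001 moves the
#    solution from about (1, 1) to about (1.5, 0.5).
b = [2.0, 2.0001]
x_original = gauss_solve([[1.0, 1.0], [1.0, 1.0001]], b)
x_perturbed = gauss_solve([[1.0, 1.0], [1.0, 1.0002]], b)
print("original: ", x_original)
print("perturbed:", x_perturbed)

# 2) Well-conditioned matrix, tiny pivot: the true solution is close to (1, 1).
#    Without row exchanges the tiny pivot wipes out the first unknown.
a_small_pivot = [[1e-20, 1.0], [1.0, 1.0]]
rhs = [1.0, 2.0]
x_naive = gauss_solve(a_small_pivot, rhs, pivoting=False)
x_pivoted = gauss_solve(a_small_pivot, rhs, pivoting=True)
print("no pivoting:     ", x_naive)
print("partial pivoting:", x_pivoted)
```

The first case is a property of the matrix itself and no algorithm can remove it. The second is a property of the method, and choosing the largest available pivot fixes it.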

The R-squared value of a ridge regression equation is slightly lower than that of ordinary regression, but the significance of the regression coefficients is often higher than in ordinary regression. Ridge regression is therefore of great practical value in studies affected by collinearity and ill-conditioned data.
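The R-squared comparison can be checked numerically. Because least squares minimizes the residual sum of squares, any ridge fit with k > 0 on the same data cannot have a higher R-squared. The Python sketch below, standard library only, fits two centered, nearly collinear predictors both ways. The data, k = 0.1, and the helper names are assumptions for illustration, and the sketch does not attempt to reproduce the claim about coefficient significance.

```python
def center(values):
    """Subtract the mean, so that no separate intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def fit_two_predictors(x1, x2, y, k):
    """Solve the 2x2 equations (X'X + k*I) b = X'y by Cramer's rule."""
    a11 = sum(a * a for a in x1) + k
    a12 = sum(a * b for a, b in zip(x1, x2))
    a22 = sum(b * b for b in x2) + k
    c1 = sum(a * t for a, t in zip(x1, y))
    c2 = sum(b * t for b, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    return (c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det


def r_squared(x1, x2, y, coef):
    """1 - RSS/TSS; y is already centered, so TSS is just the sum of y squared."""
    rss = sum((t - coef[0] * a - coef[1] * b) ** 2 for a, b, t in zip(x1, x2, y))
    tss = sum(t * t for t in y)
    return 1.0 - rss / tss


# Hypothetical data with two nearly collinear predictors.
x1 = center([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = center([1.01, 1.98, 3.02, 3.99, 5.01])
y = center([2.1, 3.9, 6.2, 7.8, 10.1])

coef_ols = fit_two_predictors(x1, x2, y, 0.0)
coef_ridge = fit_two_predictors(x1, x2, y, 0.1)
r2_ols = r_squared(x1, x2, y, coef_ols)
r2_ridge = r_squared(x1, x2, y, coef_ridge)

print("OLS:   coefficients", coef_ols, "R^2", r2_ols)
print("Ridge: coefficients", coef_ridge, "R^2", r2_ridge)
```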

Applications of ridge regression. Application in poultry breeding: One paper discusses how to estimate poultry breeding values in the mixed linear model equations by ridge regression. In essence, the traditional mixed linear model equations are interpreted as a generalized ridge regression estimate, which offers a route that avoids estimating the genetic parameters. Taking the Muscovy duck as an example, and considering one trait and two fixed effects, generalized ridge regression was used to estimate the breeding values of male Muscovy ducks, and the results were compared with the best linear unbiased prediction (BLUP) method. The breeding values estimated by generalized ridge regression and by BLUP, and their rankings, were very similar: the correlation coefficient and the rank correlation coefficient reached 0.998 and 0.986, respectively. The prediction error rate of the generalized ridge regression method was also very low. This indicates that estimating animal breeding values in the mixed linear model equations by generalized ridge regression is feasible, and that the step of estimating genetic parameters can be omitted, which makes the application of the BLUP method in animal breeding more practical.

Simulation of satellite photographic data combining forward and inverse methods: Satellite photographic data are usually simulated by either a forward method or an inverse method. The forward method is simple and requires no substitution calculation, but the resulting ground point coordinates show a large discrepancy in the y direction. The inverse method avoids the y-direction discrepancy, but it must be based on existing DEM data, and the extent of the DEM data must be basically consistent with the range covered by the exterior orientation elements. The simulated data are therefore constrained by the available data sources.
