The function for linear regression analysis in R is lm().
(1) Simple (one-variable) linear regression
Based on the data, we can analyze whether the strength of the alloy (y) is related to its carbon content (x).
First read the data into R using the following commands:
x <- c(seq(0.10, 0.18, by = 0.01), 0.20, 0.21, 0.23)
y <- c(42.0, 43.5, 45.0, 45.5, 45.0, 47.5, 49.0, 53.0, 50.0, 55.0, 55.0, 60.0)
plot(x, y)
The scatter plot shows an approximately linear relationship between the two variables x and y.
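As a rough numerical check on what the scatter plot suggests, the sample correlation coefficient can be computed with cor(). This is only a small sketch added for illustration; the variable name r is an assumption, not part of the original walkthrough.

```r
# Same data as in the commands above
x <- c(seq(0.10, 0.18, by = 0.01), 0.20, 0.21, 0.23)
y <- c(42.0, 43.5, 45.0, 45.5, 45.0, 47.5, 49.0, 53.0, 50.0, 55.0, 55.0, 60.0)

# Pearson correlation; a value close to 1 indicates a strong positive linear association
r <- cor(x, y)
print(r)

# For a least-squares line with an intercept and a single predictor,
# the squared correlation equals the R-squared of the fit
print(r^2)
```

A value of r close to 1 supports fitting a straight line, although it does not replace looking at the plot itself.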
Therefore, the lm() function can be used to fit a straight line. The regression is fitted and summarized as follows:
lm.sol <- lm(y ~ 1 + x)  # lm() returns an object holding the fitted result; view its contents with summary()
summary(lm.sol)
Regression results:
Call:
lm(formula = y ~ 1 + x)
Residuals:
    Min      1Q  Median      3Q     Max
-2.0431 -0.7056  0.1694  0.6633  2.2653
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   28.493      1.580   18.04 5.88e-09 ***
x            130.835      9.683   13.51 9.50e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.319 on 10 degrees of freedom
Multiple R-squared:  0.9481,  Adjusted R-squared:  0.9429
F-statistic: 182.6 on 1 and 10 DF,  p-value: 9.505e-08
Analysis of the results:
The two regression coefficients are 28.493 (intercept) and 130.835 (slope). The output also gives a t value and a p-value for each coefficient. The smaller the p-value, the more significant the coefficient, and the more stars are printed after it.
The R-squared value on the second-to-last line measures goodness of fit: the closer it is to 1, the better the regression fits the data.
Therefore, the regression in this example is highly significant, and the regression line is: y = 28.493 + 130.835x.
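The quantities discussed above do not have to be read off the printed summary. As a minimal sketch (the variable name s is chosen here for illustration), they can be extracted from the fitted object with coef() and from the components of the object returned by summary():

```r
x <- c(seq(0.10, 0.18, by = 0.01), 0.20, 0.21, 0.23)
y <- c(42.0, 43.5, 45.0, 45.5, 45.0, 47.5, 49.0, 53.0, 50.0, 55.0, 55.0, 60.0)

lm.sol <- lm(y ~ 1 + x)
s <- summary(lm.sol)

# Named vector with the intercept and the slope
print(coef(lm.sol))

# Matrix with columns Estimate, Std. Error, t value, Pr(>|t|)
print(s$coefficients)

# Goodness-of-fit measures and the residual standard error
print(s$r.squared)
print(s$adj.r.squared)
print(s$sigma)

# p-value of the slope only
print(s$coefficients["x", "Pr(>|t|)"])
```

Working with these components is convenient when the results feed into further calculations instead of being inspected by eye.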
Once the regression has been fitted, it can be used for prediction: given a value of x, we can find the interval that contains the corresponding y value with probability 0.95 (the 95% prediction interval). In R this is done with the predict() function:
> new <- data.frame(x = 0.16)  # Note: even a single value must be supplied as a data frame.
> lm.pred <- predict(lm.sol, new, interval = "prediction", level = 0.95)  # interval = "prediction" requests the corresponding prediction interval.
> lm.pred
       fit      lwr      upr
1 49.42639 46.36621 52.48657
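To make clear where these three numbers come from, the interval can also be computed by hand from the standard formula for simple linear regression, fit ± t(0.975, n-2) · s · sqrt(1 + 1/n + (x0 - mean(x))² / Sxx), where s is the residual standard error. The following is a sketch under that textbook formula; the names fit, x0, half, manual and auto are chosen for illustration only.

```r
x <- c(seq(0.10, 0.18, by = 0.01), 0.20, 0.21, 0.23)
y <- c(42.0, 43.5, 45.0, 45.5, 45.0, 47.5, 49.0, 53.0, 50.0, 55.0, 55.0, 60.0)

fit <- lm(y ~ x)
x0  <- 0.16                       # point at which to predict
n   <- length(x)
s   <- summary(fit)$sigma         # residual standard error
sxx <- sum((x - mean(x))^2)       # sum of squared deviations of x

# Point prediction: intercept + slope * x0
y0 <- sum(coef(fit) * c(1, x0))

# Half-width of the 95% prediction interval
half <- qt(0.975, df = n - 2) * s * sqrt(1 + 1 / n + (x0 - mean(x))^2 / sxx)

manual <- c(fit = y0, lwr = y0 - half, upr = y0 + half)
auto   <- predict(fit, data.frame(x = x0), interval = "prediction", level = 0.95)

print(manual)
print(auto)
```

Both printed results should agree, which shows that predict() with interval = "prediction" implements this formula for the one-variable case.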
(2) Multiple linear regression
First read the data into R:
> x1 <- c(76.0, 91.5, 85.5, 82.5, 79.0, 80.5, 74.5, 79.0, 85.0, 76.5, 82.0, 95.0, 92.5)
> x2 <- c(50, 20, 20, 30, 30, 50, 60, 50, 40, 55, 40, 40, 20)
> y <- c(120, 141, 124, 126, 117, 125, 123, 125, 132, 123, 132, 155, 147)
> mydata <- data.frame(x1, x2, y)
Then fit the linear regression:
> lm.sol <- lm(y ~ x1 + x2, data = mydata)
> summary(lm.sol)
Results:
Call:
lm(formula = y ~ x1 + x2, data = mydata)
Residuals:
    Min      1Q  Median      3Q     Max
-4.0404 -1.0183  0.4640  0.6908  4.3274
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -62.96336   16.99976  -3.704 0.004083 **
x1            2.13656    0.17534  12.185 2.53e-07 ***
x2            0.40022    0.08321   4.810 0.000713 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.854 on 10 degrees of freedom
Multiple R-squared:  0.9461,  Adjusted R-squared:  0.9354
F-statistic: 87.84 on 2 and 10 DF,  p-value: 4.531e-07
From the results above, the tests for the individual regression coefficients and for the regression equation as a whole are all significant, so the regression equation is: y = -62.96336 + 2.13656 x1 + 0.40022 x2
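For readers who want to see what lm() estimates here, the coefficients are the least-squares solution of the normal equations (X'X) b = X'y, where X is the design matrix with a column of ones for the intercept. The sketch below solves them directly and compares the result with lm(); the names X, beta and fit are illustrative, and lm() itself uses a QR decomposition internally rather than this direct approach.

```r
x1 <- c(76.0, 91.5, 85.5, 82.5, 79.0, 80.5, 74.5, 79.0, 85.0, 76.5, 82.0, 95.0, 92.5)
x2 <- c(50, 20, 20, 30, 30, 50, 60, 50, 40, 55, 40, 40, 20)
y  <- c(120, 141, 124, 126, 117, 125, 123, 125, 132, 123, 132, 155, 147)

# Design matrix: intercept column plus the two predictors
X <- cbind(1, x1, x2)

# Solve (X'X) b = X'y; crossprod(X) is t(X) %*% X, crossprod(X, y) is t(X) %*% y
beta <- solve(crossprod(X), crossprod(X, y))
print(beta)

# Reference fit with lm()
fit <- lm(y ~ x1 + x2)
print(coef(fit))
```

The two sets of numbers should match to many decimal places. Solving the normal equations directly is fine for a small, well-conditioned example like this one, but lm() is the safer choice in general.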
As with simple regression, the predict() function can be used to make predictions.
The following predicts the blood pressure of a 60-year-old man weighing 80 kg (x1 = 80, x2 = 60):
> new <- data.frame(x1 = 80, x2 = 60)
> lm.pred <- predict(lm.sol, new, interval = "prediction", level = 0.95)
> lm.pred
       fit      lwr      upr
1 131.9743 124.6389 139.3096
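The interval argument of predict() also accepts "confidence", which gives an interval for the mean response at the given predictor values rather than for a single new observation. As a brief sketch (the names fit, conf.int and pred.int are chosen here for illustration), the two can be compared side by side:

```r
x1 <- c(76.0, 91.5, 85.5, 82.5, 79.0, 80.5, 74.5, 79.0, 85.0, 76.5, 82.0, 95.0, 92.5)
x2 <- c(50, 20, 20, 30, 30, 50, 60, 50, 40, 55, 40, 40, 20)
y  <- c(120, 141, 124, 126, 117, 125, 123, 125, 132, 123, 132, 155, 147)
mydata <- data.frame(x1, x2, y)

fit <- lm(y ~ x1 + x2, data = mydata)
new <- data.frame(x1 = 80, x2 = 60)

# Interval for the mean response at (x1 = 80, x2 = 60)
conf.int <- predict(fit, new, interval = "confidence", level = 0.95)

# Interval for one new observation at the same point
pred.int <- predict(fit, new, interval = "prediction", level = 0.95)

print(conf.int)
print(pred.int)
```

Both intervals are centered on the same fitted value, but the prediction interval is wider because it also accounts for the random error of an individual observation.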
Learning basic knowledge of R language (v): Linear regression analysis in R