ISLR; R language; machine learning; linear regression
Note: some of the technical terms I only know in English, so my translations may not be standard; please bear with me.
12. Simple linear regression with no intercept
A
From equation (3.38), the no-intercept estimate when regressing y on x is sum(x_i*y_i)/sum(x_i^2), while when regressing x on y it is sum(x_i*y_i)/sum(y_i^2). So when the sum of x^2 equals the sum of y^2, the same coefficient is estimated in both directions.
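A quick numerical sketch of equation (3.38), checking the closed-form no-intercept estimates against lm() (the noisy y here is only for illustration):
# beta_hat = sum(x*y)/sum(x^2) when regressing y on x through the origin,
# and sum(x*y)/sum(y^2) when regressing x on y through the origin.
set.seed(1)
x=rnorm(100)
y=2*x+rnorm(100)
coef(lm(y~x+0)); sum(x*y)/sum(x^2)   # identical values
coef(lm(x~y+0)); sum(x*y)/sum(y^2)   # identical values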
B
set.seed(1)
x=rnorm(100)
y=2*x
lm.fit=lm(y~x+0)
lm.fit2=lm(x~y+0)
summary(lm.fit)
Output Result:
Call:
lm(formula = y ~ x + 0)

Residuals:
       Min         1Q     Median         3Q        Max 
-3.776e-16 -3.378e-17  2.680e-18  6.113e-17  5.105e-16 

Coefficients:
   Estimate Std. Error   t value Pr(>|t|)    
x 2.000e+00  1.296e-17 1.543e+17   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.167e-16 on 99 degrees of freedom
Multiple R-squared:  1, Adjusted R-squared:  1 
F-statistic: 2.382e+34 on 1 and 99 DF,  p-value: < 2.2e-16
Linear regression 2:
summary(lm.fit2)
Output Result:
Call:
lm(formula = x ~ y + 0)

Residuals:
       Min         1Q     Median         3Q        Max 
-1.888e-16 -1.689e-17  1.339e-18  3.057e-17  2.552e-16 

Coefficients:
  Estimate Std. Error   t value Pr(>|t|)    
y 5.00e-01   3.24e-18 1.543e+17   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.833e-17 on 99 degrees of freedom
Multiple R-squared:  1, Adjusted R-squared:  1 
F-statistic: 2.382e+34 on 1 and 99 DF,  p-value: < 2.2e-16
The results show that the two regressions estimate different coefficients: 2.0 for y on x versus 0.5 for x on y.
C
The sample() function draws a random sample from a specified collection of objects: given a vector x, sample(x, size) draws size elements from it. For example, sample(1:10, 4) draws 4 numbers from the integers 1 to 10, e.g. 3, 4, 5, 7; running it again might give 3, 9, 8, 5. Because the sampling is done without replacement, no number appears twice.
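A small sketch of the two sampling modes (the exact draws depend on the seed and the R version):
> sample(1:10, 4)                 # without replacement (the default): 4 distinct values
> sample(1:10, 4, replace=TRUE)   # with replacement: values may repeat
Note that sample(x, length(x)) is just a permutation of x, which is exactly why sum(y^2) equals sum(x^2) in the code below.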
> set.seed(1)
> x=rnorm(100)
> y=sample(x,100)
> sum(x^2)
[1] 81.05509
> sum(y^2)
[1] 81.05509
> lm.fit=lm(y~x+0)
> lm.fit2=lm(x~y+0)
> summary(lm.fit)
Output Result:
Call:
lm(formula = y ~ x + 0)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2315 -0.5124  0.1027  0.6877  2.3926 

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
x  0.02148    0.10048   0.214    0.831

Residual standard error: 0.9046 on 99 degrees of freedom
Multiple R-squared:  0.0004614, Adjusted R-squared:  -0.009635 
F-statistic: 0.0457 on 1 and 99 DF,  p-value: 0.8312
Linear regression 2:
Call:
lm(formula = x ~ y + 0)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2400 -0.5154  0.1213  0.6788  2.3959 

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
y  0.02148    0.10048   0.214    0.831

Residual standard error: 0.9046 on 99 degrees of freedom
Multiple R-squared:  0.0004614, Adjusted R-squared:  -0.009635 
F-statistic: 0.0457 on 1 and 99 DF,  p-value: 0.8312
The results show that when the sum of x^2 equals the sum of y^2, the two regressions estimate the same coefficient (here 0.02148 in both directions).
13.
A
> set.seed(1)
> x=rnorm(100)
B
> eps=rnorm(100,0,sqrt(0.25))
C
> y=-1+0.5*x+eps
The vector y has length 100; β0=-1 and β1=0.5.
D
> plot(x,y)
The scatterplot shows a roughly linear relationship between x and y with a positive slope.
E
> lm.fit=lm(y~x)
> summary(lm.fit)
Output Result:
Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.93842 -0.30688 -0.06975  0.26970  1.17309 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.01885    0.04849 -21.010  < 2e-16 ***
x            0.49947    0.05386   9.273 4.58e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4814 on 98 degrees of freedom
Multiple R-squared:  0.4674, Adjusted R-squared:  0.4619 
F-statistic: 85.99 on 1 and 98 DF,  p-value: 4.583e-15
The estimates β̂0=-1.01885 and β̂1=0.49947 are close to the true values β0=-1 and β1=0.5, and p-values close to 0 indicate a statistically significant relationship.
F
> plot(x,y)
> abline(lm.fit,lwd=3,col="red")
> abline(-1,0.5,lwd=3,col="green")
> legend(-1,legend=c("model fit","pop. regression"),col=2:3,lwd=3)
G
> lm.fit2=lm(y~x+I(x^2))
> summary(lm.fit2)
Output Result:
Call:
lm(formula = y ~ x + I(x^2))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.98252 -0.31270 -0.06441  0.29014  1.13500 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.97164    0.05883 -16.517  < 2e-16 ***
x            0.50858    0.05399   9.420  2.4e-15 ***
I(x^2)      -0.05946    0.04238  -1.403    0.164    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.479 on 97 degrees of freedom
Multiple R-squared:  0.4779, Adjusted R-squared:  0.4672 
F-statistic: 44.4 on 2 and 97 DF,  p-value: 2.038e-14
R^2 and RSE improve only faintly, and the p-value of 0.164 for the I(x^2) term shows no statistically significant relationship between y and x^2.
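As an optional cross-check (a sketch), an F-test comparing the nested fits reaches the same conclusion; for a single added term its p-value matches the t-test on I(x^2):
> anova(lm.fit, lm.fit2)   # p-value matches the 0.164 above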
H
> set.seed(1)
> esp1=rnorm(100,0,sqrt(0.125))
> y1=-1+0.5*x + esp1
> plot(x,y1)
> lm.fit1=lm(y1~x)
> summary(lm.fit1)
Output Result:
Call:
lm(formula = y1 ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.66356 -0.21700 -0.04932  0.19071  0.82950 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.01333    0.03429  -29.55   <2e-16 ***
x            0.49963    0.03809   13.12   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3404 on 98 degrees of freedom
Multiple R-squared:  0.6371, Adjusted R-squared:  0.6334 
F-statistic: 172.1 on 1 and 98 DF,  p-value: < 2.2e-16
Drawing:
> abline(lm.fit1,lwd=3,col=2)
> abline(-1,0.5,lwd=3,col=3)
> legend(-1,legend=c("model fit","pop. regression"),col=2:3,lwd=3)
The RSE decreases (0.3404 versus 0.4814), as expected with less noise in the data.
I
> esp2=rnorm(100,0,sqrt(0.5))
> y2=-1+0.5*x + esp2
> plot(x,y2)
> lm.fit2=lm(y2~x)
> summary(lm.fit2)
Output Result:
Call:
lm(formula = y2 ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.06059 -0.34104 -0.03205  0.45908  1.86787 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.98065    0.07404 -13.245  < 2e-16 ***
x            0.51497    0.08224   6.262 1.01e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7349 on 98 degrees of freedom
Multiple R-squared:  0.2858, Adjusted R-squared:  0.2785 
F-statistic: 39.21 on 1 and 98 DF,  p-value: 1.01e-08
Drawing:
> abline(lm.fit2,lwd=3,col=2)
> abline(-1,0.5,lwd=3,col=3)
> legend(-1,legend=c("model fit","pop. regression"),col=2:3,lwd=3)
The RSE increases (0.7349 versus 0.4814), as expected with more noise in the data.
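The three residual standard errors can also be pulled out side by side (a sketch; summary.lm stores the RSE in its $sigma component):
> c(summary(lm.fit1)$sigma, summary(lm.fit)$sigma, summary(lm.fit2)$sigma)
roughly 0.3404, 0.4814 and 0.7349 for the noise variances 0.125, 0.25 and 0.5.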
J
> confint(lm.fit)
                 2.5 %     97.5 %
(Intercept) -1.1150804 -0.9226122
x            0.3925794  0.6063602
> confint(lm.fit1)
                 2.5 %     97.5 %
(Intercept) -1.0813741 -0.9452786
x            0.4240422  0.5752080
> confint(lm.fit2)
                 2.5 %     97.5 %
(Intercept) -1.1275711 -0.8337236
x            0.3517741  0.6781604
The greater the noise, the wider the confidence intervals.
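A quick way to see this (a sketch) is to compare the width of the slope's 95% interval across the three fits:
> diff(confint(lm.fit1)["x",])   # least noise, narrowest interval
> diff(confint(lm.fit)["x",])
> diff(confint(lm.fit2)["x",])   # most noise, widest interval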
14.
A
β0=2; β1=2; β2=0.3; that is, the model is y = 2 + 2*x1 + 0.3*x2 + ε.
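For reference, the data for this exercise are generated by the code given in the book:
> set.seed(1)
> x1=runif(100)
> x2=0.5*x1+rnorm(100)/10
> y=2+2*x1+0.3*x2+rnorm(100)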
B
> cor(x1,x2)
[1] 0.8351212
> plot(x1,x2)
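The strength of this collinearity can be quantified (a sketch, computing the variance inflation factor by hand rather than via car::vif):
> 1/(1-summary(lm(x1~x2))$r.squared)   # VIF of x1, about 3.3 given cor = 0.835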
C
> lm.fit=lm(y~x1+x2)
> summary(lm.fit)
Call:
lm(formula = y ~ x1 + x2)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8311 -0.7273 -0.0537  0.6338  2.3359 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.1305     0.2319   9.188 7.61e-15 ***
x1            1.4396     0.7212   1.996   0.0487 *  
x2            1.0097     1.1337   0.891   0.3754    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.056 on 97 degrees of freedom
Multiple R-squared:  0.2088, Adjusted R-squared:  0.1925 
F-statistic: 12.8 on 2 and 97 DF,  p-value: 1.164e-05
The estimates are β̂0=2.1305, β̂1=1.4396, β̂2=1.0097, versus the true β0=2, β1=2, β2=0.3. Because the p-value for x2 (0.3754) is large, we cannot reject the hypothesis H0: β2=0; the hypothesis H0: β1=0 can be rejected only marginally (p=0.0487).
D
> lm.fit1=lm(y~x1)
> summary(lm.fit1)
Call:
lm(formula = y ~ x1)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.89495 -0.66874 -0.07785  0.59221  2.45560 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.1124     0.2307   9.155 8.27e-15 ***
x1            1.9759     0.3963   4.986 2.66e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.055 on 98 degrees of freedom
Multiple R-squared:  0.2024, Adjusted R-squared:  0.1942 
F-statistic: 24.86 on 1 and 98 DF,  p-value: 2.661e-06
Because the p-value is close to 0, we can reject the hypothesis H0: β1 = 0.
E
> lm.fit2=lm(y~x2)
> summary(lm.fit2)
Call:
lm(formula = y ~ x2)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.62687 -0.75156 -0.03598  0.72383  2.44890 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.3899     0.1949   12.26  < 2e-16 ***
x2            2.8996     0.6330    4.58 1.37e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.072 on 98 degrees of freedom
Multiple R-squared:  0.1763, Adjusted R-squared:  0.1679 
F-statistic: 20.98 on 1 and 98 DF,  p-value: 1.366e-05
Because the p-value is close to 0, we can reject the hypothesis H0: β1 = 0 (here β1 is the coefficient on x2 in the simple regression).
F
Because x1 and x2 are collinear, it is difficult to distinguish their individual effects when both enter the regression together; in the separate simple regressions each effect is clear, so the results do not contradict one another.
G
> x1=c(x1,0.1)
> x2=c(x2,0.8)
> y=c(y,6)
> lm.fit1=lm(y~x1+x2)
> summary(lm.fit1)
Call:
lm(formula = y ~ x1 + x2)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.73348 -0.69318 -0.05263  0.66385  2.30619 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.2267     0.2314   9.624 7.91e-16 ***
x1            0.5394     0.5922   0.911  0.36458    
x2            2.5146     0.8977   2.801  0.00614 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.075 on 98 degrees of freedom
Multiple R-squared:  0.2188, Adjusted R-squared:  0.2029 
F-statistic: 13.72 on 2 and 98 DF,  p-value: 5.564e-06

> lm.fit2=lm(y~x1)
> summary(lm.fit2)
Call:
lm(formula = y ~ x1)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8897 -0.6556 -0.0909  0.5682  3.5665 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.2569     0.2390   9.445 1.78e-15 ***
x1            1.7657     0.4124   4.282 4.29e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.111 on 99 degrees of freedom
Multiple R-squared:  0.1562, Adjusted R-squared:  0.1477 
F-statistic: 18.33 on 1 and 99 DF,  p-value: 4.295e-05

> lm.fit3=lm(y~x2)
> summary(lm.fit3)
Call:
lm(formula = y ~ x2)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.64729 -0.71021 -0.06899  0.72699  2.38074 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.3451     0.1912  12.264  < 2e-16 ***
x2            3.1190     0.6040   5.164 1.25e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.074 on 99 degrees of freedom
Multiple R-squared:  0.2122, Adjusted R-squared:  0.2042 
F-statistic: 26.66 on 1 and 99 DF,  p-value: 1.253e-06
With the new observation, the hypothesis β1 = 0 can no longer be rejected in the first model (y ~ x1 + x2), while the coefficient on x2 becomes significant.
> par(mfrow=c(2,2))
> plot(lm.fit1)
> par(mfrow=c(2,2))
> plot(lm.fit2)
> par(mfrow=c(2,2))
> plot(lm.fit3)
In the first and third models, the diagnostic plots show that the newly added point is a high-leverage point.
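A numeric confirmation of the leverage (a sketch using hatvalues(), flagging points whose leverage is well above the average (p+1)/n):
> lev=hatvalues(lm.fit1)
> which(lev > 2*mean(lev))   # the appended 101st observation is flagged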
> plot(predict(lm.fit1), rstudent(lm.fit1))
> plot(predict(lm.fit2), rstudent(lm.fit2))
> plot(predict(lm.fit3), rstudent(lm.fit3))
Only in the second model does the new point have a studentized residual greater than 3, so there it is an outlier.
ISLR Chapter 3: applied linear regression exercise answers (Part 2)