8.3 Regression Diagnostics
> fit <- lm(weight ~ height, data = women)
> par(mfrow = c(2, 2))
> plot(fit)
To understand these graphs, let's review the statistical assumptions of OLS regression.
Normality: When the predictor values are fixed, the dependent variable is normally distributed, so the residuals should also be normally distributed with a mean of 0. The Normal Q-Q plot (upper right) is a probability plot of the standardized residuals against the values expected under a normal distribution. If the normality assumption is met, the points should fall on the straight 45-degree line; if they do not, the normality assumption is violated.
Independence: You cannot tell from these plots whether the dependent variable values are independent of each other; that can only be judged from how the data were collected. In the example above, there is no a priori reason to believe that one woman's weight affects another woman's weight. If you found that the data were sampled from families, you might have to adjust the independence assumption.
Linearity: If the dependent variable is linearly related to the independent variable, there should be no systematic relationship between the residuals and the predicted (fitted) values. In other words, the model should capture all the systematic variance in the data, leaving nothing but random noise. The Residuals vs Fitted plot (upper left) shows a clear curved relationship, suggesting that you may need to add a quadratic term to the regression model.
Homoscedasticity: If the constant variance assumption is met, the points in the Scale-Location plot (bottom left) should form a random band around a horizontal line. The plot appears to satisfy this assumption.
Finally, the Residuals vs Leverage plot (lower right) provides information about individual observations that may deserve attention. Outliers, high-leverage points, and influential observations can be identified from it.
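The quantities behind the four panels can also be inspected directly. The following is a minimal sketch using base R only; the variable names (z, qq_cor, fit2, scale_loc, lev, cook, high_lev) and the "twice the mean hat value" cutoff are illustrative assumptions, not part of the original notes.

```r
# Numeric counterparts to the four diagnostic plots (base R only)
fit <- lm(weight ~ height, data = women)

# Normality: standardized residuals are what the Normal Q-Q panel plots
z <- rstandard(fit)
qq <- qqnorm(z, plot.it = FALSE)   # theoretical (x) vs. sample (y) quantiles
qq_cor <- cor(qq$x, qq$y)          # near 1 when the points lie close to a line

# Linearity: add a quadratic term, as the Residuals vs Fitted panel suggests
fit2 <- lm(weight ~ height + I(height^2), data = women)
anova(fit, fit2)                   # F test for the added quadratic term

# Homoscedasticity: the Scale-Location panel plots sqrt(|standardized residuals|)
scale_loc <- sqrt(abs(z))

# Leverage and influence: hat values and Cook's distance per observation
lev <- hatvalues(fit)
cook <- cooks.distance(fit)

# Rule-of-thumb cutoff (an assumption here): more than twice the mean hat value
high_lev <- which(lev > 2 * mean(lev))
```

Comparing summary(fit) with summary(fit2) afterwards shows how much of the curved pattern the quadratic term absorbs.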
8.3.2 An improved approach
The car package provides the following functions for regression diagnostics:
qqPlot()              Quantile comparisons plot
durbinWatsonTest()    Durbin-Watson test for autocorrelated errors
crPlots()             Component plus residual plots
ncvTest()             Score test for non-constant error variance
spreadLevelPlot()     Spread-level plot
outlierTest()         Bonferroni outlier test
avPlots()             Added variable plots
influencePlot()       Regression influence plot
scatterplot()         Enhanced scatter plot
scatterplotMatrix()   Enhanced scatter plot matrix
vif()                 Variance inflation factors
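As a rough illustration of how a few of these functions are called, here is a short sketch on the same women model. It assumes the car package is installed (install.packages("car")); the result names dw, ncv, and out are hypothetical. vif() is left out because it needs a model with at least two predictor terms.

```r
# Assumes the car package is installed: install.packages("car")
library(car)

fit <- lm(weight ~ height, data = women)

qqPlot(fit)                    # Q-Q plot of studentized residuals
crPlots(fit)                   # component plus residual plot, checks linearity

dw  <- durbinWatsonTest(fit)   # test for autocorrelated errors
ncv <- ncvTest(fit)            # score test for non-constant error variance
out <- outlierTest(fit)        # Bonferroni-adjusted test of the largest residual

print(dw)
print(ncv)
print(out)
```

Each test returns an object that prints its statistic and p-value, so the results can be read in the console or stored for later comparison.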
R in Action reading notes (9)-eighth chapter: Regression-regression diagnosis