Summary of key points
Regression analysis uses samples (known data) to produce a fitted equation, which is then used to make predictions on unknown data
Uses: prediction, and judging whether a result is reasonable
Example: using height to predict weight, using advertising expenses to forecast merchandise sales, and so on.
Linear regression analysis: simple (one-variable) linear, multivariate linear, generalized linear
Nonlinear regression analysis
Difficulties: selecting variables (multivariate case), avoiding multicollinearity, inspecting the fitted equation to avoid overfitting, and checking that the model is reasonable
Correlation coefficient
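The notes give no formula under this heading. As a minimal sketch, assuming the usual sample Pearson correlation coefficient is meant, it can be computed as below. The function name `pearson_r` and the height/weight numbers are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heights (cm) and weights (kg)
heights = [150, 160, 170, 180, 190]
weights = [50, 56, 64, 72, 83]
r = pearson_r(heights, weights)
print(round(r, 4))
```

A value close to 1 or -1 suggests a strong linear relationship, which is the situation in which a one-variable linear fit is worth trying.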
Simple (one-variable) linear regression model
How to determine parameters
Use the sum of squared errors to measure the difference between the predicted values and the true values
Squared error: given the true value $y$ and the predicted value $\hat{y}$, the squared error is $(y - \hat{y})^2$
Find the parameters that minimize the sum of squared errors.
Least squares:
RSS is a function of α and β. Take the partial derivatives of RSS with respect to α and β and set them equal to 0 to obtain the values of α and β
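The differentiation step can be written out as follows. This is a sketch that assumes the model $y_i = \alpha + \beta x_i + \varepsilon_i$ over $n$ sample points $(x_i, y_i)$; the index notation is an assumption, since the original formulas are not present in these notes.

```latex
\begin{aligned}
\mathrm{RSS}(\alpha, \beta) &= \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2 \\
\frac{\partial\,\mathrm{RSS}}{\partial \alpha} &= -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i) = 0 \\
\frac{\partial\,\mathrm{RSS}}{\partial \beta} &= -2 \sum_{i=1}^{n} x_i \, (y_i - \alpha - \beta x_i) = 0
\end{aligned}
```

These are two linear equations in the two unknowns α and β, so they can be solved simultaneously in closed form.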
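Because the population is unknown, the parameters are estimated from the sample values: $\hat{\beta} = \sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sum_i (x_i - \bar{x})^2$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$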
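As a rough illustration of the least-squares estimates, the sketch below fits α and β from a sample and evaluates RSS. The function names `fit_simple_ols` and `rss`, and the data, are hypothetical and not taken from the course material.

```python
def fit_simple_ols(xs, ys):
    """Least-squares estimates (alpha, beta) for the model y = alpha + beta * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x is constant, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                 # slope
    alpha = mean_y - beta * mean_x   # intercept
    return alpha, beta

def rss(xs, ys, alpha, beta):
    """Residual sum of squares of the line alpha + beta * x on the sample."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: predict weight (kg) from height (cm)
heights = [150, 160, 170, 180, 190]
weights = [50, 56, 64, 72, 83]
alpha_hat, beta_hat = fit_simple_ols(heights, weights)
fit_rss = rss(heights, weights, alpha_hat, beta_hat)
print(alpha_hat, beta_hat, fit_rss)
```

Moving either parameter away from the fitted value can only increase RSS on the same sample, which is what "least squares" means here.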
Simple linear regression analysis
Multivariate linear regression model
The core problem of multivariate linear regression: Which variables should be selected?
An atypical example (Shiry book p325)
RSS (residual sum of squares) and R² (squared correlation coefficient) selection method: traverse all possible variable combinations and select the model with the smallest RSS and the largest R²
AIC (Akaike information criterion) and BIC (Bayesian information criterion)
$\mathrm{AIC} = n \ln(\mathrm{RSS}_p / n) + 2p$
n is the number of observations (sample size), p is the number of variables selected, and RSS_p is the residual sum of squares of the model with those p variables; the smaller the AIC, the better
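The criterion can be evaluated directly from a model's RSS. The sketch below takes n as the number of observations, as in the standard form of the criterion. The BIC form shown, n ln(RSS_p/n) + p ln(n), is the common least-squares analogue and is an assumption, since these notes do not state a BIC formula. The RSS values in the example are invented.

```python
import math

def aic(rss_p, n, p):
    """AIC = n * ln(RSS_p / n) + 2p for a least-squares model with p variables."""
    if rss_p <= 0 or n <= 0:
        raise ValueError("rss_p and n must be positive")
    return n * math.log(rss_p / n) + 2 * p

def bic(rss_p, n, p):
    """Assumed BIC analogue: n * ln(RSS_p / n) + p * ln(n)."""
    if rss_p <= 0 or n <= 0:
        raise ValueError("rss_p and n must be positive")
    return n * math.log(rss_p / n) + p * math.log(n)

# Invented comparison: a 2-variable model vs. a 5-variable model on n = 50 points.
# The larger model lowers RSS only slightly, so the penalty term dominates.
small_model = aic(120.0, 50, 2)
large_model = aic(118.0, 50, 5)
print(small_model < large_model)
```

The penalty term is what stops the criterion from always preferring the model with the most variables, which RSS alone would do.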
Stepwise regression
Forward selection: start from a one-variable regression and add variables one at a time so that the criterion value reaches its optimum
Backward elimination: start from the regression equation with all variables and remove one variable at a time so that the criterion value reaches its optimum
Stepwise selection: a combination of the two methods above
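The forward method can be sketched in a few dozen lines. This is an illustrative sketch, not the course's implementation: it assumes AIC as the criterion, starts from an intercept-only model rather than a one-variable model, and solves the normal equations with a small hand-written Gaussian elimination. All function names and the data are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols_rss(columns, y):
    """RSS of the least-squares fit of y on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    coef = solve_linear(xtx, xty)
    return sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, yi in zip(design, y))

def aic(rss_p, n, p):
    return n * math.log(rss_p / n) + 2 * p

def forward_selection(predictors, y):
    """Greedy forward selection by AIC; predictors maps name -> list of values."""
    n = len(y)
    selected = []
    best = aic(ols_rss([], y), n, 0)  # intercept-only starting point
    while True:
        scored = []
        for name in predictors:
            if name in selected:
                continue
            cols = [predictors[v] for v in selected + [name]]
            scored.append((aic(ols_rss(cols, y), n, len(cols)), name))
        if not scored:
            break
        cand_aic, cand = min(scored)
        if cand_aic >= best:  # no candidate improves the criterion: stop
            break
        selected.append(cand)
        best = cand_aic
    return selected, best

# Invented data: y is roughly 1 + 2*x1 + 5*x2 plus small noise; x3 is unrelated.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 0, 1, 0, 1, 0, 1, 0],
    "x3": [3, 3, 1, 1, 4, 4, 2, 2],
}
y = [8.1, 4.8, 12.15, 8.9, 16.05, 13.2, 19.85, 16.95]
selected, best_aic = forward_selection(predictors, y)
print(selected)
```

Backward elimination would run the same loop in reverse, starting from all variables and dropping the one whose removal lowers the criterion most. The stepwise method would try both an addition and a removal at each step.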
Generalized linear model
Logit transform
Common link functions and inverse link functions
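The logit transform is one such pair: the link maps a probability in (0, 1) to the whole real line, and the inverse link maps a linear predictor back to a probability. A minimal sketch, with function names chosen here for illustration:

```python
import math

def logit(p):
    """Link function: log-odds ln(p / (1 - p)) for a probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse link (logistic function), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Round trip: a probability maps to log-odds and back again
p = 0.8
z = logit(p)
p_back = inv_logit(z)
print(z, p_back)
```

In logistic regression the linear combination of the predictors plays the role of `z`, and `inv_logit` turns it into a predicted probability.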
Machine learning, week 2 (into gold): linear regression and logistic regression