Ch. 8: Annual sales forecast for a car company (regression)

Source: Internet
Author: User

Scatter chart

Curve linearization: fitting a linear model after variable transformation, and curve estimation

Non-linear model

Tests of independence, normality, and homogeneity of variance for the residuals

Predicted value

1. Case background

Forecast sales for the next 2-3 years from the car sales of the past 14 years. Variables: Time, Sales.

2. Data understanding

Draw a scatter plot of sales against time and look for three key pieces of information:

Whether the variables show any quantitative association or trend;

If so, whether the relationship is linear or nonlinear;

Whether any points deviate markedly from the rest, since such points can become strongly influential points during modeling.

The scatter plot shows that the data for 1988-1992 should be dropped and the years 1993-2001 recoded as the numbers 1-9.

Do this either by writing syntax or through the "Transform - Compute Variable" and "Select Cases" menu steps.
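The selection and recoding step can also be sketched outside SPSS. The following is a minimal Python sketch, assuming the data is held as a list of records with `year` and `sales` fields; the function name `select_and_recode` and the sales values are hypothetical placeholders, not the case data.

```python
def select_and_recode(records, first_year=1993, last_year=2001):
    """Keep rows with first_year <= year <= last_year and add a 1-based time code."""
    kept = [r for r in records if first_year <= r["year"] <= last_year]
    kept.sort(key=lambda r: r["year"])
    return [
        {"year": r["year"], "time": r["year"] - first_year + 1, "sales": r["sales"]}
        for r in kept
    ]


# Placeholder sales values (made up for illustration, NOT the case data).
records = [{"year": y, "sales": float(i)} for i, y in enumerate(range(1988, 2002))]

recoded = select_and_recode(records)
print([(r["year"], r["time"]) for r in recoded])
```

Only the nine records for 1993-2001 remain, and the new `time` field runs from 1 to 9.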

3. Linear regression after variable transformation

Introduction to Linear regression models:

Basic structure of linear regression model

$y_i = \hat{y}_i + e_i$, that is, measured value = estimate + residual. The estimate is the mean value of the dependent variable given by the model.

Because residuals exist, the best-fitting model has to be chosen by some criterion. The least squares method is generally used: it makes the sum of the squared vertical distances from the measured points to the regression line as small as possible, namely $\sum e_i^2 = \min$.
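As an illustration of the least squares criterion, here is a minimal sketch for the one-predictor case, using the standard closed-form solution. The function names and the five data points are made up for the example.

```python
def fit_line(x, y):
    """Closed-form least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def residual_sum_of_squares(x, y, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_line(x, y)
best = residual_sum_of_squares(x, y, b0, b1)

# Moving either coefficient away from the estimate increases the sum of squares.
worse_slope = residual_sum_of_squares(x, y, b0, b1 + 0.1)
worse_intercept = residual_sum_of_squares(x, y, b0 + 0.1, b1)
print(b0, b1, best, worse_slope, worse_intercept)
```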

Common indicators

The partial regression coefficient, $b_i$, indicates how strongly the independent variable $x_i$ affects the dependent variable. It is called the regression coefficient for short;

Standardized partial regression coefficient: the coefficient obtained when the regression model is fitted after standardizing every variable. It is used to compare the influence of the different independent variables on the dependent variable;

Coefficient of determination: $R^2$, the square of the corresponding correlation coefficient. It reflects the proportion of the total variation of the dependent variable that the regression on the independent variables explains. The adjusted coefficient of determination is mainly used to compare the fit of models with different numbers of independent variables.
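The indicators above can be computed by hand. The sketch below assumes a single-predictor model whose fitted line (intercept 2.2, slope 0.6) is the least squares line for the made-up data shown; the function names are hypothetical.

```python
import statistics


def r_squared(y, y_hat):
    """Share of the total variation of y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjust R^2 for p independent variables (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def standardized_coefficient(b, x, y):
    """Rescale a raw coefficient b by sd(x) / sd(y)."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Made-up illustrative data and its least squares line y_hat = 2.2 + 0.6 * x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

r2 = r_squared(y, y_hat)
adj_r2 = adjusted_r_squared(r2, n=len(y), p=1)
beta = standardized_coefficient(0.6, x, y)
print(r2, adj_r2, beta)
```

With a single predictor the standardized coefficient equals the correlation coefficient, so its square matches $R^2$.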

Applicable conditions of regression model

The independent variables and the dependent variable are linearly related;

Independence: the observations of the dependent variable are independent of one another, that is, the residuals are mutually independent;

Normality: the dependent variable follows a normal distribution, that is, the residuals are normally distributed;

Homogeneity of variance: the variance of the dependent variable is constant, that is, the residuals have the same variance.

Fitting a linear regression model after a variable transformation:

The relationship between sales and time is nonlinear, so a quadratic curve is fitted first. Variables used: sales, time, and time squared, entered in a linear regression.
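A minimal sketch of this transformation approach, assuming NumPy is available: time squared is added as an extra column and the model is then fitted by ordinary linear least squares. The function name `fit_quadratic` and the sales values are synthetic placeholders (generated from an exact quadratic), not the case data.

```python
import numpy as np


def fit_quadratic(time, sales):
    """Fit sales = b0 + b1*time + b2*time^2 by linear least squares."""
    t = np.asarray(time, dtype=float)
    y = np.asarray(sales, dtype=float)
    # Design matrix: intercept, time, and the transformed variable time squared.
    X = np.column_stack([np.ones_like(t), t, t ** 2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return coef, residuals


# Synthetic data, exactly quadratic, used only to show the mechanics.
time = np.arange(1, 10)
sales = 5.0 + 2.0 * time + 0.5 * time ** 2

coef, residuals = fit_quadratic(time, sales)
print(coef)
```

The model is nonlinear in time but linear in its coefficients, which is why a linear regression routine can fit it.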

Judging the model fit (testing the independence and homogeneity of variance of the residuals):

Residual independence can be checked in three ways: the Durbin-Watson (DW) test, a residual distribution plot, and a time series plot of the residuals. They differ as follows:

Third method: to draw the time series plot of the residuals, first save the standardized residuals, then plot them under "Analyze - Forecasting - Sequence Charts".
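For the DW test, the statistic itself is simple to compute from the residuals in time order. The sketch below uses hypothetical function names and made-up residuals; standardizing by the square root of the mean squared error is an assumption about what "standardized residuals" means here.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def standardized_residuals(residuals, n_params):
    """Divide each residual by sqrt(RSS / (n - n_params)) (assumed definition)."""
    rss = sum(e ** 2 for e in residuals)
    scale = (rss / (len(residuals) - n_params)) ** 0.5
    return [e / scale for e in residuals]


# Made-up residual sequences.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every step
clustered = [1.0, 1.0, -1.0, -1.0]     # neighbours tend to share a sign

dw_alternating = durbin_watson(alternating)
dw_clustered = durbin_watson(clustered)
z = standardized_residuals(alternating, n_params=2)
print(dw_alternating, dw_clustered, z)
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.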

Saving predicted values and interval estimates:

Because the goal is to predict sales for the next 2-3 years, add three records to the time variable, then rerun the regression analysis with the options for saving predicted values selected.
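The "add three records" idea can be sketched as follows, assuming NumPy and a polynomial trend fitted with `numpy.polyfit`. The function name `forecast_next` and the synthetic sales values are placeholders, and the sketch returns point predictions only, not interval estimates.

```python
import numpy as np


def forecast_next(time, sales, steps=3, degree=2):
    """Fit a polynomial trend and predict the next `steps` time points."""
    t = np.asarray(time, dtype=float)
    y = np.asarray(sales, dtype=float)
    coeffs = np.polyfit(t, y, degree)            # highest power first
    future_time = t[-1] + np.arange(1, steps + 1)  # the added records
    return future_time, np.polyval(coeffs, future_time)


# Synthetic data, exactly quadratic, used only to show the mechanics.
time = np.arange(1, 10)
sales = 5.0 + 2.0 * time + 0.5 * time ** 2

future_time, forecast = forecast_next(time, sales)
print(future_time, forecast)
```

Extrapolating a polynomial trend beyond the observed range becomes unreliable quickly, which is one reason to keep the horizon to 2-3 periods.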

4. Curve Fitting

The Curve Estimation procedure can fit several curve models at once:
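A rough equivalent of fitting several curves in one pass is sketched below, assuming NumPy and restricting the candidates to polynomial models (linear, quadratic, cubic). The function name and data are hypothetical, and SPSS offers further curve types that this sketch does not cover.

```python
import numpy as np


def compare_polynomial_models(time, sales, degrees):
    """Fit one polynomial per entry of `degrees` and return R^2 for each."""
    t = np.asarray(time, dtype=float)
    y = np.asarray(sales, dtype=float)
    ss_tot = np.sum((y - y.mean()) ** 2)
    results = {}
    for name, degree in degrees.items():
        coeffs = np.polyfit(t, y, degree)
        fitted = np.polyval(coeffs, t)
        ss_res = np.sum((y - fitted) ** 2)
        results[name] = 1.0 - ss_res / ss_tot
    return results


# Synthetic data, exactly quadratic, used only to show the mechanics.
time = np.arange(1, 10)
sales = 5.0 + 2.0 * time + 0.5 * time ** 2

r2 = compare_polynomial_models(
    time, sales, {"linear": 1, "quadratic": 2, "cubic": 3}
)
for name, value in r2.items():
    print(f"{name}: R^2 = {value:.4f}")
```

Because $R^2$ never decreases when terms are added, models with different numbers of terms are better compared on the adjusted coefficient of determination.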

Judging the model fit:

Plot the residuals as a time series to check their independence, that is, whether they are autocorrelated (are there other ways to test for autocorrelation?); plot a P-P plot of the residuals to check their normality.
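The coordinates of a normal P-P plot can be computed with the Python standard library, as in this minimal sketch. The plotting positions (i - 0.5)/n are an assumption (SPSS may use a different formula), and the function name and residuals are made up.

```python
from statistics import NormalDist, mean, stdev


def pp_plot_points(residuals):
    """Return (observed, expected) cumulative probabilities for a normal P-P plot."""
    m, s = mean(residuals), stdev(residuals)
    z = sorted((e - m) / s for e in residuals)
    n = len(z)
    # Assumed plotting positions for the empirical cumulative probability.
    observed = [(i + 0.5) / n for i in range(n)]
    # Cumulative probability of each standardized residual under N(0, 1).
    expected = [NormalDist().cdf(v) for v in z]
    return list(zip(observed, expected))


# Made-up residuals, symmetric around zero.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
points = pp_plot_points(residuals)
for obs, exp in points:
    print(f"{obs:.2f}  {exp:.3f}")
```

If the residuals are roughly normal, the points fall close to the diagonal from (0, 0) to (1, 1).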

Prediction of the Model:

Prediction works the same way as in Section 3.

5. Using nonlinear regression to fit

Linear regression after variable transformation and curve estimation both rely on curve linearization, which may yield a model that is not optimal, or fail to find a suitable curve expression at all. In such cases nonlinear regression is needed.

The general form of a nonlinear regression model:

$y_i = f(x_i, \theta) + e_i$

Parameters of a nonlinear model are estimated on the same principle as in the linear model: the estimates are the parameter values that minimize the residual sum of squares. The only difference is that the fitted regression line is now a curve.

To build a segmented regression model:
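As a minimal sketch of one possible segmented model, assume two linear segments that join at an unknown breakpoint c, written as y = b0 + b1*t + b2*max(t - c, 0). This form is an assumption for illustration and not necessarily the specification used in this case. The breakpoint is chosen by a simple grid search that minimizes the residual sum of squares; the function name and the data are hypothetical.

```python
import numpy as np


def fit_piecewise(time, sales, candidates):
    """Grid-search the breakpoint c; for each c, fit b0, b1, b2 by least squares."""
    t = np.asarray(time, dtype=float)
    y = np.asarray(sales, dtype=float)
    best = None
    for c in candidates:
        # The slope is b1 before the breakpoint and b1 + b2 after it.
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - c, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[2]:
            best = (c, coef, rss)
    return best


# Synthetic data with a slope change at time 5 (not the case data).
time = np.arange(1, 10)
sales = 10.0 + 2.0 * time + 3.0 * np.maximum(time - 5, 0)

break_at, coef, rss = fit_piecewise(time, sales, candidates=range(2, 9))
print(break_at, coef, rss)
```

The model is nonlinear because of the breakpoint; once c is fixed, the remaining parameters enter linearly, which is what makes the grid search workable.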

Notes:

Regression results: the standard errors of the parameter estimates are approximate standard errors, so the corresponding confidence intervals are for reference only, and the output gives no significance tests for the parameters.

In the ANOVA table, because the regression is nonlinear, the F value and P value of the analysis of variance are not given.

Comparison of different model effects:

The predicted values of the cubic and segmented regression models are compared under "Analyze - Forecasting - Sequence Charts".
