In many practical problems, more than one independent variable affects the dependent variable Y; the number of independent variables is usually denoted p. In this case the form of the model can no longer be read off a plot, so we use a simple and general model, the multiple linear model, for the regression.
1. Mathematical Model
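When more than one factor affects the value of Y, we can use the multiple linear regression model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
Here X_1, ..., X_p are the independent variables, \beta_0, ..., \beta_p are the unknown parameters, and \varepsilon is the random error term.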
When there are two independent variables, this equation can be drawn as a plane in three-dimensional space. When there are more than two, the equation describes a hyperplane, which is difficult to draw.
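To estimate the parameters, we draw a line segment parallel to the Y axis from each sample point (the red points in the figure) to the fitted plane, take the length of each segment, square it, and add all the squares together to obtain the residual sum of squares (RSS). The formula is as follows:
$$RSS = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_p x_{ip} \right)^2$$
Here n is the number of samples, y_i is the observed value of Y for sample i, x_{ij} is the value of the j-th independent variable for sample i, and \beta_0, ..., \beta_p are the coefficients.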
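The sum of squared vertical distances described above can be sketched in a few lines of R. This is only an illustration: the function name rss, its arguments, and the four-row data set are assumptions made up for this example, not part of the original text.

```r
# Residual sum of squares for a candidate coefficient vector.
# beta: numeric vector with the intercept first, then one slope per column of X
# X:    numeric matrix of independent variables (one row per sample)
# y:    observed values of the dependent variable
rss <- function(beta, X, y) {
  fitted <- beta[1] + X %*% beta[-1]  # value on the fitted plane for each sample
  sum((y - fitted)^2)                 # squared vertical distances, added up
}

# Made-up data (illustrative only), generated exactly by y = 1 + 2*x1 + 3*x2
X <- cbind(x1 = c(0, 1, 0, 1), x2 = c(0, 0, 1, 1))
y <- c(1, 3, 4, 6)

rss(c(1, 2, 3), X, y)  # 0: these coefficients reproduce y exactly
rss(c(0, 2, 3), X, y)  # 4: every point is off by 1, so RSS = 4 * 1^2
```

The second call shows that a worse set of coefficients produces a larger RSS, which is what makes RSS usable as a measure of fit.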
The regression fits best when the residual sum of squares is smallest. We therefore take the partial derivative of RSS with respect to each coefficient and set it to zero, which gives a system of p + 1 equations, one for each of the coefficients \beta_0, ..., \beta_p. Solving this system gives the coefficient estimates, and with them the corresponding linear regression equation.
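As a rough sketch of this step, the system of equations can be written in matrix form as (X'X) b = X'y and solved directly. The data frame d below is made up for illustration, and the comparison with lm is only a sanity check, not a description of how lm works internally.

```r
# Small made-up data set with two independent variables (illustrative only)
d <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5),
  y  = c(5.1, 5.9, 11.2, 11.8, 17.1, 17.9)
)

# Design matrix: a column of 1s for the intercept plus one column per variable
X <- model.matrix(y ~ x1 + x2, data = d)

# Setting the partial derivatives of RSS to zero gives (X'X) b = X'y:
# here 3 linear equations in 3 unknown coefficients.
b <- solve(crossprod(X), crossprod(X, d$y))

# lm() should arrive at the same estimates
fit <- lm(y ~ x1 + x2, data = d)
all.equal(as.vector(b), unname(coef(fit)))
```

Solving the linear system with solve(A, b) avoids forming the inverse of X'X explicitly, which is generally the more stable choice.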
2. Regression Modeling in R
R provides a built-in data set, swiss, which records a standardized fertility measure and several socio-economic indicators for 47 French-speaking provinces of Switzerland around 1888. For details about this data, enter help(swiss) in R, or open the link directly:
Below we build a regression model of Fertility on Agriculture, Examination, Education, Catholic, and Infant.Mortality.
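Before fitting, it can help to check that the variables named above are present in the data. A minimal sketch using only base R functions:

```r
data(swiss)   # load the built-in data set from the datasets package
dim(swiss)    # number of rows (provinces) and columns (variables)
names(swiss)  # Fertility plus the five candidate independent variables
head(swiss)   # first few rows
```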
Use R's regression function lm to fit the model:
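A minimal sketch of such a call, assuming the model contains all five independent variables listed above. The object name swiss_fit is hypothetical and used only in this example.

```r
# Regress Fertility on the five other variables in swiss.
# swiss_fit is a hypothetical name chosen for this sketch.
swiss_fit <- lm(Fertility ~ Agriculture + Examination + Education +
                  Catholic + Infant.Mortality, data = swiss)

coef(swiss_fit)  # estimated intercept and one slope per independent variable
```

Because these five variables are all the remaining columns of swiss, the shorter formula Fertility ~ . fits the same model.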
Use the summary function to view the regression result:
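A sketch of this step, refitting the model first so the snippet runs on its own (swiss_fit and s are hypothetical names):

```r
swiss_fit <- lm(Fertility ~ ., data = swiss)  # "." means all other columns

s <- summary(swiss_fit)
s                # full report: coefficient table, significance stars, R-squared, F-test
s$coefficients   # estimate, standard error, t value and p-value for each term
s$r.squared      # proportion of the variance in Fertility explained by the model
```

The individual components of the summary object are handy when only one number from the printed report is needed.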
The p-value of the overall F-test is very small, so the model as a whole is statistically significant. Judging by the number of significance stars (*) printed next to each coefficient, Examination has no stars and is therefore not significant, so this variable can be removed from the model.
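One possible way to remove the variable and compare the two fits is sketched below. The object names full and reduced are hypothetical, and the anova comparison is an extra check added here, not something the original text performs.

```r
full    <- lm(Fertility ~ ., data = swiss)
reduced <- update(full, . ~ . - Examination)  # same model without Examination

summary(reduced)$coefficients  # remaining terms and their p-values
anova(reduced, full)           # F-test: does adding Examination improve the fit?
```

A large p-value in the anova table would support leaving Examination out.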
[Machine learning practice] multiple linear regression model