In the previous article we introduced univariate linear regression, so called because the model has only a single feature. In many scenarios a single feature is far from enough, and with multiple features the earlier method of finding the coefficients becomes very troublesome: each coefficient needs its own partial derivative, which is fatal when the number of features grows quickly into the dozens, hundreds, or thousands.
Single-variable linear regression: y = mx + c

Multivariable linear regression: y = w0 + w1*x1 + w2*x2 + ... + wn*xn
So from here we introduce another way of doing linear regression, one that solves the multivariable case much more conveniently: the least squares method in matrix form.
Model transformations
The scalar form of linear regression:

y = mx + c

The coefficients c and m in the equation above are collected into a single vector (c is listed first, to match the ordering used from here on). Writing the constant term c as c * 1, the 1 and the feature x are likewise collected into a vector:

w = [c, m]^T,    x = [1, x]^T

So we have:

y = w^T x

Stacking all n samples row by row, with X the n x 2 matrix whose i-th row is [1, x_i] and y the vector of targets, this becomes y = Xw.
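To make the matrix form concrete, here is a minimal sketch (assuming NumPy, with hypothetical sample values) that builds the design matrix X by prepending a column of ones, so the intercept c becomes the first entry of w:

```python
import numpy as np

# Hypothetical sample data: one feature, four observations.
x = np.array([2.0, 6.0, 9.0, 13.0])
y = np.array([4.0, 8.0, 12.0, 21.0])

# Design matrix: prepend a column of ones so that
# y ≈ X @ w with w = [c, m]^T (intercept first).
X = np.column_stack([np.ones_like(x), x])

print(X.shape)  # (4, 2)
```

The same construction extends to multiple features: each extra feature simply adds another column to X, and w gains one more entry.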
Loss function
The loss function can likewise be written in matrix form:

L(w) = (y - Xw)^T (y - Xw)

Expanding with the transpose rule for matrix products, (AB)^T = B^T A^T, and noting that the scalar y^T Xw equals its own transpose w^T X^T y, the loss simplifies to:

L(w) = y^T y - 2 w^T X^T y + w^T X^T X w
Partial derivative
As before, we want the value of w that minimizes the loss function L, so we take the partial derivative of L with respect to w.

Common identities for vector differentiation:

∂(w^T a)/∂w = a
∂(w^T A w)/∂w = 2Aw    (A symmetric)

Taking the partial derivative of L with respect to w:

∂L/∂w = -2 X^T y + 2 X^T X w

Because ∂L/∂w = 0 at the minimum:

X^T X w = X^T y

Then:

w = (X^T X)^(-1) X^T y

This w is the least squares solution obtained from the matrix form.
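The closed-form solution can be sketched directly in code (assuming NumPy; the function name `least_squares` is just an illustrative choice). `np.linalg.solve` is used on the normal equations rather than forming the explicit inverse, which is the numerically safer route:

```python
import numpy as np

def least_squares(X, y):
    """Solve the normal equations X^T X w = X^T y for w."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy check on noise-free data from y = 3x + 1: expect w ≈ [1, 3].
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 3.0 * x + 1.0

w = least_squares(X, y)
print(w)  # approximately [1. 3.]
```

Because X here has any number of columns, the same two lines handle the multivariable case without any extra per-coefficient work.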
Example
Below, the data set from the previous article is fitted linearly, to validate the algorithm above; fitting multivariate data follows in a later article.
Single-variable linear regression example:
The following data set is fitted linearly using the least squares matrix formula obtained above:
| N | x  | y  |
|---|----|----|
| 1 | 2  | 4  |
| 2 | 6  | 8  |
| 3 | 9  | 12 |
| 4 | 13 | 21 |
The matrices for x and y are:

X =
[ 1   2 ]
[ 1   6 ]
[ 1   9 ]
[ 1  13 ]

y = [4, 8, 12, 21]^T

Calculating w according to the formula w = (X^T X)^(-1) X^T y all at once is unwieldy, so we can first decompose the formula:

X^T X = [ 4   30 ]        X^T y = [ 45  ]
        [ 30  290 ]               [ 437 ]

det(X^T X) = 4 * 290 - 30 * 30 = 260

(X^T X)^(-1) = (1/260) [  290  -30 ]
                       [  -30    4 ]

w = (1/260) [ 290 * 45 - 30 * 437 ] = (1/260) [ -60 ]
            [ -30 * 45 + 4 * 437  ]           [ 398 ]
So c = -60/260 ≈ -0.23077 and m = 398/260 ≈ 1.53077.

The linear regression function can then be written as: y = 1.53077x - 0.23077
Predict the values of y:

y = 1.53077 * 2 - 0.23077 = 2.83077
y = 1.53077 * 6 - 0.23077 = 8.95385
y = 1.53077 * 9 - 0.23077 ≈ 13.54615
y = 1.53077 * 13 - 0.23077 ≈ 19.66923
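The worked example above can be checked numerically with the same formula (assuming NumPy). The exact solution for this data set is c = -3/13 and m = 199/130:

```python
import numpy as np

# Data set from the table above.
x = np.array([2.0, 6.0, 9.0, 13.0])
y = np.array([4.0, 8.0, 12.0, 21.0])

# Design matrix with an intercept column, then solve the
# normal equations X^T X w = X^T y.
X = np.column_stack([np.ones_like(x), x])
c, m = np.linalg.solve(X.T @ X, X.T @ y)

print(round(c, 5), round(m, 5))  # -0.23077 1.53077
print(np.round(m * x + c, 5))    # predicted y values
```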
These values of m and c are almost identical to the results obtained by direct partial differentiation in the previous article (small differences come only from rounding). In the next article we will use the matrix form of least squares from this article to handle the multivariable case.
Resources:
https://zh.wikipedia.org/zh/%E6%9C%80%E5%B0%8F%E4%BA%8C%E4%B9%98%E6%B3%95
A First Course in Machine Learning
Article originally published at: Solinx
http://www.solinx.co/archives/721
Linear Regression: The Least Squares Method (Part 2)