Linear regression: the least squares method (I)


Most of us have met linear regression in mathematical statistics. This article explains univariate (simple) linear regression and walks through the complete process of using the least squares method to find the optimal solution of the linear regression loss function: first the least squares formulas are derived, then they are used to fit a straight line to a simple data set.

Linear regression

Linear regression assumes that there is a linear relationship between the features of the data set and the result:

Equation: y = mx + c

  Here y is the result, x is the feature, m is the coefficient, and c is the error term; geometrically, m is the gradient (slope) and c is the intercept.

Given this equation, we need to find the m and c that make the error between the estimate mx + c and the true value y as small as possible. The squared difference is used to measure the error between the estimated value and the true value (the plain difference would not work, since positive and negative errors could cancel). The function that measures the error between the true and predicted values is called the squared loss function; writing L for the loss, for a single point we have:

L = (y - ŷ)² = (y - (mx + c))²

The average loss over the entire data set of n points is:

L = (1/n) Σᵢ (yᵢ - (m·xᵢ + c))²

We want the best-matching m and c, i.e. those that make L smallest. As a mathematical expression:

(m, c) = argmin over m, c of (1/n) Σᵢ (yᵢ - (m·xᵢ + c))²

  Finding the best match by minimizing the sum of squared errors in this way is called the least squares method; here it is used to obtain the optimal solution of the linear regression.
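As a concrete illustration, the average squared loss can be computed directly from its definition. This is a minimal Python sketch; the function name and sample data are my own, not from the original article:

```python
def average_squared_loss(xs, ys, m, c):
    """Average squared loss: L = (1/n) * sum((y_i - (m*x_i + c))**2)."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n

# A line that fits the data exactly has zero loss; a worse line has a larger one:
xs, ys = [1, 2, 3], [2, 4, 6]              # points on the line y = 2x
print(average_squared_loss(xs, ys, 2, 0))  # 0.0
print(average_squared_loss(xs, ys, 1, 0))  # larger than 0
```

Least squares searches for the (m, c) pair that makes this quantity as small as possible.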

Least squares

To make the least squares derivation easier to follow, assume the data set consists of n points numbered 1 … n, each consisting of a pair (xᵢ, yᵢ), where x represents the feature and y the result. The linear regression model is defined as:

ŷᵢ = m·xᵢ + c

The average loss function is defined as:

L = (1/n) Σᵢ (yᵢ - (m·xᵢ + c))²
To make L as small as possible, set its partial derivatives with respect to c and m to 0; solving the resulting equations for c and m yields the values that minimize L, i.e. the c and m that best match the model.

Partial derivative with respect to c:

Since we are differentiating with respect to c, the terms of L that do not contain c can be dropped (their derivative is 0). Expanding the square and keeping only the terms containing c:

L_c = (1/n) Σᵢ (c² + 2m·xᵢ·c - 2yᵢ·c)

Rearranging, the factors that do not involve the summation index are moved outside the sums; writing mean(x) = (1/n)Σᵢxᵢ and mean(y) = (1/n)Σᵢyᵢ:

L_c = c² + 2m·c·mean(x) - 2c·mean(y)

The partial derivative with respect to c is then:

∂L/∂c = 2c + 2m·mean(x) - 2·mean(y)

Partial derivative with respect to m:

Likewise, for the partial derivative with respect to m, drop the terms of L that do not contain m:

L_m = (1/n) Σᵢ (m²·xᵢ² + 2m·c·xᵢ - 2m·xᵢ·yᵢ)

Rearranging, the factors that do not involve the summation index are moved outside the sums; writing mean(x²) = (1/n)Σᵢxᵢ² and mean(xy) = (1/n)Σᵢxᵢyᵢ:

L_m = m²·mean(x²) + 2m·c·mean(x) - 2m·mean(xy)

The partial derivative with respect to m is then:

∂L/∂m = 2m·mean(x²) + 2c·mean(x) - 2·mean(xy)
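Before solving, the two partial derivatives can be sanity-checked against a numerical finite-difference approximation of the loss. A minimal sketch (the function names, sample data, and the step size h are my own choices, not from the original article):

```python
def loss(xs, ys, m, c):
    """Average squared loss L = (1/n) * sum((y_i - (m*x_i + c))**2)."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n

def partials(xs, ys, m, c):
    """Analytic partial derivatives derived above:
    dL/dm = 2m*mean(x^2) + 2c*mean(x) - 2*mean(xy)
    dL/dc = 2c + 2m*mean(x) - 2*mean(y)"""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    xybar = sum(x * y for x, y in zip(xs, ys)) / n
    x2bar = sum(x * x for x in xs) / n
    dm = 2 * m * x2bar + 2 * c * xbar - 2 * xybar
    dc = 2 * c + 2 * m * xbar - 2 * ybar
    return dm, dc

# Compare against central finite differences at an arbitrary (m, c):
xs, ys, m, c, h = [2, 6, 9, 13], [4, 8, 12, 21], 1.0, 0.5, 1e-6
dm, dc = partials(xs, ys, m, c)
num_dm = (loss(xs, ys, m + h, c) - loss(xs, ys, m - h, c)) / (2 * h)
num_dc = (loss(xs, ys, m, c + h) - loss(xs, ys, m, c - h)) / (2 * h)
print(abs(dm - num_dm) < 1e-4, abs(dc - num_dc) < 1e-4)  # True True
```

Because L is quadratic in m and c, the central difference agrees with the analytic derivatives up to floating-point error.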

Set the partial derivative with respect to c equal to 0 and solve:

2c + 2m·(1/n)Σᵢxᵢ - 2·(1/n)Σᵢyᵢ = 0

c = (1/n)Σᵢyᵢ - m·(1/n)Σᵢxᵢ

The two sums in this solution are mean values, so it can be rewritten as:

c = mean(y) - m·mean(x)

Set the partial derivative with respect to m equal to 0 and solve. This partial derivative depends on c; substituting the solution c = mean(y) - m·mean(x) obtained above gives:

2m·mean(x²) + 2·(mean(y) - m·mean(x))·mean(x) - 2·mean(xy) = 0

Collecting the terms containing m:

m·(mean(x²) - mean(x)²) = mean(xy) - mean(x)·mean(y)

Solving:

m = (mean(xy) - mean(x)·mean(y)) / (mean(x²) - mean(x)²)

To summarize, with the mean values mean(x) = (1/n)Σᵢxᵢ, mean(y) = (1/n)Σᵢyᵢ, mean(xy) = (1/n)Σᵢxᵢyᵢ and mean(x²) = (1/n)Σᵢxᵢ², the least squares solution is:

m = (mean(xy) - mean(x)·mean(y)) / (mean(x²) - mean(x)²)
c = mean(y) - m·mean(x)
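The closed-form least squares solution translates directly into code. A minimal sketch (the function name and test data are my own):

```python
def least_squares_fit(xs, ys):
    """Fit y = m*x + c by the closed-form least squares solution:
    m = (mean(xy) - mean(x)*mean(y)) / (mean(x^2) - mean(x)^2)
    c = mean(y) - m*mean(x)"""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    xybar = sum(x * y for x, y in zip(xs, ys)) / n
    x2bar = sum(x * x for x in xs) / n
    m = (xybar - xbar * ybar) / (x2bar - xbar ** 2)
    c = ybar - m * xbar
    return m, c

m, c = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])  # points on y = 2x + 1
print(m, c)  # 2.0 1.0
```

No iterative optimization is needed: four mean values fully determine the best-fitting line.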

Example:

The following data set is fitted with a straight line using the least squares formulas obtained above:

n     x     y      xy      x²
1     2     4      8       4
2     6     8      48      36
3     9     12     108     81
4     13    21     273     169
mean  7.5   11.25  109.25  72.5

(Figure: distribution of the data points; scatter plot omitted.)

Based on the least squares formulas above, the optimal m and c for the current data set are:

m = (109.25 - 7.5 · 11.25) / (72.5 - 7.5²) = 24.875 / 16.25 ≈ 1.5307

c = 11.25 - 1.5307 · 7.5 ≈ -0.23

Finally, we conclude that the fitted linear function is:

y = 1.5307x - 0.23

Calculate the predicted value for each point:

y₁ = 1.5307 · 2 - 0.23 = 2.8314
y₂ = 1.5307 · 6 - 0.23 = 8.9542
y₃ = 1.5307 · 9 - 0.23 = 13.5463
y₄ = 1.5307 · 13 - 0.23 = 19.6691
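The worked example can be reproduced in a few lines of Python; this sketch recomputes m, c and the predictions from the table's data:

```python
# Data from the table above
xs = [2, 6, 9, 13]
ys = [4, 8, 12, 21]
n = len(xs)

xbar = sum(xs) / n                              # mean(x)   = 7.5
ybar = sum(ys) / n                              # mean(y)   = 11.25
xybar = sum(x * y for x, y in zip(xs, ys)) / n  # mean(xy)  = 109.25
x2bar = sum(x * x for x in xs) / n              # mean(x^2) = 72.5

m = (xybar - xbar * ybar) / (x2bar - xbar ** 2)
c = ybar - m * xbar
print(round(m, 4), round(c, 4))  # 1.5308 -0.2308 (the article rounds to 1.5307 and -0.23)

# Predictions for each data point (these differ in the last digits from the
# hand computation above, which uses the rounded m and c):
for x in xs:
    print(round(m * x + c, 4))
```

Working with the unrounded m and c gives slightly more accurate predictions than the hand computation.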

(Figure: the fitted line plotted against the data points; plot omitted.)

Resources:
https://zh.wikipedia.org/zh/%E6%9C%80%E5%B0%8F%E4%BA%8C%E4%B9%98%E6%B3%95
A First Course in Machine Learning

Originally published at: Solinx
http://www.solinx.co/archives/648
