Least squares learning (shared from other bloggers)


Two. Least squares

We explain least squares using the simplest case: the unary (simple) linear model. What is a unary linear model? In supervised learning, if the predicted variable is discrete, we call the task classification (e.g., decision trees, support vector machines); if the predicted variable is continuous, we call it regression. In regression analysis, if there is only one independent variable and one dependent variable, and the relationship between the two can be approximated by a straight line, the analysis is called unary linear regression. If the regression includes two or more independent variables, and the dependent variable is linear in the independent variables, it is called multivariate linear regression. In two-dimensional space this linear relationship is a straight line, in three-dimensional space it is a plane, and in higher-dimensional space it is a hyperplane.

For a unary linear regression model, suppose n observations (x1, y1), (x2, y2), ..., (xn, yn) are obtained from the population. These n points in the plane could be fitted by infinitely many curves, and we require the sample regression function to fit this set of values as well as possible. Intuitively, the most reasonable line runs through the central position of the sample data. The criterion for selecting the best-fitting line can therefore be stated as: minimize the total fitting error (i.e., the total residual). The following three criteria can be considered:

(1) Determine the line's position by minimizing the sum of residuals. But it is soon found that the plain sum of residuals has a problem: positive and negative residuals offset each other (as the sketch after this list illustrates).
(2) Determine the line's position by minimizing the sum of absolute residuals. But calculating with absolute values is troublesome.
(3) The principle of least squares: determine the line's position by minimizing the sum of squared residuals. Besides being convenient to compute, the resulting estimators also have good statistical properties. However, this method is sensitive to outliers.
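To make the cancellation problem in criterion (1) concrete, here is a minimal sketch (assuming NumPy is available; the toy data and the candidate line are invented for illustration) that evaluates all three criteria on the same imperfect fit:

```python
import numpy as np

# Toy data scattered around the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 6.8, 9.0])

# A candidate line that overshoots on one side and undershoots on the other.
y_hat = 2.5 * x + 0.0
residuals = y - y_hat

print(residuals.sum())          # criterion (1): near zero, errors cancel out
print(np.abs(residuals).sum())  # criterion (2): large, but awkward to optimize
print((residuals ** 2).sum())   # criterion (3): the least-squares objective
```

Even though the candidate line fits poorly, its plain residual sum is essentially zero, which is exactly why criterion (1) cannot distinguish good fits from bad ones.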

Ordinary least squares (OLS) is the most commonly used criterion: the selected regression model should minimize the sum of squared residuals over all observations, i.e., it uses the square loss function (Q below denotes the sum of squared residuals).

Sample regression model:

$$y_i = \beta_0 + \beta_1 x_i + e_i, \quad i = 1, \dots, n$$

where $e_i$ is the error of the sample $(x_i, y_i)$.

Square loss function:

$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$

Then the line is determined by minimizing Q. Regarding Q as a function of the two parameters to be estimated turns this into an extremum problem, which can be solved via derivatives. Take the partial derivatives of Q with respect to the two parameters:

$$\frac{\partial Q}{\partial \beta_0} = -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)$$

$$\frac{\partial Q}{\partial \beta_1} = -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)x_i$$

From calculus we know that the extremum of a function occurs at a point where its partial derivatives are 0, so we set both partial derivatives to zero.

Solving these two equations yields:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\,\bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of $x_i$ and $y_i$.

This is the method of least squares: it obtains the parameters at the extremum point of the square loss function in closed form.
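To make the closed-form solution concrete, here is a minimal Python sketch (assuming NumPy is available; the toy data are invented for illustration) implementing the estimates derived above:

```python
import numpy as np

def ols_fit(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Closed-form unary OLS: minimize Q = sum((y - b0 - b1*x)**2)."""
    n = len(x)
    x_bar, y_bar = x.mean(), y.mean()
    # beta1 = (sum(x*y) - n*x_bar*y_bar) / (sum(x**2) - n*x_bar**2)
    beta1 = (np.sum(x * y) - n * x_bar * y_bar) / (np.sum(x ** 2) - n * x_bar ** 2)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Toy data scattered around y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 6.8, 9.0])
beta0, beta1 = ols_fit(x, y)
print(beta0, beta1)  # close to 1 and 2
```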

Four. Least squares and gradient descent method

Both least squares and gradient descent find the minimum of a loss function by using its derivatives. What, then, is the difference between them?

Similarities

1. Essentially the same: given known data (independent and dependent variables), both methods compute a general estimation function for the dependent variable, which is then used to estimate the dependent variable for new data.
2. The same goal: both aim, within the framework of the known data, to make the total squared difference between the estimated values and the actual values as small as possible (in fact, the square is not strictly required; a later post on gradient ascent, for logistic regression, uses a different loss). The formula for the total squared difference between the estimated and actual values is:

$$Q(\theta) = \sum_{i=1}^{n}\left(\theta^{T} x_i - y_i\right)^2$$

where $x_i$ is the independent variable of the i-th data point, $y_i$ is the dependent variable of the i-th data point, and $\theta$ is the coefficient vector.


Differences
1. The implementation method and results are different: least squares finds the global minimum directly by differentiation (the square loss is convex, so the stationary point is the global minimum); it is a non-iterative method. Gradient descent is an iterative method: it first picks an initial point, then repeatedly adjusts it in the direction of fastest descent, reaching a local minimum after a number of iterations. The disadvantages of gradient descent are that convergence slows down near the minimum point and that the result is sensitive to the choice of initial point; most improvements to the method target these two aspects.
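For comparison, here is a minimal gradient descent sketch for the same square loss (assuming NumPy is available; the learning rate and iteration count are arbitrary illustrative choices, and the toy data match the earlier example). It converges toward the same coefficients as the closed-form solution above:

```python
import numpy as np

def gd_fit(x: np.ndarray, y: np.ndarray,
           lr: float = 0.01, n_iters: int = 5000) -> tuple[float, float]:
    """Minimize Q = sum((y - b0 - b1*x)**2) by gradient descent."""
    beta0, beta1 = 0.0, 0.0  # arbitrary initial point
    for _ in range(n_iters):
        residuals = y - (beta0 + beta1 * x)
        # Partial derivatives of Q, same expressions as in the derivation above.
        grad0 = -2.0 * residuals.sum()
        grad1 = -2.0 * (residuals * x).sum()
        # Step in the direction of fastest descent.
        beta0 -= lr * grad0
        beta1 -= lr * grad1
    return beta0, beta1

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 6.8, 9.0])
print(gd_fit(x, y))  # approaches the closed-form estimates (about 1.1, 1.95)
```

Note that each update uses exactly the partial derivatives from the earlier derivation: the closed form solves them for zero in one step, while gradient descent follows them downhill over many iterations.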
