Machine Learning Algorithm: Linear Regression

Source: Internet
Author: User

A brief introduction to the linear regression algorithm

Linear regression is a widely used statistical method that applies regression analysis to determine the quantitative relationship between two or more variables. Its expression is y = w'x + e, where the error e follows a normal distribution with mean 0.

If a regression analysis includes only one independent variable and one dependent variable, and the relationship between the two can be approximated by a straight line, it is called unary (simple) linear regression analysis. If it includes two or more independent variables, and the dependent variable is linear in the independent variables, it is called multivariate linear regression analysis.

This article mainly covers the derivation of the linear regression algorithm. For a detailed general introduction to linear regression, see the entry in Baidu Encyclopedia.

Linear regression is one of the basic algorithms in machine learning, so readers who want to learn machine learning should aim to understand it fully.

Derivation of the linear regression algorithm

Suppose the credit limit a bank grants on a card application depends on two parameters, the applicant's age and salary. Given one applicant's information, how can we use that person's age and salary to predict the credit limit he or she can apply for?

For a linear relationship we usually write y = ax + b. In that form y depends on only one x, and the relationship can be approximated by a straight line; this is called unary linear regression. In our case, let the credit limit be h and let the salary and the age be x1 and x2 respectively. The relationship can then be expressed by the following formula. Because the result depends on multiple variables, this is called multivariate linear regression analysis.

    h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2

We write the θ and x in the formula above as two vectors, [θ0 θ1 θ2] and [x0 x1 x2], where we set x0 = 1, so that the formula becomes:

    h_\theta(x) = \sum_{j=0}^{2} \theta_j x_j = \theta^T x
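As a minimal sketch of this vector form, assuming NumPy, the snippet below prepends x0 = 1 so that the intercept θ0 is handled by the same dot product as the other parameters. The function name `predict` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(theta, features):
    """Return h_theta(x) = theta^T x, with x0 = 1 prepended to the features."""
    x = np.concatenate(([1.0], features))  # x = [x0, x1, x2] with x0 = 1
    return float(theta @ x)

# Hypothetical parameters: intercept, weight for salary, weight for age.
theta = np.array([1000.0, 2.0, 50.0])

# Hypothetical applicant: salary 4000, age 30.
credit = predict(theta, np.array([4000.0, 30.0]))
print(credit)  # 1000 + 2*4000 + 50*30 = 10500.0
```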

The actual results will not match our computed results exactly, so there must be an error between the two. Assume that for the i-th sample the following relationship holds, where ε^(i) is the true error:

    y^{(i)} = \theta^T x^{(i)} + \varepsilon^{(i)}

The errors ε^(i) are independent and identically distributed (usually assumed to follow a Gaussian distribution with mean 0 and variance σ²).
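To make the error assumption concrete, here is a small simulation sketch, assuming NumPy. It draws independent Gaussian errors with mean 0 and adds them to a linear prediction. The "true" parameters, the noise level, and the value ranges are all invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

theta_true = np.array([1000.0, 2.0, 50.0])  # hypothetical "true" parameters
m = 10_000                                  # number of simulated samples
sigma = 300.0                               # hypothetical noise standard deviation

salary = rng.uniform(2000.0, 10000.0, size=m)
age = rng.uniform(20.0, 60.0, size=m)
X = np.column_stack([np.ones(m), salary, age])  # each row is [x0=1, x1, x2]

eps = rng.normal(loc=0.0, scale=sigma, size=m)  # i.i.d. Gaussian errors, mean 0
y = X @ theta_true + eps                        # y = theta^T x + error, per sample

# The residuals against the true parameters are exactly the simulated errors,
# so their sample mean is close to 0 and their spread is close to sigma.
residual = y - X @ theta_true
print(residual.mean(), residual.std())
```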

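So we get the following formulas: first the density of the error ε^(i) of the i-th sample, where σ² is the variance of the error, and then, after substituting ε^(i) = y^(i) - θ^T x^(i), the density of y^(i) given x^(i):

    p(\varepsilon^{(i)}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(\varepsilon^{(i)})^2}{2\sigma^2} \right)

    p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y^{(i)} - \theta^T x^{(i)})^2}{2\sigma^2} \right)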

So, given a large number of samples (x^(i), y^(i)), we can use them to estimate the parameter θ.

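The likelihood function over the m samples is as follows:

    L(\theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y^{(i)} - \theta^T x^{(i)})^2}{2\sigma^2} \right)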

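Taking the logarithm of the likelihood function:

    \ell(\theta) = \log L(\theta) = m \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{\sigma^2} \cdot \frac{1}{2} \sum_{i=1}^{m} (y^{(i)} - \theta^T x^{(i)})^2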

The maximum likelihood estimate of θ is obtained by differentiating the formula above with respect to θ and setting the derivative to 0.
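Before differentiating, it may help to evaluate the Gaussian log-likelihood numerically. The sketch below, assuming NumPy and a tiny made-up data set, computes the log-likelihood for two candidate θ vectors. The function name `log_likelihood` and the data are hypothetical.

```python
import numpy as np

def log_likelihood(theta, X, y, sigma):
    """Gaussian log-likelihood of theta for the model y = X @ theta + error."""
    m = len(y)
    residual = y - X @ theta
    const = m * np.log(1.0 / (np.sqrt(2.0 * np.pi) * sigma))  # does not depend on theta
    return const - np.sum(residual ** 2) / (2.0 * sigma ** 2)

# Tiny hypothetical data set: rows are [x0=1, x1, x2].
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 5.0],
              [1.0, 4.0, 3.0]])
y = np.array([9.0, 8.0, 22.0, 18.0])

theta_good = np.array([1.0, 2.0, 3.0])  # reproduces y exactly on this data
theta_bad = np.array([0.0, 0.0, 0.0])   # predicts 0 for every sample

ll_good = log_likelihood(theta_good, X, y, sigma=1.0)
ll_bad = log_likelihood(theta_bad, X, y, sigma=1.0)
print(ll_good > ll_bad)  # True
```

The two values differ only through the sum of squared errors, which is why a smaller squared error gives a larger log-likelihood.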

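In the log-likelihood, two parts are constants with respect to θ. The first term, m log(1/(√(2π)σ)), becomes zero after differentiation, and the factor 1/σ² in the second term is a positive constant that does not affect the final result. So we only need the derivative of the remaining part to be 0. Because that part enters with a minus sign, maximizing the log-likelihood is the same as minimizing it. So let:

    J(\theta) = \frac{1}{2} \sum_{i=1}^{m} (y^{(i)} - \theta^T x^{(i)})^2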

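We write the sum of squared errors J(θ) in matrix form, where X is the matrix whose i-th row is the transpose of x^(i) and y is the vector of the y^(i), and take the partial derivative with respect to θ:

    J(\theta) = \frac{1}{2} (y - X\theta)^T (y - X\theta)

    \nabla_\theta J(\theta) = X^T X \theta - X^T y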

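The maximum likelihood estimate of θ, which is also the least squares estimate, is obtained by setting the derivative to 0. With the sample matrix X and the target vector y, and assuming X^T X is invertible:

    \theta = (X^T X)^{-1} X^T y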

Once we have θ, we have trained a linear regression model from the samples, and we can use it to predict the result for data whose outcome is unknown.
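The fit-then-predict workflow can be sketched as follows, assuming NumPy. The snippet solves the least squares system (X^T X) θ = X^T y and applies the fitted θ to a new sample. The training data is invented and generated without noise from a hypothetical θ, so the fit recovers that θ exactly. The function name `fit_normal_equation` is also hypothetical.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Solve (X^T X) theta = X^T y for theta (assumes X^T X is invertible)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical training samples: rows are [x0=1, salary in thousands, age].
X_train = np.array([[1.0, 4.0, 25.0],
                    [1.0, 8.0, 30.0],
                    [1.0, 5.0, 45.0],
                    [1.0, 9.0, 50.0],
                    [1.0, 6.0, 35.0]])
theta_true = np.array([2.0, 3.0, 0.5])  # hypothetical parameters behind the data
y_train = X_train @ theta_true          # noise-free targets, for illustration only

theta = fit_normal_equation(X_train, y_train)

# Predict for a new applicant whose result is unknown (x0 = 1 included).
x_new = np.array([1.0, 7.0, 40.0])
prediction = float(x_new @ theta)
print(theta, prediction)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is usually the more numerically stable choice.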

PS: Readers only need to understand the derivation of the algorithm. The numerical computation can be left to a program, with no need to calculate by hand (for multidimensional matrices the amount of computation is quite large, and it is easy to make mistakes ( ̄▽ ̄)).
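As one example of leaving the computation to a program, NumPy provides a least squares solver, `np.linalg.lstsq`. The sketch below uses a small made-up data set and checks that the solver agrees with the closed-form least squares solution.

```python
import numpy as np

# Hypothetical noisy data: rows are [x0=1, x1, x2].
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 5.0],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 4.0]])
y = np.array([9.1, 7.9, 22.2, 17.8, 23.0])

# Library solver: returns the solution, residual sum of squares, rank, singular values.
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

# Closed-form least squares solution for comparison: solve (X^T X) theta = X^T y.
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(theta_lstsq, theta_normal))  # True
```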
