Introduction to Machine Learning Algorithms (1): Implementing Linear Regression with Gradient Descent

Source: Internet
Author: User
1. Background

The background of this article is taken from An Introduction to Gradient Descent and Linear Regression; this post aims to describe the linear regression algorithm completely on that basis. Some of the data and figures are taken from that article. There was not much time to dig into the details, so some errors and omissions are inevitable.

The goal of linear regression is simply to fit a line to a set of points so that the error between the point set and the fitting function is minimized. If the fitted curve is a straight line, it is called linear regression; if it is a quadratic curve, it is called quadratic regression. The data comes from the data.csv file in GradientDescentExample, 100 data points in total, as shown in the following figure:

Our goal is to fit these points with a straight line. Since the data is two-dimensional, the familiar formula y = b + mx applies, where b is the intercept of the line on the y-axis and m is its slope. Finding the best fitting line is really a search for the best b and m. To define what "best" means, we first define the error (cost function):

Error_{(b, m)} = \frac{1}{n}\sum_{i=1}^{n}((b + m x_i) - y_i)^{2}

The Python code that calculates the loss function is as follows:

# y = b + mx
def compute_error_for_line_given_points(b, m, points):
    # Mean squared error of the line y = b + mx over all points
    total_error = sum(((b + m * point[0]) - point[1]) ** 2 for point in points)
    return total_error / float(len(points))
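As a quick sanity check (the function is restated here so the snippet runs on its own), points lying exactly on a line give zero error, and each unit of vertical offset adds its square to the mean:

```python
# y = b + mx: mean squared error of the line over all points
def compute_error_for_line_given_points(b, m, points):
    total_error = sum(((b + m * point[0]) - point[1]) ** 2 for point in points)
    return total_error / float(len(points))

# These points lie exactly on y = 1 + 2x, so the error for (b=1, m=2) is 0.0
points = [[0, 1], [1, 3], [2, 5]]
print(compute_error_for_line_given_points(1, 2, points))  # prints 0.0
print(compute_error_for_line_given_points(0, 2, points))  # prints 1.0 (each point is off by 1)
```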

Now the problem becomes: find the parameters b and m that minimize Error_{(b, m)}. Here both x_i and y_i are treated as known values. As the figure below illustrates, the least squares method computes the lowest point directly by mathematical derivation, while gradient descent starts from an arbitrary point on the error surface and moves step by step toward the lowest point.
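The descent step itself can be sketched as follows. Differentiating Error_{(b, m)} gives ∂E/∂b = (2/n)Σ((b + m·x_i) − y_i) and ∂E/∂m = (2/n)Σ x_i·((b + m·x_i) − y_i), and each step moves against the gradient. The function names, learning rate, and iteration count below are illustrative, not taken from the original article:

```python
def step_gradient(b, m, points, learning_rate):
    # One gradient descent step for Error(b, m) = (1/n) * sum(((b + m*x) - y)^2)
    n = float(len(points))
    b_grad = 0.0
    m_grad = 0.0
    for x, y in points:
        b_grad += (2.0 / n) * ((b + m * x) - y)
        m_grad += (2.0 / n) * x * ((b + m * x) - y)
    # Move against the gradient to reduce the error
    return b - learning_rate * b_grad, m - learning_rate * m_grad

def gradient_descent(points, b=0.0, m=0.0, learning_rate=0.0001, num_iterations=1000):
    for _ in range(num_iterations):
        b, m = step_gradient(b, m, points, learning_rate)
    return b, m

# Points exactly on y = 2 + 3x: b and m converge toward 2 and 3
points = [(0.0, 2.0), (1.0, 5.0), (2.0, 8.0), (3.0, 11.0)]
b, m = gradient_descent(points, learning_rate=0.01, num_iterations=20000)
```

The learning rate trades off speed against stability: too large and the iterates diverge, too small and convergence crawls.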

2. Multivariate linear regression model

From a machine learning point of view, the data above has only one feature, so a one-dimensional linear regression model suffices. Here we generalize the conclusions of the one-dimensional linear model to the multivariate linear regression model. This part refers to Machine Learning Mathematics (1) - Regression, Gradient Descent. Suppose there are n features x_1, x_2, ..., x_n in total, with θ as the parameters for x
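The paragraph above is cut off, but the standard form of the multivariate hypothesis it is introducing can be sketched as follows (this follows the common convention of adding a dummy feature x_0 = 1; it is a reconstruction, not this article's exact notation):

```latex
% Hypothesis with parameters \theta_0, \theta_1, \ldots, \theta_n,
% where the dummy feature x_0 = 1 absorbs the intercept:
h_\theta(x) = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n
            = \sum_{j=0}^{n} \theta_j x_j
            = \theta^{T} x
```

In the one-dimensional case above, θ_0 plays the role of b and θ_1 the role of m.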
