Stanford Machine Learning Open Course Notes (I): Linear Regression with One Variable

Source: Internet
Author: User

Course address: https://class.coursera.org/ml-003/class/index

Instructor: Andrew Ng

1. Model Representation

Consider a question: how can we predict the price of a house of a given area from existing data on house prices and areas? This is a linear regression problem. The given data serve as training samples: we train on them to obtain a model (in practice, a function) that represents the relationship between price and area, and then use this function for prediction. The basic process is as follows: the training set is fed to a learning algorithm, which outputs a function h; given the area of a house, h returns a predicted price.

2. Cost Function

PS: In fact, the cost function itself is not introduced in this section.

Since all we need to train is a function, the first step is to assume a form for it. Here we assume the simplest one, a linear function:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

The question now becomes how to find the values of the parameters $\theta$. We already have some training data, and although we do not know whether the data are really linearly related as we assume, we can tolerate a little deviation: we just need the function values to be as close as possible to the actual values. Here let $x$ be the area and $y$ the price. The training samples then appear as a series of points on a two-dimensional plane.
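As a minimal sketch of the linear form assumed above, the hypothesis can be written as a small function. The name `h` follows the notes; the parameter values in the usage line are hypothetical and chosen only for illustration.

```python
def h(theta0, theta1, x):
    """Linear hypothesis: predicted price y for a house of area x."""
    return theta0 + theta1 * x


# Hypothetical parameters: intercept 50, slope 2 (not taken from real data).
print(h(50.0, 2.0, 100.0))
```

Different choices of `theta0` and `theta1` give different lines; training means picking the pair whose line passes closest to the sample points.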


3. Cost Function Intuition 1

We have already said that the error between the value of the function $h$ and the actual value should be as small as possible. Here is a more precise statement:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$


The cost function $J$ represents the error mentioned above. It is written in this form to simplify the derivatives taken later. Our goal is to minimize $J$. Note that in $J$ the parameters $\theta$ are the variables, and $m$ is the number of training samples (that is, the number of points in the coordinate system).
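The following sketch computes such a cost, assuming the usual squared-error form (the sum of squared differences between prediction and actual value, divided by 2m). The function name `cost` and the three sample points are made up for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) over m training samples."""
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Made-up training set lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

print(cost(0.0, 1.0, xs, ys))  # the line y = x fits perfectly, so J is 0
print(cost(0.0, 0.5, xs, ys))  # a flatter line misses every point, so J > 0
```

The training data stay fixed here; only the parameters change between the two calls, which is what it means for $\theta$ to be the variable of $J$.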

4. Cost Function Intuition 2

For any given values of $\theta_0$ and $\theta_1$ we obtain a corresponding function $h$ (a line) and a cost $J$ (a number). Putting the two side by side, we can see clearly that $J$ is smallest when $h$ fits the sample points best, and that this minimum is unique and global. An approximate 3D representation of the cost function $J$ is a bowl-shaped surface over the $(\theta_0, \theta_1)$ plane.




As you can see, when the function $h$ fits the sample points best, the cost function $J$ also attains its minimum. The corresponding values of $\theta_0$ and $\theta_1$ are the horizontal and vertical coordinates of that point in the plot of $J$.

5. Gradient Descent

To minimize $J$, the idea is to change the value of $\theta$ and thereby change $J$. There is no requirement on the initial value of $\theta$; only the changes matter. Gradient descent means moving $\theta$ against the gradient of $J$, the direction in which $J$ decreases fastest, so that the value of $J$ goes down at each step. Visually, this is like walking downhill on the surface of $J$.


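With a little calculus, this descent process can be expressed by the following formula, applied to $j = 0$ and $j = 1$ simultaneously:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$$

where $\alpha$ is the learning rate (greater than 0). It can be understood as controlling the step size of each descent, and it must be set manually.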
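A minimal sketch of this descent process, under stated assumptions: `gradient_descent` and `toy_grad` are illustrative names, and the toy cost below is not from the course. Each iteration computes every partial derivative at the current point first and only then updates all parameters together.

```python
def gradient_descent(grad, theta, alpha, steps):
    """Repeatedly move each theta_j by -alpha times its partial derivative."""
    for _ in range(steps):
        g = grad(theta)  # all partial derivatives at the current theta
        theta = tuple(t - alpha * gj for t, gj in zip(theta, g))
    return theta


# Hypothetical cost J = (theta0 - 1)^2 + (theta1 + 2)^2 with minimum at (1, -2).
def toy_grad(theta):
    theta0, theta1 = theta
    return (2 * (theta0 - 1), 2 * (theta1 + 2))


result = gradient_descent(toy_grad, (0.0, 0.0), alpha=0.1, steps=200)
print(result)
```

Starting from a different initial point should lead to the same result for this toy cost, since it has a single minimum.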

6. Gradient Descent Intuition

We can do a simple sanity check of the gradient descent formula above. When $\theta$ is too large (to the right of the minimum), the gradient is positive, so $\theta$ decreases at each iteration and $J$ decreases as well. When $\theta$ is too small, the gradient is negative, so $\theta$ increases and $J$ again decreases. The idea of gradient descent is therefore sound.


In contrast, choosing $\alpha$ is not that simple. It needs to be moderate: too large or too small are both bad.


When $\alpha$ is too small, each step is tiny, many steps are needed to reach the minimum, and convergence is slow. When $\alpha$ is too large, the steps overshoot: $\theta$ oscillates around the minimum without reaching it, and may even diverge. Moreover, even with a moderate $\alpha$, gradient descent may get stuck in a local minimum and fail to reach the global minimum.
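The effect of the learning rate can be sketched on a toy cost $J(\theta) = \theta^2$, whose derivative is $2\theta$ and whose minimum is at 0. The function `run` and the four learning rates are assumptions made for illustration only.

```python
def run(alpha, steps=10, theta=1.0):
    """Gradient descent on the toy cost J(theta) = theta**2."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta  # dJ/dtheta = 2 * theta
    return theta


# Too small, moderate, oscillating, and diverging learning rates.
for alpha in (0.01, 0.4, 1.0, 1.1):
    print(alpha, run(alpha))
```

With 0.01 the iterate has barely moved after ten steps, with 0.4 it is already very close to 0, with 1.0 it jumps between 1 and -1 forever, and with 1.1 it moves farther away at every step.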


One more point: even if $\alpha$ is fixed, the step size decreases automatically during gradient descent, because the gradient shrinks as we approach the minimum. We therefore do not need to reduce $\alpha$ near the minimum to avoid stepping over the lowest point.
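This shrinking of the step can be seen in a short sketch, again on the assumed toy cost $J(\theta) = \theta^2$ with a fixed learning rate; the variable names are illustrative.

```python
theta, alpha = 1.0, 0.1  # alpha stays fixed for the whole run
step_sizes = []
for _ in range(5):
    step = alpha * 2 * theta  # alpha times the gradient of J(theta) = theta**2
    step_sizes.append(abs(step))
    theta = theta - step

print(step_sizes)  # each entry is smaller than the one before it
```

The learning rate never changes, yet each move is shorter than the last because the gradient itself gets smaller as $\theta$ approaches 0.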

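7. Gradient Descent for Linear Regression

For linear regression we already have the cost function $J$, so the gradient descent update takes the following form:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$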


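Working out the partial derivatives gives a clearer form:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

Through iteration we reach the minimum of $J$. In general gradient descent is only guaranteed to find a local minimum, but for linear regression $J$ is bowl-shaped (convex), so the minimum reached is the global one.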
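Putting the pieces together, here is a minimal sketch of gradient descent for single-variable linear regression. It assumes the squared-error cost with the 1/(2m) factor, so the partial derivatives are the mean error and the mean of error times $x$. The function name `fit_line`, the default learning rate, the step count, and the data are all assumptions for illustration.

```python
def fit_line(xs, ys, alpha=0.1, steps=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
        # Simultaneous update: both gradients use the old theta values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Made-up samples lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = fit_line(xs, ys)
print(theta0, theta1)
```

On this noise-free data the result should approach an intercept of 1 and a slope of 2. With real data the learning rate usually needs tuning, and scaling the inputs first tends to help convergence.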


------------------------------------------------------------

The above is the first article, covering linear regression with one variable. The idea is quite clear. First, the form of the model function is chosen based on the training data, and the cost function is obtained by measuring the error against the actual values. Then the parameters of the model function are determined by gradient descent on the cost function. Once the parameters are determined, we can use the model to make predictions. However, because this model is linear and has only one variable, its predictions are not very accurate. The natural next step is to introduce multiple variables and non-linear cases, which are more complicated.
