Machine Learning - Week 2 - Multivariate Linear Regression


Gradient Descent in Practice: Feature Scaling

Make sure the features are on a similar scale.

When the features have similar, small ranges, the contours of the cost function J(θ) are less elongated, so gradient descent takes a more direct path to the minimum and converges faster.

Dividing by the Range

Dividing each feature by its range brings every feature into roughly the range [-1, 1].
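As a rough sketch in Python with NumPy (the feature values here are made up for illustration):

import numpy as np

# Each row is a training example, each column a feature (the x0 = 1 column is omitted).
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [ 852.0, 2.0]])

ranges = X.max(axis=0) - X.min(axis=0)   # range of each feature (max - min)
X_scaled = X / ranges                    # each feature is scaled down by its range
print(X_scaled)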

One of the quiz questions below works through an example.

Mean Normalization

Shifts each feature value so that it is close to 0. The exception is x0, whose value is always 1.

μ1 is the average value of x1 in the training set;

s1 is the range of x1. For example, if the number of bedrooms lies in [0, 5], the range is 5 - 0 = 5. The update is x1 := (x1 - μ1) / s1.
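A minimal sketch of mean normalization for a single feature (the bedroom counts are example values matching the range above):

import numpy as np

x1 = np.array([0.0, 1.0, 2.0, 5.0])   # number of bedrooms, range [0, 5]

mu1 = x1.mean()               # average value of x1 in the training set
s1 = x1.max() - x1.min()      # range of x1: 5 - 0 = 5

x1_norm = (x1 - mu1) / s1     # values are now centered close to 0
print(x1_norm)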

Ensuring Gradient Descent Works Correctly

For example, in a correct plot, J(θ) decreases as the number of iterations increases, and after enough iterations the J(θ) curve flattens out. You can judge from the plot when to stop, or declare convergence when the change in J(θ) in a single iteration is less than some small ε.

If the J(θ) plot rises instead of falling, α is too large and should be reduced.

If α is sufficiently small, gradient descent may be slow, but it will converge.

If α is too large, J(θ) may not decrease on every iteration, and gradient descent may fail to converge at all.
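A sketch of gradient descent for linear regression that records J(θ) on every iteration and stops once the per-iteration decrease falls below ε; the function names, the default alpha, and the epsilon value are illustrative choices, not fixed by the course:

import numpy as np

def cost(X, y, theta):
    # J(theta) = 1/(2m) * sum((X @ theta - y)^2)
    m = len(y)
    err = X @ theta - y
    return err @ err / (2 * m)

def gradient_descent(X, y, alpha=0.01, epsilon=1e-6, max_iters=10000):
    m, n = X.shape
    theta = np.zeros(n)
    history = [cost(X, y, theta)]
    for _ in range(max_iters):
        grad = X.T @ (X @ theta - y) / m     # gradient of J(theta)
        theta = theta - alpha * grad
        history.append(cost(X, y, theta))
        # Declare convergence when J(theta) decreases by less than epsilon in one iteration.
        if history[-2] - history[-1] < epsilon:
            break
    return theta, history

Plotting history against the iteration number gives the kind of curve described above: it should fall and then flatten out; if it rises, reduce alpha.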

Features and Polynomial Regression

You can define new features instead of using the existing ones directly. For example, if a house has two properties, length and width, we can create a new feature, area = length × width, and fit the hypothesis in terms of area alone. Fitting, say, a quadratic h_θ(x) = θ0 + θ1·x + θ2·x² produces a curve that eventually turns back down, which does not match the actual data (the larger the area, the higher the total price). So adjust the model to something like h_θ(x) = θ0 + θ1·x + θ2·√x, which keeps increasing.
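A sketch of building the two candidate feature sets by hand (the area values are made up; which transformation fits best depends on the data). Feature scaling matters here, because x, x², and √x have very different ranges:

import numpy as np

area = np.array([500.0, 1000.0, 2000.0, 3000.0])   # hypothetical areas

# Design matrices for h_theta(x) = theta0 + theta1*x + theta2*f(x)
X_quadratic = np.column_stack([np.ones_like(area), area, area ** 2])      # f(x) = x^2
X_sqrt      = np.column_stack([np.ones_like(area), area, np.sqrt(area)])  # f(x) = sqrt(x)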

Normal Equation

Gradient descent gradually approaches the minimum as the number of iterations increases.

The normal equation instead solves for θ directly.

The minimum is where the derivatives equal 0: set the partial derivatives of J(θ) with respect to θ0 through θn to 0 and solve the resulting equations for θ. The solution is the normal equation θ = (XᵀX)⁻¹Xᵀy.
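A minimal NumPy sketch of the normal equation; the design matrix X already includes the column of ones for x0, and the numbers are made up:

import numpy as np

X = np.array([[1.0, 2104.0, 5.0],
              [1.0, 1416.0, 3.0],
              [1.0,  852.0, 2.0]])
y = np.array([460.0, 232.0, 178.0])

# theta = (X^T X)^-1 X^T y, solved without forming the inverse explicitly
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)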

For matrix concepts, see Machine Learning - Week 1.

When to Use Gradient Descent or the Normal Equation

When n (the number of features) is large, the normal equation is slow because computing (XᵀX)⁻¹ is O(n³).

When n is small, the normal equation is faster because it solves for θ directly, with no iterations and no need for feature scaling.

What if XᵀX is non-invertible?

1. Redundant features (features that are not linearly independent).

e.g., x1 = size in feet²; x2 = size in m²

2. Too many features (e.g., m ≤ n).

For example, m = 10 and n = 100 means you have only 10 training examples but 100 features; the data is clearly not enough to constrain all of the features.

You can delete some features (keeping only the most relevant ones) or use regularization.
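As a sketch, NumPy's pseudoinverse (np.linalg.pinv) still returns a usable θ when XᵀX is singular; here x2 is a deliberately redundant copy of x1 converted from feet² to m², so the columns are linearly dependent:

import numpy as np

size_ft2 = np.array([1000.0, 1500.0, 2000.0])
size_m2  = size_ft2 * 0.0929    # redundant: an exact multiple of size_ft2

X = np.column_stack([np.ones(3), size_ft2, size_m2])
y = np.array([200.0, 280.0, 360.0])

# X.T @ X is singular, so use the pseudoinverse instead of a plain inverse.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print(theta)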

Exercises

1.

A question I had: how are the two methods used at the same time? Are they applied sequentially? The worked solution below applies them one after the other.

First, divide by the range:

Range = max - min = 8836 - 4761 = 4075

Dividing the vector by the range gives [1.9438, 1.2721, 2.1683, 1.1683].

Then apply mean normalization to the result:

AVG = 1.6382

Range = 2.1683-1.1683 = 1

x2(4) = (1.1683 - 1.6382) / 1 = -0.4699, which rounded to two decimal places is -0.47.
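A quick check of this arithmetic (small rounding differences from the values quoted above are expected):

import numpy as np

x_scaled = np.array([1.9438, 1.2721, 2.1683, 1.1683])   # the vector after dividing by the range

mu = x_scaled.mean()                    # about 1.6381
s = x_scaled.max() - x_scaled.min()     # 2.1683 - 1.1683 = 1
x_norm = (x_scaled - mu) / s

print(round(x_norm[3], 2))              # -0.47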

5.

As noted above, the smaller and more similar the ranges of the features, the less elongated the contours of the cost function, and the faster gradient descent converges. (More than one answer may be selected in this multiple-choice question.)
