This series deals with the algorithm questions that interviewers ask in job interviews, so it also gives a brief introduction to each algorithm; common follow-up questions about these algorithms will be added later.
I. Logistic regression
First, logistic regression: based on the existing data, it fits a regression formula for the classification boundary and uses that boundary to classify. Its computational cost is low and it is easy to implement and understand, but it is prone to underfitting, so its classification accuracy may not be very high.
Logistic regression can be viewed as a probability estimate that uses the sigmoid function. The training data are used to learn the parameters [w1, w2, ..., wn]; for a new sample, h(x) is computed from the learned parameters and compared with 0.5 (greater than 0.5 means one class, less means the other). The key question now is: how do we learn the parameters from the training data?
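As a minimal sketch of this decision rule (the function names here are illustrative, not from the original post):

```python
import math

def sigmoid(z):
    # The sigmoid squashes any real number into (0, 1),
    # which lets us read h(x) as a probability estimate.
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, weights):
    # h(x) = sigmoid(w . x); predict class 1 if the probability
    # exceeds 0.5, class 0 otherwise.
    z = sum(wi * xi for wi, xi in zip(weights, x))
    return 1 if sigmoid(z) > 0.5 else 0
```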
1. The gradient ascent method
To find the maximum of a function, search along the direction of its gradient. Start by giving all weights the same value, compute the training error from the predictions and the known labels, update the weights accordingly, and keep iterating until a stopping condition is met. One caveat: this method is only practical on fairly small data sets, because every weight update uses the entire data set, which makes it slow.
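A small illustrative implementation of batch gradient ascent for logistic regression (assuming NumPy, labels in {0, 1}, and a data matrix with one row per sample; names and hyperparameters are hypothetical):

```python
import numpy as np

def grad_ascent(X, y, alpha=0.01, iters=500):
    """Batch gradient ascent for logistic regression.
    X: (m, n) data matrix, y: (m,) labels in {0, 1}."""
    m, n = X.shape
    w = np.ones(n)                         # start with equal weights
    for _ in range(iters):
        h = 1.0 / (1.0 + np.exp(-(X @ w)))  # predictions for ALL m samples
        error = y - h                       # vector of training errors
        w = w + alpha * (X.T @ error)       # full-data-set update (the slow part)
    return w
```

Note that each iteration touches every row of X, which is exactly why this version does not scale to large data sets.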
2. The stochastic gradient ascent method improves on the problem above. Unlike gradient ascent, which updates the weights using the entire data set, stochastic gradient ascent uses a single data point per update, so the computation involves almost no vector operations: h(x) and the error, which were vectors before, both become single numbers, and the speed improves greatly. However, because each update is driven by one sample, the method can fluctuate locally, which affects the precision of the result.
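A sketch of one pass of stochastic gradient ascent, showing how h(x) and the error become plain numbers (illustrative code, not from the post):

```python
import numpy as np

def stoc_grad_ascent(X, y, alpha=0.01):
    """One pass of stochastic gradient ascent: each update uses a single
    sample, so h and the error are scalars rather than vectors."""
    m, n = X.shape
    w = np.ones(n)
    for i in range(m):
        h = 1.0 / (1.0 + np.exp(-np.dot(X[i], w)))  # scalar prediction
        error = y[i] - h                             # scalar error
        w = w + alpha * error * X[i]                 # update from one data point
    return w
```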
3. The fluctuation problem above can be reduced with two changes: first, shrink the step size alpha at each iteration; second, pick the sample used for each update at random.
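Both fixes can be sketched as follows. The decay formula 4/(1+j+i) + 0.01 is one common convention (it keeps alpha from ever reaching zero), not the only reasonable choice; names are illustrative:

```python
import numpy as np

def stoc_grad_ascent_improved(X, y, num_iter=150, seed=0):
    """Stochastic gradient ascent with the two fixes described above:
    a step size alpha that shrinks over time, and samples picked at random."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    w = np.ones(n)
    for j in range(num_iter):
        idx = list(range(m))
        for i in range(m):
            alpha = 4.0 / (1.0 + j + i) + 0.01    # decaying step size, never 0
            k = idx.pop(rng.integers(len(idx)))   # pick a remaining sample at random
            h = 1.0 / (1.0 + np.exp(-np.dot(X[k], w)))
            w = w + alpha * (y[k] - h) * X[k]
    return w
```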
II. Linear regression
The goal of regression is to predict a numeric target value. The simplest approach is to derive a formula for the target from the training data; linear regression fits a straight line to the data to predict the target value. The goal is to find the regression coefficients w.
Ordinary least squares gives the closed-form result w = (X^T X)^{-1} X^T y.
Of course, the result involves a matrix inverse, so it is necessary to verify that X^T X is actually invertible!
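A minimal sketch of the least-squares solution, including the invertibility check (function name is illustrative):

```python
import numpy as np

def stand_regres(X, y):
    """Ordinary least squares: w = (X^T X)^{-1} X^T y.
    Checks that X^T X is invertible before taking the inverse."""
    xtx = X.T @ X
    if np.linalg.det(xtx) == 0.0:
        raise ValueError("X^T X is singular; cannot invert (consider ridge regression)")
    return np.linalg.inv(xtx) @ X.T @ y
```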
III. Locally weighted linear regression
One of the more serious problems with linear regression is underfitting (because it computes the unbiased estimate with minimum mean squared error). Locally weighted linear regression (LWLR) improves on this by giving each training point near the prediction point a weight; the resulting estimate is w = (X^T W X)^{-1} X^T W y, where W is the diagonal matrix of sample weights.
This approach increases the precision of the fit, but it also increases the amount of computation: every prediction requires computing the distance from the query point to all training samples.
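A sketch of LWLR at a single query point. The Gaussian kernel is an assumption on my part (it is the usual choice for the weights, controlled by a bandwidth k), and the names are illustrative:

```python
import numpy as np

def lwlr(x0, X, y, k=1.0):
    """Locally weighted linear regression at query point x0.
    Each training sample gets a Gaussian weight based on its distance to x0,
    so every prediction touches the whole training set (the cost noted above)."""
    m = X.shape[0]
    W = np.eye(m)
    for i in range(m):
        diff = x0 - X[i]
        W[i, i] = np.exp(-(diff @ diff) / (2.0 * k ** 2))  # Gaussian kernel weight
    xtwx = X.T @ W @ X
    if np.linalg.det(xtwx) == 0.0:
        raise ValueError("X^T W X is singular")
    w = np.linalg.inv(xtwx) @ X.T @ W @ y
    return x0 @ w   # prediction at x0
```

Smaller k makes the fit more local (and more prone to overfitting); larger k approaches plain linear regression.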
Sections II and III both require the inverse to exist, but what if the data has more features than sample points, so that X^T X has no inverse? Ridge regression solves this problem by adding a penalty term, converting the solution to w = (X^T X + λI)^{-1} X^T y; otherwise the approach is similar to the one above.
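The ridge solution can be sketched directly from the formula (λ is a regularization constant you choose; the name and default are illustrative):

```python
import numpy as np

def ridge_regres(X, y, lam=0.2):
    """Ridge regression: w = (X^T X + lam*I)^{-1} X^T y.
    The lam*I term makes the matrix invertible even when there are
    more features than samples."""
    n = X.shape[1]
    return np.linalg.inv(X.T @ X + lam * np.eye(n)) @ X.T @ y
```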
There is also a method called forward stepwise regression: at each step it increases or decreases one weight by a small amount, then recomputes w and the error; if the error is smaller, it keeps the updated w.
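The stepwise procedure just described can be sketched as follows (step size eps and iteration count are arbitrary choices; squared error is assumed as the error measure):

```python
import numpy as np

def stage_wise(X, y, eps=0.01, num_it=1000):
    """Forward stepwise regression: at each step, try nudging every weight
    up or down by eps; keep a change only if it lowers the squared error."""
    m, n = X.shape
    w = np.zeros(n)
    for _ in range(num_it):
        cur_err = np.sum((y - X @ w) ** 2)
        best_err, best_w = cur_err, w
        for j in range(n):
            for sign in (-1, 1):
                w_test = w.copy()
                w_test[j] += eps * sign
                err = np.sum((y - X @ w_test) ** 2)
                if err < best_err:        # update w only if the error got smaller
                    best_err, best_w = err, w_test
        w = best_w
    return w
```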
For a detailed introduction to regression, refer to: http://www.cnblogs.com/jerrylead/archive/2011/03/05/1971867.html
(By the way, the CSDN editor has been glitchy: after editing, a line sometimes gets cut off halfway.)
Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.
Machine Learning Algorithms Interview Dictation (5): Regression