Machine Learning Day 13: Linear Regression (Machine Learning in Action)

Source: Internet
Author: User

Over the last two days I have been studying regression. There was supposed to be a real-world case study, but the website the data had to be crawled from could not be reached, so it could not be completed. Forward stepwise regression, a simplified version of the Lasso, was also skipped; only ridge regression is covered here.

The difference between regression and classification: in regression the labels are continuous values, and the task is to predict the label value at unknown points; in classification the labels are categories, and the task is to predict which class other samples belong to.

Calculation of correlation coefficients:

By the Cauchy-Schwarz inequality, the correlation coefficient satisfies |r| <= 1, i.e. -1 <= r <= 1. The closer r is to 1 or -1, the better the relationship between X and Y can be described by a straight line; r > 0 means X and Y are positively correlated, and r < 0 means they are negatively correlated.
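As a quick illustration (a minimal NumPy sketch, not code from the book), the correlation coefficient between two sequences can be computed with corrcoef:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])   # roughly y = x, so r should be close to 1

r = np.corrcoef(x, y)[0, 1]               # corrcoef returns the 2x2 correlation matrix
print(r)                                  # close to 1: x and y are strongly positively correlated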

The fitted line is y = Xw. The intercept term is w0 * x0, where x0 is not a real attribute: it is a constant column fixed to 1.0.

The criterion for measuring a fitted line is the sum of squared errors between each point's actual y value and the value the line predicts: sum over i of (y_i - x_i^T w)^2.

Taking the derivative with respect to w and setting it to zero gives the closed-form solution w = (X^T X)^(-1) X^T y.

The matrix inverse is required here, and it does not always exist: the necessary and sufficient condition for a matrix to be invertible is that its determinant is nonzero, which NumPy can check with linalg.det(). NumPy also has a corresponding function for the matrix inverse (linalg.inv(), or the .I attribute of a matrix).
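A minimal sketch of that check with NumPy (illustrative values, not from the book):

import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 3.0]])

if np.linalg.det(a) == 0.0:        # a determinant of 0 means the matrix is singular
    print('matrix is singular, cannot invert')
else:
    print(np.linalg.inv(a))        # NumPy's function for the matrix inverse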

Some details will be written later.

Step1:

Read simple data:

from numpy import *
import matplotlib.pyplot as plt

# Read simple tab-delimited data: each row is the feature values followed by the label
def loadDataSet(filename):
    featureNum = len(open(filename).readline().split('\t')) - 1
    dataSet = []
    labels = []
    f = open(filename)
    for i in f.readlines():
        line = []
        l = i.strip().split('\t')
        for j in range(featureNum):
            line.append(float(l[j]))
        dataSet.append(line)
        labels.append(float(l[-1]))
    return dataSet, labels
Step2:

Visualize the points in the dataset:

# Plot the data points of ex0
def drawPoint(dataSet, labels):
    xTemp = array(dataSet)
    xx = xTemp[:, 1]            # the second column is the real feature (the first is the constant 1.0)
    yy = array(labels)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xx, yy, c='green')
    fig.show()

You can see that the points are basically distributed along a straight line, and it seems we can find a straight line that roughly fits the data.

Step3:

Use the least-squares method to compute the optimal coefficients and draw the fitted line.

# Compute w by least squares: w = (X^T X)^(-1) X^T y
def getW(x, y):
    xMat = mat(x)
    yMat = mat(y).T
    xT = xMat.T
    tempX = xT * xMat
    if linalg.det(tempX) == 0.0:    # a singular matrix has no inverse
        print('error')
        return
    w = tempX.I * xT * yMat
    return w

# Draw the linear fitted line
def drawBestLine(x, y, w):
    tempX = array(x)
    tempY = array(y)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(tempX[:, 1], tempY, c='red')
    xCopy = tempX.copy()
    xCopy.sort(0)                   # sort by x so the fitted line is drawn left to right
    yHat = mat(xCopy) * mat(w)
    ax.plot(xCopy[:, 1], yHat)
    fig.show()
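Assuming the tab-delimited file ex0.txt from "Machine Learning in Action" is in the working directory (the file name is taken from the book's sample data), the functions above could be used like this:

dataSet, labels = loadDataSet('ex0.txt')
drawPoint(dataSet, labels)          # Step2: scatter plot of the raw points
w = getW(dataSet, labels)           # Step3: least-squares coefficients
drawBestLine(dataSet, labels, w)    # Step3: plot the fitted straight line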

The resulting fitted line:

It can be seen that the fitting effect is satisfactory.

But we can also dig more potential value out of the data: a fit that stays closer to the data is not necessarily a straight line.

This introduces locally weighted linear regression, abbreviated LWLR.

Based on the distance between the test point and each point in the training set, every training point is assigned a different weight; training points near the test point get higher weights and therefore play a larger role in the fit.

The weights are assigned with a Gaussian kernel, w(i, i) = exp(-|x_i - x|^2 / (2 * k^2)), as implemented in the code below. Two things affect a weight: the distance between the training point and the test point, and the parameter k. The larger k is, the more slowly the weight decays with distance, so more points influence the final fit and the result tends toward ordinary linear regression. The smaller k is, the fewer points influence each local fit, so each fit uses only a small neighborhood; stitching many such local fits together yields a curve that follows the training data very closely, which is not necessarily good.
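To get a feel for how k changes the weights, here is a tiny standalone sketch (the distance 0.1 and the k values are just illustrative):

from numpy import exp

def gaussWeight(d, k):
    # Gaussian kernel weight of a training point at distance d from the test point
    return exp(-d ** 2 / (2.0 * k ** 2))

for k in (1.0, 0.1, 0.01):
    print(k, gaussWeight(0.1, k))
# k = 1.0  -> weight ~ 0.995 (far points still matter, result close to plain linear regression)
# k = 0.1  -> weight ~ 0.61
# k = 0.01 -> weight ~ 0     (only very close points matter, the fit becomes very local)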

Let's look at this difference by trying 3 different values of k.

Step4:

Use LWLR to draw 3 fitted lines with different values of k:

# Locally weighted linear regression (prediction for a single test point)
def lwlr(data, l, t, k=1.0):
    testPoint = mat(t)
    dataMat = mat(data)
    labelsMat = mat(l).T
    m = shape(dataMat)[0]
    w = mat(eye(m))                  # diagonal weight matrix, one weight per training point
    for i in range(m):
        diffMat = dataMat[i] - testPoint
        w[i, i] = exp(diffMat * diffMat.T / (-2.0 * k ** 2))   # Gaussian kernel weight
    tempMat = dataMat.T * (w * dataMat)
    if linalg.det(tempMat) == 0.0:
        print('ERROR')
        return
    weights = tempMat.I * dataMat.T * w * labelsMat
    return testPoint * weights

# Apply LWLR to a whole test set
def testLwlr(data, l, testData, k=1.0):
    m = shape(testData)[0]
    yMat = zeros(m)
    for i in range(m):
        yMat[i] = lwlr(data, l, testData[i], k)
    return yMat

# Plot the LWLR regression line (not necessarily a straight line)
def drawLwlr(d, label, k):
    data = array(d)
    l = array(label)
    tempX = data[:, 1]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(tempX, l, c='red')
    temp = data.copy()
    temp.sort(0)                     # sort so the curve is drawn left to right
    tempY = testLwlr(data, l, temp, k)
    ax.plot(temp[:, 1], tempY)
    plt.show()
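A usage sketch for the three comparisons below (again assuming the book's ex0.txt data):

dataSet, labels = loadDataSet('ex0.txt')
for k in (1.0, 0.01, 0.003):        # the three k values compared below
    drawLwlr(dataSet, labels, k)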
The fitted line when k = 1:

The fitted line for k = 0.01 is already a fairly smooth curve:

The fitted line for k = 0.003 is already appalling; although it fits the training data well, it is badly overfitted:

We have seen two linear regression methods above, but there is a special case: when the number of features is larger than the number of samples, X^T X is not invertible, so neither of the methods above can be applied. The following introduces a new method: ridge regression.

Ridge regression can handle this situation. Its formula is similar to the least-squares formula above and is described in "Machine Learning in Action": a penalty coefficient lambda that we set ourselves is added, giving w = (X^T X + lambda * I)^(-1) X^T y. We take 30 different values of lambda and observe how w changes.

Step5:

Ridge regression:

# Ridge regression: standardize the data, then compute w for 30 different lambda values
def ridgeRegression(data, l):
    xMat = mat(data)
    yMat = mat(l).T
    yMean = mean(yMat, 0)
    yMat = yMat - yMean              # center the labels
    xMean = mean(xMat, 0)
    xVar = var(xMat, 0)
    xMat = (xMat - xMean) / xVar     # standardize the features
    # take the w of 30 ridge regressions with different lambda values
    cycle = 30
    wMat = zeros((cycle, shape(xMat)[1]))
    for i in range(cycle):
        tempW = ridgeGetW(xMat, yMat, exp(i - 10))
        wMat[i, :] = tempW.T
    return wMat

# Compute the ridge regression w: w = (X^T X + lambda * I)^(-1) X^T y
def ridgeGetW(x, y, lam=0.2):
    tempX = x.T * x
    temp = tempX + lam * eye(shape(x)[1])
    if linalg.det(temp) == 0.0:
        print('ERROR')
        return
    w = temp.I * x.T * y
    return w

After computing the weights for the 30 different lambda values, we visualize them:
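A minimal sketch of how the weight values in wMat could be plotted (the file name abalone.txt is an assumption borrowed from the book's ridge regression data; any regression data loaded with loadDataSet would do):

dataSet, labels = loadDataSet('abalone.txt')   # assumed file name
wMat = ridgeRegression(dataSet, labels)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(wMat)        # one curve per feature; the x-axis index i corresponds to log(lambda) = i - 10
plt.show()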

The horizontal axis is the log of the lambda value we set. When the value is very small, the result is similar to ordinary linear regression; when the value is set very large, the weights slowly shrink to 0, the features no longer play any role, and the fit suffers.

We can also use this plot to find which features have a large influence on the predicted value and which have a small one, and we can discard the features with small weights to make later processing easier.

That is the basic knowledge of regression. Forward stepwise regression and the Lasso were not covered, and the case study failed because the website could not be reached.

Let's summarize the regression workflow:

We can first find the important features, then visualize the data, observe its distribution, and then decide how to make predictions.

You can choose simple linear regression; if you want to dig out more of the data's potential value, use LWLR with an appropriately chosen k.

When the number of features is greater than the number of samples, use ridge regression: try several different lambda settings and use cross-validation to pick the optimal w as the fitted weights.
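A rough sketch of that idea (a single hold-out split rather than full cross-validation; the 90/10 split, the helper name pickBestLambda, and the reuse of ridgeGetW are assumptions, not the book's crossValidation code):

# Pick the best lambda index with a simple hold-out split (a sketch, not the book's code)
def pickBestLambda(data, labels, testRatio=0.1):
    m = len(data)
    cut = int(m * (1 - testRatio))
    xTrain, yTrain = mat(data[:cut]), mat(labels[:cut]).T
    xTest, yTest = mat(data[cut:]), mat(labels[cut:]).T

    # standardize with the *training* statistics, mirroring ridgeRegression above
    xMean, xVar, yMean = mean(xTrain, 0), var(xTrain, 0), mean(yTrain, 0)
    xTrainStd = (xTrain - xMean) / xVar
    xTestStd = (xTest - xMean) / xVar

    errors = []
    for i in range(30):
        w = ridgeGetW(xTrainStd, yTrain - yMean, exp(i - 10))
        yPred = xTestStd * w + yMean            # undo the label centering
        errors.append(sum(square(yPred - yTest)))
    return argmin(errors)                       # index i of the best lambda = exp(i - 10)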

I did not study this chapter well enough; tomorrow I will study CART.
