Machine Learning in Python: Linear Regression


I. Outline

Normal equation method for linear regression

Locally weighted linear regression

II. Details

  1. Normal equation solution of linear regression

Linear regression predicts continuous-valued data. Only linear regression is discussed here; nonlinear regression is set aside for now. This part uses the normal equation solution, whose theory was explained earlier: θ = (XᵀX)⁻¹Xᵀy. Note that this equation requires the inverse of XᵀX, so it only applies when that inverse exists, which the code needs to check.

from numpy import *
import matplotlib.pyplot as plt

def loadDataSet(fileName):
    # Split each tab-separated line into features and a trailing label
    numFeat = len(open(fileName).readline().split('\t')) - 1
    dataMat = []; labelsVec = []
    file = open(fileName)
    for line in file.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelsVec.append(float(curLine[-1]))
    return dataMat, labelsVec

def standRegression(xArr, yArr):
    # Solve the normal equation: sigma = (X^T X)^-1 X^T y
    xMat = mat(xArr); yMat = mat(yArr)
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:
        # X^T X has no inverse, so the normal equation cannot be applied
        print('This matrix is singular, cannot do inverse')
        return
    sigma = xTx.I * (xMat.T * yMat.T)
    return sigma
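As a quick check, the two functions can be exercised as follows. This is a minimal usage sketch: the file name ex0.txt is a placeholder for whatever tab-separated data file you use, with a constant 1.0 in the first feature column.

# Usage sketch -- 'ex0.txt' is a placeholder file name, not part of the code above
xArr, yArr = loadDataSet('ex0.txt')
sigma = standRegression(xArr, yArr)   # 2x1 matrix of coefficients for 2-feature data
print(sigma)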

The loadDataSet() function splits the text data into features and labels. standRegression() uses the normal equation to find the regression coefficients sigma; of course, before using the normal equation we need to check that the inverse matrix exists. This solution is very simple, but it has the shortcomings I mentioned in the theory part. Let's look at the fitted result, plotted with the plotLine() function. Note that the parameters xMat and yMat passed to this function must be in matrix form.

def plotLine(xMat, yMat, sigma):
    # Scatter the raw data, then draw the fitted line on top
    ax = plt.subplot(111)
    ax.scatter(xMat[:,1].flatten().A[0], yMat.T[:,0].flatten().A[0])
    xCopy = xMat.copy()
    xCopy.sort(0)             # sort the points so the line is drawn left to right
    yHat = xCopy * sigma      # predicted values along the sorted x axis
    ax.plot(xCopy[:,1], yHat)
    plt.show()

We obtain the fitted line, but it looks somewhat underfit. Whatever the local structure of the data, standard regression always yields the same single straight best-fit line, and that is not what we want.
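One way to put a number on "how well it fits" is the correlation between the predicted and actual values. A small sketch using NumPy's corrcoef(), which is not part of the original code:

# Correlation of predictions with labels; closer to 1.0 means a tighter linear fit
yHat = mat(xArr) * sigma
print(corrcoef(yHat.T, mat(yArr)))    # the off-diagonal entries are the correlation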

Therefore, we improve the method by applying weights locally to the regression. This method is called locally weighted linear regression (LWLR).

  2. Locally weighted linear regression

In this algorithm, we give a certain weight to each point near the point to be predicted, and then run ordinary linear regression on that weighted data, minimizing the mean squared error. The normal equation becomes θ = (XᵀWX)⁻¹XᵀWy, where W is the weight matrix. LWLR uses a kernel to give higher weight to nearby points; the most common is the Gaussian kernel, w(i,i) = exp(−‖x⁽ⁱ⁾ − x‖² / (2k²)). This builds a weight matrix W with nonzero entries only on the diagonal, and the closer a point x⁽ⁱ⁾ is to the query point x, the larger w(i,i) becomes.
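To get a feel for what the bandwidth k controls, this illustrative sketch (not from the original article) prints the Gaussian weight for a few distances and k values; note how quickly a small k pushes the weight of all but the nearest points toward zero:

# Illustrative only: Gaussian weight exp(-d^2 / (2 k^2)) for a few distances d
for k in (1.0, 0.01, 0.003):
    for d in (0.0, 0.01, 0.1):
        print('k=%g  d=%g  weight=%.3g' % (k, d, exp(-d**2 / (2.0 * k**2))))
# With k=0.003, a point at distance 0.1 already has essentially zero weight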

def lwlr(testPoint, xArr, yArr, k=1.0):
    # Locally weighted linear regression: fit a regression around testPoint
    xMat = mat(xArr); yMat = mat(yArr).T
    m = shape(xMat)[0]
    weights = mat(eye(m))
    for i in range(m):
        # Gaussian kernel: weight decays with distance from testPoint
        diffMat = testPoint - xMat[i,:]
        weights[i,i] = exp(diffMat * diffMat.T / (-2.0 * k**2))
    xTwx = xMat.T * (weights * xMat)
    if linalg.det(xTwx) == 0.0:
        print('This matrix is singular, cannot do inverse')
        return
    sigma = xTwx.I * (xMat.T * (weights * yMat))
    return testPoint * sigma

def lwlrTest(testArr, xArr, yArr, k=1.0):
    # Run lwlr() for every point in testArr
    m = shape(testArr)[0]
    yHat = zeros(m)
    for i in range(m):
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)
    return yHat
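For a single query point, lwlr() returns the prediction directly. A minimal sketch, reusing xArr and yArr from the earlier sketch:

# Predict the label of the first training point with a fairly local kernel
print(yArr[0])                          # actual label
print(lwlr(xArr[0], xArr, yArr, 0.01))  # LWLR prediction at the same point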

The lwlr() function is the code for locally weighted linear regression, and lwlrTest() simply runs lwlr() over the entire dataset. Again, we draw a plot to see how well the result fits.

def plotLine1(testArr, xArr, yArr, k=1.0):
    # Scatter the data and draw the LWLR curve, sorted by the x value
    xMat = mat(xArr); yMat = mat(yArr)
    yHat = lwlrTest(testArr, xArr, yArr, k)
    srtInd = xMat[:,1].argsort(0)     # indices that sort the points by x
    xSort = xMat[srtInd][:,0,:]
    ax = plt.subplot(111)
    ax.scatter(xMat[:,1].flatten().A[0], yMat.T[:,0].flatten().A[0], s=2, c='red')
    ax.plot(xSort[:,1], yHat[srtInd])
    plt.show()
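The plots below were produced by calling plotLine1() with the training inputs themselves as the query points and varying k; a sketch of the calls, assuming the same xArr and yArr as above:

# One plot per bandwidth; smaller k lets the curve follow the data more closely
for k in (1.0, 0.01, 0.003):
    plotLine1(xArr, xArr, yArr, k)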

[Figures: LWLR fit when k = 1.0, k = 0.01, and k = 0.003]
k = 1.0 reproduces the underfitted state seen earlier, while k = 0.003 is overfitted, so k = 0.01 gives the better regression.
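To back this up with a number rather than a picture, one can compare the residual sum of squares (RSS) for each k. The rssError() helper below is an assumption for illustration, not part of the code above:

def rssError(yArr, yHatArr):
    # Residual sum of squares between actual and predicted values (assumed helper)
    return ((yArr - yHatArr)**2).sum()

for k in (1.0, 0.01, 0.003):
    yHat = lwlrTest(xArr, xArr, yArr, k)
    print('k=%g  training RSS=%.3f' % (k, rssError(array(yArr), yHat)))
# Training RSS alone keeps dropping as k shrinks -- that is the overfitting;
# comparing errors on held-out points is what shows k=0.01 generalizing best.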

Datasets and code: HTTP://PAN.BAIDU.COM/S/1I5AAYXN
