Linear regression --- least squares and linear regression in pattern recognition


Author: midu
qq: 1327706646
datetime: 2014-12-08 02:29

(1) Preface

Before this, my understanding of least squares had always been vague. Yesterday I watched the MIT linear algebra lecture on matrix projection and least squares, and it suddenly clicked: the instructor approaches the problem from the angle of equations and matrices, which gives a different understanding. The idea is actually very simple: given a set of scattered points, find the line that comes closest to them by minimizing the total distance. Because distances must be non-negative, and to make the algebra convenient, we square them; this is the least squares method. Afterwards I read about linear regression, and below I borrow a fellow blogger's data-processing post to expand on it.
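In symbols, for data points (x_i, y_i), the least-squares line minimizes the sum of squared vertical distances:

    \min_{w_0, w_1} \sum_i \left( y_i - (w_0 + w_1 x_i) \right)^2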

(2) Algorithm

Advantages and disadvantages of the algorithm:
    • Pros: results are easy to interpret, and the computation is not complex
    • Cons: does not fit nonlinear data well
    • Applicable data types: numeric and nominal values
Algorithm idea:

The weights are computed with the least squares method (the derivation is somewhat lengthy and omitted here). The advantage of this approach is that the calculation is simple, but it requires the data matrix X to have full column rank, and when the data dimensionality is high the computation is very slow; in that case consider gradient descent or stochastic gradient descent (the same idea as in logistic regression, only simpler) instead. The correlation coefficient is used to measure the quality of the fit.
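For reference, with data matrix X and target vector y, the closed-form least-squares solution (the normal equation, which standRegres below implements) is:

    \hat{w} = (X^T X)^{-1} X^T y

This exists only when X^T X is invertible, i.e. X has full column rank. As a minimal sketch of the gradient-descent alternative mentioned above (illustrative only; the learning rate and iteration count are assumptions, not from the original post):

import numpy as np

def gd_linreg(X, y, lr=0.01, iters=1000):
    # Batch gradient descent on the mean squared error ||Xw - y||^2 / m
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE
        w -= lr * grad
    return w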

Data Description:

Here the TXT file already contains the x0 values, i.e. a column of 1s at the front. Generally the data does not come with this column: when predicting y from x, we add an x0 column of 1s ourselves for computational convenience (it absorbs the intercept term).
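If the file did not include the x0 column, a minimal sketch of prepending it (the variable names and values are illustrative):

from numpy import ones, hstack, array

X = array([[1.5], [2.0], [3.2]])          # raw single-feature data
X = hstack([ones((X.shape[0], 1)), X])    # prepend the x0 = 1 bias column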

The data file used below is ex0.txt.

Functions:

loadDataSet(fileName)
    Reads the data.
standRegres(xArr, yArr)
    Ordinary linear regression using least squares.
plotStandRegres(xArr, yArr, ws)
    Plots the fitted line against the data.
calcCorrcoef(xArr, yArr, ws)
    Computes the correlation coefficient using numpy's built-in corrcoef.

Results: (figure omitted)

Locally weighted linear regression (LWLR) algorithm idea:

The idea is to give each training point near the prediction point a certain weight and then run ordinary least squares on the weighted data (ordinary linear regression minimizes the unweighted mean squared error). A "kernel" (similar to the kernels in support vector machines) assigns the nearest points the highest weight. The Gaussian kernel is used here:
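For a query point x and training point x_j, the weight matrix W is diagonal with entries (matching the lwlr code below):

    W_{jj} = \exp\left( -\frac{\lVert x_j - x \rVert^2}{2k^2} \right)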

Functions:

lwlr(testPoint, xArr, yArr, k=1.0)
    Computes the estimated value at testPoint from the formula below. k is a smoothing parameter: with k = 1 the result is close to ordinary linear regression on this data, and the smaller k is, the more closely the fit follows the data (too small risks overfitting). Solving the weighted least-squares problem yields the following formula:
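    \hat{w} = (X^T W X)^{-1} X^T W y

where W is the diagonal kernel-weight matrix defined above; the prediction at the query point is then x \hat{w}.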


lwlrTest(testArr, xArr, yArr, k=1.0)
    Because LWLR solves a separate weighted fit for every query point, this simply loops over the whole test set.
lwlrTestPlot(xArr, yArr, k=1.0)
    Draws the result as a plot.

Results: (figure omitted)

from numpy import *

def loadDataSet(fileName):
    # Number of features = columns in the file minus one (the last column is the label)
    numFeat = len(open(fileName).readline().split('\t')) - 1
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat

def standRegres(xArr, yArr):
    # Ordinary least squares via the normal equation: ws = (X^T X)^-1 X^T y
    xMat = mat(xArr)
    yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:
        print('This matrix is singular, cannot do inverse')
        return
    ws = xTx.I * (xMat.T * yMat)
    return ws

def plotStandRegres(xArr, yArr, ws):
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot([i[1] for i in xArr], yArr, 'ro')   # scatter of the raw data
    xCopy = xArr[:]
    xCopy.sort()
    yHat = mat(xCopy) * ws                      # predictions along the sorted x values
    ax.plot([i[1] for i in xCopy], yHat)
    plt.show()

def calcCorrcoef(xArr, yArr, ws):
    xMat = mat(xArr)
    yMat = mat(yArr)
    yHat = xMat * ws
    return corrcoef(yHat.T, yMat)   # correlation between predictions and targets

def lwlr(testPoint, xArr, yArr, k=1.0):
    # Locally weighted linear regression for a single query point
    xMat = mat(xArr); yMat = mat(yArr).T
    m = shape(xMat)[0]
    weights = mat(eye(m))
    for j in range(m):
        diffMat = testPoint - xMat[j, :]
        # Gaussian kernel: nearby points get weight close to 1
        weights[j, j] = exp(diffMat * diffMat.T / (-2.0 * k**2))
    xTx = xMat.T * (weights * xMat)
    if linalg.det(xTx) == 0.0:
        print('This matrix is singular, cannot do inverse')
        return
    ws = xTx.I * (xMat.T * (weights * yMat))
    return testPoint * ws

def lwlrTest(testArr, xArr, yArr, k=1.0):
    # One weighted fit per query point
    m = shape(testArr)[0]
    yHat = zeros(m)
    for i in range(m):
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)
    return yHat

def lwlrTestPlot(xArr, yArr, k=1.0):
    import matplotlib.pyplot as plt
    yHat = zeros(shape(yArr))
    xCopy = mat(xArr)
    xCopy.sort(0)                 # sort rows by x so the curve plots cleanly
    for i in range(shape(xArr)[0]):
        yHat[i] = lwlr(xCopy[i], xArr, yArr, k)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot([i[1] for i in xArr], yArr, 'ro')
    ax.plot(xCopy[:, 1], yHat)    # plot against the x column, not the bias column
    plt.show()
    #return yHat, xCopy

def rssError(yArr, yHatArr):      # yArr and yHatArr both need to be arrays
    return ((yArr - yHatArr)**2).sum()

def main():
    # regression
    xArr, yArr = loadDataSet('ex0.txt')
    ws = standRegres(xArr, yArr)
    print(ws)
    #plotStandRegres(xArr, yArr, ws)
    print(calcCorrcoef(xArr, yArr, ws))
    # lwlr
    lwlrTestPlot(xArr, yArr, k=1)

if __name__ == '__main__':
    main()
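As a quick usage sketch for comparing smoothing values with the functions above (the k values here are illustrative, not from the original post):

xArr, yArr = loadDataSet('ex0.txt')
for k in (1.0, 0.01, 0.003):               # larger k -> smoother fit
    yHat = lwlrTest(xArr, xArr, yArr, k)
    print(k, rssError(array(yArr), yHat))  # squared error on the training set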

(3) Stock-prediction competition practice based on BP neural networks, genetic algorithms, and Markov models

For example, many people now use BP neural networks, SVMs, HMMs, and other algorithms for this kind of task:

http://www.cnblogs.com/MrLJC/p/4147697.html

http://www.cnblogs.com/qq-star/p/4148138.html
