What is regression?
The word "regression" was coined by Darwin's cousin Francis Galton. Galton performed his first regression in 1877, predicting the size of the next generation of pea seeds (the children) from the size of the previous generation (the parents).
Galton applied regression analysis to many kinds of data, including human height. He noted that if parents were taller than average, their children also tended to be taller than average, but not as tall as their parents (I suspect Galton did not consider that improving material conditions raise overall height; at least, it seems to me that people born in the 1980s and 1990s are generally taller than their parents). The children's height "regresses" toward the average height.
Software testing methodology also has the regression test, which ensures that existing features still work correctly and have not been affected by changes.
Finding the best fit line with linear regression
The goal of regression is to predict a numeric target value. The most straightforward approach is to obtain a regression equation, into which a known input x is substituted to get the predicted result y.
When people say regression they usually mean linear regression: the output is obtained by multiplying each input by a constant and adding the results together.
How do we find the regression equation? The main step is to find the regression coefficients; once we have them, we can construct the regression equation.
Assuming that the input data is stored in the matrix X and the regression coefficients are stored in the vector w, the predicted result for a given piece of data $X_1$ is $y_1 = X_1^T w$.
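To make the prediction step concrete, here is a minimal sketch using NumPy. The coefficient and input values are made up purely for illustration, and the leading 1.0 in each input is an assumed constant term that multiplies the intercept.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
w = np.array([3.0, 1.7])    # regression coefficients: intercept and slope
x1 = np.array([1.0, 0.5])   # one input; the leading 1.0 multiplies the intercept

# Prediction for a single input: x1^T w = 3.0 * 1.0 + 1.7 * 0.5
y1_hat = x1 @ w

# Predictions for several inputs at once: one row of X per input.
X = np.array([[1.0, 0.5],
              [1.0, 2.0]])
y_hat = X @ w

print(y1_hat, y_hat)
```

Stacking the inputs as rows of X lets a single matrix product produce every prediction at once.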
So how do we find w from a known set of x and y values? A common method is to find the w that minimizes the error, where the error is the difference between the predicted y value and the true y value. Simply adding up these errors does not work, because positive and negative differences cancel each other out, so we use the squared error instead:
$$\sum_{i=1}^{m} \left( y_i - x_i^T w \right)^2$$
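In matrix notation this can also be written as $(y - Xw)^T (y - Xw)$. Taking the derivative with respect to w gives, up to a constant factor, $X^T (y - Xw)$. Setting it equal to 0 and solving for w yields:
$$\hat{w} = (X^T X)^{-1} X^T y$$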
The hat on w indicates that this is the best estimate of w that can be obtained from the current data.
Note that the formula above contains $(X^T X)^{-1}$, which requires inverting a matrix. The equation therefore applies only if the inverse exists.
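The following sketch illustrates both the solution and the case where the inverse does not exist. The function name `normal_equation` and the data are hypothetical. The rank check is an assumption of this sketch rather than the post's method: it stands in for testing whether the determinant equals zero, since floating-point determinants are rarely exactly zero.

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) w = X^T y; return None when X^T X has no inverse."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xTx = X.T @ X
    # Assumption of this sketch: detect singularity through the rank.
    if np.linalg.matrix_rank(xTx) < xTx.shape[0]:
        return None
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(xTx, X.T @ y)

# Made-up data lying exactly on y = 1 + 2*x (first column is the constant 1).
X_ok = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
w_ok = normal_equation(X_ok, y)

# The second column duplicates the first, so X^T X is singular.
X_bad = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
w_bad = normal_equation(X_bad, y)

print(w_ok, w_bad)
```

Duplicated or linearly dependent feature columns are a typical reason for the inverse not to exist.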
The matrix solution above is known as ordinary least squares (OLS). There are also other ways to solve for w.
The code for finding w:
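As one example of another route to w, not taken from the original text, NumPy's `np.linalg.lstsq` solves the same least-squares problem without explicitly inverting a matrix. The sketch below compares it with the matrix formula on made-up data.

```python
import numpy as np

# Made-up data roughly following y = 1 + 2*x (first column is the constant 1).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Matrix formula: w = (X^T X)^{-1} X^T y
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Library least-squares solver; it also reports the rank of X.
w_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(w_normal, w_lstsq, rank)
```

Both approaches should agree here because X has full column rank; the library solver remains usable when it does not.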
# asmatrix is imported under the name mat used throughout this code
from numpy import asmatrix as mat, linalg

def loadDataSet(fileName):
    # General function to parse tab-delimited floats
    with open(fileName) as fr:
        lines = fr.readlines()
    numFeat = len(lines[0].split('\t')) - 1  # number of feature fields
    dataMat = []; labelMat = []
    for line in lines:
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))  # last field is the target value
    return dataMat, labelMat

def standRegres(xArr, yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:  # check whether the inverse exists
        print("This matrix is singular, cannot do inverse")
        return
    ws = xTx.I * (xMat.T * yMat)
    return ws
Locally weighted linear regression
One problem with linear regression is that it tends to underfit the data, because it computes the unbiased estimate with the minimum mean squared error. Clearly, a model that underfits will not give the best predictions. Some methods therefore allow some bias to be introduced into the estimate in order to reduce the mean squared error of the prediction.
One of these methods is locally weighted linear regression (LWLR). In this algorithm we give a weight to each data point near the point to be predicted, and then perform ordinary regression by minimum mean squared error on this subset. Like kNN, the algorithm has to select the corresponding subset of data for every prediction. The algorithm solves for the regression coefficients w in the following form:
$$\hat{w} = (X^T W X)^{-1} X^T W y$$
where W is a matrix that assigns weights to each data point.
LWLR uses a "kernel" to give higher weights to nearby points. The type of kernel can be chosen freely; the most commonly used one is the Gaussian kernel, whose weights (in the form used by the code in this post) are as follows:
$$w(i,i) = \exp\left( \frac{\lVert x^{(i)} - x \rVert^2}{-2k^2} \right)$$
The above formula contains a parameter k that needs to be specified by the user, which determines how much weight is given to nearby points.
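To see how k controls the weighting, here is a small sketch that computes Gaussian kernel weights for a few made-up points. The helper name `gaussian_weights` is hypothetical, and the squared distance follows the form used in the LWLR code in this post.

```python
import numpy as np

def gaussian_weights(test_point, X, k):
    """Gaussian kernel weight of each row of X relative to test_point."""
    diff = np.asarray(X, dtype=float) - np.asarray(test_point, dtype=float)
    sq_dist = np.sum(diff ** 2, axis=1)   # squared distance to the test point
    return np.exp(sq_dist / (-2.0 * k ** 2))

# Made-up one-dimensional data; the test point coincides with the middle row.
X = np.array([[0.0], [0.5], [1.0]])
for k in (1.0, 0.1):
    print(k, gaussian_weights([0.5], X, k))
```

With a large k the three points receive similar weights, while with a small k almost all of the weight sits on the point closest to the test point.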
Implementation code for LWLR:
# asmatrix is imported under the name mat used throughout this code
from numpy import asmatrix as mat, linalg, shape, eye, exp, zeros

def lwlr(testPoint, xArr, yArr, k=1.0):
    xMat = mat(xArr); yMat = mat(yArr).T
    m = shape(xMat)[0]
    weights = mat(eye(m))
    for j in range(m):  # next 2 lines create the weights matrix
        diffMat = testPoint - xMat[j, :]
        weights[j, j] = exp((diffMat * diffMat.T)[0, 0] / (-2.0 * k**2))
    xTx = xMat.T * (weights * xMat)
    if linalg.det(xTx) == 0.0:  # check whether the inverse exists
        print("This matrix is singular, cannot do inverse")
        return
    ws = xTx.I * (xMat.T * (weights * yMat))
    return mat(testPoint) * ws  # 1x1 matrix holding the prediction

def lwlrTest(testArr, xArr, yArr, k=1.0):
    # Loops over all the data points and applies lwlr to each one
    m = shape(testArr)[0]
    yHat = zeros(m)
    for i in range(m):
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)[0, 0]
    return yHat
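The effect of k on the prediction can be illustrated with a self-contained sketch. It is an array-based variant written for this example, not the code above: the name `lwlr_predict`, the synthetic sine data, and the chosen k values are all assumptions, and the singularity check is omitted for brevity.

```python
import numpy as np

def lwlr_predict(test_point, X, y, k=1.0):
    """Locally weighted prediction at test_point with a Gaussian kernel."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    test_point = np.asarray(test_point, dtype=float)
    sq_dist = np.sum((X - test_point) ** 2, axis=1)
    w_diag = np.exp(sq_dist / (-2.0 * k ** 2))   # diagonal of the weight matrix
    xTWx = X.T @ (w_diag[:, None] * X)           # X^T W X
    xTWy = X.T @ (w_diag * y)                    # X^T W y
    ws = np.linalg.solve(xTWx, xTWy)
    return test_point @ ws

# Synthetic, noise-free data on a curve that a single straight line cannot fit.
x = np.linspace(0.0, 1.0, 21)
X = np.column_stack([np.ones_like(x), x])   # constant column plus the feature
y = np.sin(2.0 * np.pi * x)

test = np.array([1.0, 0.25])                # true value there is sin(pi/2) = 1
pred_global = lwlr_predict(test, X, y, k=100.0)  # near-uniform weights
pred_local = lwlr_predict(test, X, y, k=0.05)    # only close neighbours count

print(pred_global, pred_local)
```

With a very large k every point gets almost the same weight, so the result behaves like ordinary linear regression and underfits the curve; a small k follows the local shape of the data more closely.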