For the theory and formulas, see Andrew Ng's machine learning lectures on the NetEase open course platform, or his Machine Learning course on Coursera.
For multivariate linear regression, to fit the best line by minimizing the sum of squared errors, the textbook method is to take the partial derivatives of the cost with respect to each parameter, set them to zero, and solve the resulting system of linear equations (the normal equations).
But there are other ways to get there. Andrew presents two gradient descent variants for the large-sample and small-sample cases; I have implemented the first of them (batch gradient descent) here.
The step size (learning rate) is set to 0.03 and the initial parameter values to (0, 0). The data set is the SAS built-in Sashelp.class.
Because the variables are in different units, they are normalized in a feature scaling step.
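The scaling used below is range scaling: subtract the column mean, then divide by the column's range (max minus min). A minimal Python sketch, with made-up numbers:

```python
import numpy as np

def range_scale(col):
    """Center a column on its mean and divide by its range (max - min)."""
    mu, rng = col.mean(), col.max() - col.min()
    return (col - mu) / rng, mu, rng

# Hypothetical height values, just to show the transform.
heights = np.array([62.8, 69.0, 56.5, 65.3, 63.0])
scaled, mu, rng = range_scale(heights)
print(scaled.mean())  # centered, so the mean is ~0
```

The mean and range must be kept so the fitted parameters can be transformed back to the original units afterwards, which the code below does in its last step.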
The results agree with those obtained from PROC REG in SAS.
```sas
options fullstimer;
proc iml;
  reset deflib=Sashelp;
  use class;
  read all var {Weight Height} into me;
  m = nrow(me);

  * scaling feature;
  s1 = max(me[,1]) - min(me[,1]);
  s2 = max(me[,2]) - min(me[,2]);
  mean_s1 = mean(me[,1]);
  mean_s2 = mean(me[,2]);
  me[,1] = (me[,1] - mean_s1) / s1;
  me[,2] = (me[,2] - mean_s2) / s2;

  theta_0 = 0;
  theta_1 = 0;
  x0 = 1;
  ov = 10;
  alpha = 0.03;
  rec = 0;

  do while (ov > 0.000000001);
    theta_0old = theta_0;
    theta_1old = theta_1;

    * compute old residual and collect data to plot cost vs. iteration number;
    rec = rec + 1;
    r2 = 0;
    do i = 1 to m;
      residual_tt = (theta_0old*x0 + theta_1old*me[i,2]) - me[i,1];
      r2 = r2 + residual_tt*residual_tt;
    end;
    jtheta = r2/2/m;
    xy = xy // (rec || jtheta);

    * batch gradient step: both parameters updated from the old values;
    res   = 0;
    res_1 = 0;
    do i = 1 to m;
      residual_0 = (theta_0old*x0 + theta_1old*me[i,2]) - me[i,1];
      res   = res   + residual_0*x0;
      res_1 = res_1 + residual_0*me[i,2];
    end;
    theta_0 = theta_0old - alpha*res/m;
    theta_1 = theta_1old - alpha*res_1/m;

    * recompute the cost and decide whether it has converged;
    r2 = 0;
    do i = 1 to m;
      residual_tt = (theta_0*x0 + theta_1*me[i,2]) - me[i,1];
      r2 = r2 + residual_tt*residual_tt;
    end;
    jtheta_new = r2/2/m;
    ov = abs(jtheta_new - jtheta);
  end;

  print ov;
  call pgraf(xy, '*', 'x', 'y', 'mmmmm');

  * transform the parameters back to the original (unscaled) variables;
  theta_0_last = theta_0*s1 + mean_s1 - mean_s2*s1*theta_1/s2;
  theta_1_last = theta_1*s1/s2;
  print theta_0_last theta_1_last;
run; quit;
```
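The same loop can be mirrored in Python for readers without SAS. This is a hedged sketch, not the author's code: the (x, y) pairs are made-up numbers standing in for the Height/Weight columns of Sashelp.class, but the scaling, the batch update, the stopping rule on the change in cost, and the back-transformation follow the IML program above.

```python
import numpy as np

# Made-up (height, weight) pairs standing in for Sashelp.class.
x = np.array([57.3, 59.8, 62.5, 65.3, 66.5, 69.0, 72.0])   # predictor (Height)
y = np.array([83.0, 84.5, 102.5, 98.0, 112.0, 112.5, 150.0])  # response (Weight)

# Range-scale both variables, as in the IML code.
s1, s2 = y.max() - y.min(), x.max() - x.min()
m1, m2 = y.mean(), x.mean()
ys, xs = (y - m1) / s1, (x - m2) / s2

alpha, theta0, theta1 = 0.03, 0.0, 0.0
m = len(xs)
j_old = np.inf
while True:
    r = theta0 + theta1 * xs - ys            # residuals at the old parameters
    theta0 -= alpha * r.sum() / m            # simultaneous batch update
    theta1 -= alpha * (r * xs).sum() / m
    j_new = ((theta0 + theta1 * xs - ys) ** 2).sum() / (2 * m)
    if abs(j_old - j_new) < 1e-9:            # stop when the cost stops moving
        break
    j_old = j_new

# Undo the scaling to recover parameters on the original units.
b1 = theta1 * s1 / s2
b0 = theta0 * s1 + m1 - m2 * b1
print(b0, b1)
```

With the loose stopping tolerance on the cost change, the recovered slope and intercept land close to (but not bit-for-bit identical with) the closed-form least-squares solution, which is exactly the behavior to expect when comparing against PROC REG.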
Parameter estimation in SAS: a gradient descent implementation of the least-squares fit.