The coding problems in Homework 4 are mainly about ridge regression, with several cross-validation setups added on top.
Since ridge regression has an analytic solution, directly inverting the matrix is enough and the process is not complicated; the only tricky parts show up when doing cross-validation.
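For reference, the closed-form solution the code below implements is the standard regularized least-squares formula, with $Z$ the feature matrix after a constant column is prepended:

$$ w_{\text{reg}} = (Z^T Z + \lambda I)^{-1} Z^T y $$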
# encoding=utf8
import numpy as np


# read input data (train or test): last column is the label
def read_input_data(path):
    x, y = [], []
    for line in open(path).readlines():
        items = line.strip().split()
        x.append([float(v) for v in items[:-1]])
        y.append(float(items[-1]))
    return np.array(x), np.array(y)


# closed-form ridge regression: w = (X'X + lambda*I)^(-1) X'y
def calculate_w_ridge_regression(x, y, lam):
    z_inv = np.linalg.inv(np.dot(x.transpose(), x) + lam * np.eye(x.shape[1]))
    return np.dot(np.dot(z_inv, x.transpose()), y)


# 0/1 error of sign(w'x) against the labels y
def calculate_E(w, x, y):
    scores = np.dot(w, x.transpose())
    predicts = np.where(scores >= 0, 1.0, -1.0)
    return np.sum(predicts != y) / float(predicts.shape[0])


if __name__ == '__main__':
    # prepare train and test data (prepend the constant feature x0 = 1)
    x, y = read_input_data("train.dat")
    x = np.hstack((np.ones(x.shape[0]).reshape(-1, 1), x))
    test_x, test_y = read_input_data("test.dat")
    test_x = np.hstack((np.ones(test_x.shape[0]).reshape(-1, 1), test_x))

    # candidate lambdas: 10^2, 10^1, ..., 10^-10
    lambda_set = list(range(2, -11, -1))

    ## q13~q15: choose lambda by Ein/Eout on the full training set
    min_ein, min_eout, target_lambda = 1, 1, 2
    for lam in lambda_set:
        w = calculate_w_ridge_regression(x, y, pow(10, lam))
        ein = calculate_E(w, x, y)
        eout = calculate_E(w, test_x, test_y)
        if eout < min_eout:
            target_lambda, min_ein, min_eout = lam, ein, eout
    # print(min_ein, min_eout, target_lambda)

    ## q16~q18: first 120 examples for training, the rest for validation
    min_etrain, min_eval, min_eout, target_lambda = 1, 1, 1, 2
    split = 120
    for lam in lambda_set:
        w = calculate_w_ridge_regression(x[:split], y[:split], pow(10, lam))
        etrain = calculate_E(w, x[:split], y[:split])
        e_val = calculate_E(w, x[split:], y[split:])
        eout = calculate_E(w, test_x, test_y)
        if e_val < min_eval:
            target_lambda = lam
            min_etrain, min_eval, min_eout = etrain, e_val, eout
    # print(min_etrain, min_eval, min_eout, target_lambda)
    # retrain on the whole training set with the chosen lambda
    w = calculate_w_ridge_regression(x, y, pow(10, target_lambda))
    optimal_ein = calculate_E(w, x, y)
    optimal_eout = calculate_E(w, test_x, test_y)
    # print(optimal_ein, optimal_eout)

    ## q19~q20: V-fold cross-validation
    min_ecv, target_lambda = 1, 2
    V = 5
    fold = x.shape[0] // V
    v_range = [[i * fold, (i + 1) * fold] for i in range(V)]
    for lam in lambda_set:
        total_ecv = 0
        for i in range(V):
            # training folds: every fold except fold i
            train_idx = [k for j in range(V) if j != i
                         for k in range(v_range[j][0], v_range[j][1])]
            train_x, train_y = x[train_idx], y[train_idx]
            # validation fold i
            val_x = x[v_range[i][0]:v_range[i][1]]
            val_y = y[v_range[i][0]:v_range[i][1]]
            w = calculate_w_ridge_regression(train_x, train_y, pow(10, lam))
            total_ecv += calculate_E(w, val_x, val_y)
        print("total ecv: " + str(total_ecv))
        if min_ecv > total_ecv / float(V):
            min_ecv = total_ecv / float(V)
            target_lambda = lam
    print(min_ecv)
    print(target_lambda)
    # retrain on all training data with the lambda picked by cross-validation
    w = calculate_w_ridge_regression(x, y, pow(10, target_lambda))
    ein = calculate_E(w, x, y)
    eout = calculate_E(w, test_x, test_y)
    print(ein)
    print(eout)
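One implementation note: explicitly inverting Z'Z + λI is fine for data this small, but solving the linear system is the numerically safer habit. A minimal sketch of that variant (the name solve_w_ridge is mine, not part of the assignment):

import numpy as np

def solve_w_ridge(x, y, lam):
    # solve (X'X + lam*I) w = X'y rather than forming the inverse explicitly
    A = np.dot(x.transpose(), x) + lam * np.eye(x.shape[1])
    return np.linalg.solve(A, np.dot(x.transpose(), y))

It is a drop-in replacement for calculate_w_ridge_regression above.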
Here's a question that came up along the way:
When linear regression was introduced, the matrix $X^T X$ was said to be "probably, but not necessarily, invertible." It is certainly a real symmetric matrix, but what does that have to do with being positive definite?
Here, on the other hand, $Z^T Z$ is positive semi-definite, yet adding $\lambda$ times the identity matrix supposedly makes it positive definite. Where does that come from?
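For what it's worth, the standard one-line argument answers both questions. For any nonzero vector $v$,

$$ v^T (Z^T Z + \lambda I)\, v = \|Zv\|^2 + \lambda \|v\|^2 > 0 \quad (\lambda > 0), $$

so $Z^T Z + \lambda I$ is positive definite and hence invertible. Without the $\lambda I$ term we only get $v^T Z^T Z v = \|Zv\|^2 \ge 0$, with equality whenever $Zv = 0$ (possible when the columns of $Z$ are linearly dependent). That is exactly why $X^T X$ in plain linear regression is real symmetric and positive semi-definite but not necessarily invertible.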
"Job 4" heights Field machine learning Cornerstone