Implementing the partial least squares method in Python 3

Source: Internet
Author: User

The scikit-learn (sklearn) library for Python 3 includes partial least squares regression, provided by the PLSRegression class in sklearn.cross_decomposition.

See the library documentation: http://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html

Program source: 78467689, modified to run on Python 3.

My Data a.csv

My Data ron.csv

# -*- coding: utf-8 -*-
# The original program appears to have been written for Python 2. Source: 78467689
# This version runs on Python 3. The main change is the data conversion before the
# leave-one-out predict() call (around lines 60 and 68 of the original file). I was
# not sure of the data type there, so I used the crudest fix: convert the array row
# to a list and wrap it in an outer []. It now runs without errors, but the results
# have not been verified and may not be correct.

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in old releases
from sklearn.cross_decomposition import PLSRegression  # the partial least squares implementation

# Import the data
A = np.loadtxt('A.csv', delimiter=',')      # A is the y matrix
print(A.shape)
RON = np.loadtxt('Ron.csv', delimiter=',')  # RON is the X matrix
print(RON.shape)

# Global variable
X_train_st_i = []


def error(y_predict, y_test):
    """Sum of squared errors between the predicted values and the test values."""
    errs = []
    for i in range(len(y_predict)):
        e = (y_predict[i] - y_test[i]) ** 2
        errs.append(e)
    return sum(errs)


# Partial least squares implementation
# Split into a training set and a test set (library function); the model is built on RON and A.
X_train, X_test, y_train, y_test = train_test_split(RON, A, test_size=0.5)
X_train_st = preprocessing.scale(X_train)  # standardize X (built-in function)
y_train_st = preprocessing.scale(y_train)  # standardize y (my addition; not used below)

n_components = 0  # number of components to keep
# Increase the number of components one at a time (1, 2, ..., up to the number of
# X columns, i.e. the number of features) until the accuracy criterion is met.
while n_components < X_train_st.shape[1]:
    n_components = n_components + 1

    # SS: error of the model fitted on the full training set.
    pls2 = PLSRegression(n_components=n_components)
    pls2.fit(X_train_st, y_train)          # arguments: training data, then target
    y_predict0 = pls2.predict(X_train_st)  # predictions for the training data itself
    SS = error(y_predict0, y_train)

    # PRESS: error of leave-one-out models, each fitted with one sample removed.
    y_predict1 = []
    for i in range(X_train_st.shape[0]):   # one pass per row of X_train_st
        n_components1 = n_components
        X_train_st1 = np.delete(X_train_st, i, 0)  # axis 0 = rows: delete row i of standardized X
        # Row i is deleted from y as well. Note that X is standardized here but y is
        # not; y_train_st is never used. This may be a mistake.
        y_train_st1 = np.delete(y_train, i, 0)

        # Only the number of components is passed in. The main parameters are:
        #   n_components: int (default 2), number of components to keep
        #   scale: boolean (default True), whether to standardize the data
        #   max_iter: int (default 500), maximum number of iterations for NIPALS
        #   tol: non-negative real (default 1e-06), iteration stopping tolerance
        #   copy: boolean (default True)
        pls2 = PLSRegression(n_components=n_components1)
        pls2.fit(X_train_st1, y_train_st1)

        # predict() needs a 2-D input such as [[1, 1, 1, 1, 1, 1]], so convert the
        # held-out row to a list and wrap it in an outer list. Empty the list first.
        X_train_st_i = []
        X_train_st_list = X_train_st[i].tolist()
        X_train_st_i.append(X_train_st_list)
        print('The X_train_st is', X_train_st_i)  # check that the format is correct

        # Predict y for the sample that was left out of this model.
        y_predict11 = pls2.predict(X_train_st_i)
        y_predict1.append(y_predict11)  # collect the predictions to compute the error

    PRESS = error(y_predict1, y_train)  # possible error here; see 78861578
    Qh = 1 - float(PRESS / SS)
    if Qh < 0.0985:  # accuracy requirement met: stop extracting components
        plt.figure(1)
        plt.scatter(y_predict0, y_train)  # fitted values against measured values
        plt.figure(2)
        plt.scatter(y_predict1, y_train)  # leave-one-out predictions against measured values
        print('The Qh is', Qh)
        print('The PRESS is', PRESS)
        print('The SS is', SS)
        break  # leave the while loop

# I do not fully understand the part below.
print('n_components is', n_components + 1)  # it is unclear why 1 is added here

SECs = []
errors = []
e = 100
for i in range(10):  # repeated testing
    # Split into a training set and a test set again.
    X_train, X_test, y_train, y_test = train_test_split(RON, A, test_size=0.5)
    X_test_st = preprocessing.scale(X_test)  # standardize the test data
    y_predict = pls2.predict(X_test_st)      # make predictions
    SECs.append(np.sqrt(error(y_predict, y_test) / (y_test.shape[0] - 1)))
    errors.append(float(error(y_predict, y_test)))
    if SECs[-1] < e:
        y_predict_min = y_predict
        y_test_min = y_test

# Print and plot the results
print('The predicted value is', y_predict.T)
print('The true value is', y_test)
print('The mean error is', float(np.mean(errors)))
print('The mean SEC is', float(np.mean(SECs)))
plt.figure(3)
plt.scatter(y_predict_min, y_test_min)
plt.show()
