A Python template for single-label prediction: multiple classifiers, per-label support output, and converting CSV strings to float

Last Update:2018-07-31 Source: Internet

Author: User

Each prediction is one of the labels 1 to 11.

First load the data: training data, training labels, test data, and test labels:

if __name__ == "__main__":
    importtraincontentdata()
    importtestcontentdata()
    importtrainlabeldata()
    importtestlabeldata()

import xlrd  # needed to read the .xls label files

traindata = []
testdata = []
trainlabel = []
testlabel = []

def importtraincontentdata():
    file = 'F:/goverment/myfinalcode/train_big.csv'
    ls = []
    with open(file) as fo:
        for line in fo:
            # Turn tabs, line endings and quotes into commas, then split.
            line = line.replace("\t", ",")
            line = line.replace("\n", ",")
            line = line.replace("\"", ",")
            ls.append(line.split(","))
    for i in ls:
        li = []
        for j in i:
            if j == '':
                continue  # skip the empty fields left by the replacements
            li.append(float(j))
        traindata.append(li)

def importtestcontentdata():
    file = 'F:/goverment/myfinalcode/test_big.csv'
    ls = []
    with open(file) as fo:
        for line in fo:
            line = line.replace("\t", ",")
            line = line.replace("\n", ",")
            line = line.replace("\"", ",")
            ls.append(line.split(","))
    for i in ls:
        li = []
        for j in i:
            if j == '':
                continue
            li.append(float(j))
        testdata.append(li)

# Import the training and test labels (one category per record)
def importtrainlabeldata():
    file = 'F:/goverment/myfinalcode/train_big_label.xls'
    wb = xlrd.open_workbook(file)
    ws = wb.sheet_by_name("Sheet1")
    for r in range(ws.nrows):
        col = []
        for c in range(1):  # only the first column holds the label
            col.append(ws.cell(r, c).value)
        trainlabel.append(col)

def importtestlabeldata():
    file = 'F:/goverment/myfinalcode/test_big_label.xls'
    wb = xlrd.open_workbook(file)
    ws = wb.sheet_by_name("Sheet1")
    for r in range(ws.nrows):
        col = []
        for c in range(1):
            col.append(ws.cell(r, c).value)
        testlabel.append(col)

The training data and the test data are CSV files, so every value is read as a str. Each row is converted to float and collected into a list (li), and every li is then appended to traindata or testdata. Because a CSV is separated by ",", characters such as "\t" are first replaced with ",", and then

ls.append(line.split(",")) puts the split fields into ls. At that point they are still of type str, so I convert them to float.
I later found that the conversion does not have to happen at this point; it may also be done later.
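The string-to-float step described above can also be sketched with the standard csv module instead of chained replace calls. This is only a sketch: the function name load_float_rows and the sample input are made up for illustration, and unlike the loader above it skips rows that end up empty.

```python
import csv
import io


def load_float_rows(lines):
    """Turn an iterable of CSV text lines into a list of float rows."""
    # Treat tabs as separators too, as the article's loader does.
    normalized = (line.replace("\t", ",") for line in lines)
    rows = []
    for fields in csv.reader(normalized):
        # csv.reader removes quotes and line endings; drop empty fields.
        row = [float(field) for field in fields if field.strip() != ""]
        if row:
            rows.append(row)
    return rows


# Hypothetical sample: quoted value, a tab separator, an empty field, a blank line.
sample = io.StringIO('1,2.5,"3"\n4\t5,,6\n\n')
print(load_float_rows(sample))
# → [[1.0, 2.5, 3.0], [4.0, 5.0, 6.0]]
```

Passing an open file object instead of the StringIO works the same way, since both yield one text line at a time.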

After that I tried a variety of classifiers. For parameter tuning, see
http://scikit-learn.org/stable/supervised_learning.html#supervised-learning
Then pick the classifier that gives the best accuracy.

    # (continues inside the __main__ block; disabled attempts are kept in ''' ... ''' strings)
    '''
    # 19%
    from sklearn import neighbors
    knn = neighbors.KNeighborsClassifier(n_neighbors=75, leaf_size=51, weights='distance', p=2)
    knn.fit(traindata, trainlabel)
    predict = knn.predict(testdata)
    '''
    '''
    # This one does not work
    from sklearn.neural_network import MLPClassifier
    import numpy as np
    traindata = np.array(traindata)  # TypeError: cannot perform reduce with flexible type
    traindata = traindata.astype(float)
    trainlabel = np.array(trainlabel)
    trainlabel = trainlabel.astype(float)
    testdata = np.array(testdata)
    testdata = testdata.astype(float)
    model = MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto',
                          beta_1=0.9, beta_2=0.999, early_stopping=False,
                          epsilon=1e-08, hidden_layer_sizes=(5, 2),
                          learning_rate='constant', learning_rate_init=0.001,
                          max_iter=200, momentum=0.9, nesterovs_momentum=True,
                          power_t=0.5, random_state=1, shuffle=True, solver='lbfgs',
                          tol=0.0001, validation_fraction=0.1, verbose=False,
                          warm_start=False)
    model.fit(traindata, trainlabel)
    predict = model.predict(testdata)
    '''
    '''
    # 19%
    from sklearn.tree import DecisionTreeClassifier
    model = DecisionTreeClassifier(class_weight='balanced', max_features=68,
                                   splitter='best', random_state=5)
    model.fit(traindata, trainlabel)
    predict = model.predict(testdata)

    # This one does not work
    from sklearn.naive_bayes import MultinomialNB
    clf = MultinomialNB(alpha=0.052).fit(traindata, trainlabel)
    # clf.fit(traindata, trainlabel)
    predict = clf.predict(testdata)
    '''
    '''
    # 17%
    from sklearn.svm import SVC
    # Note: newer scikit-learn releases expect decision_function_shape to be 'ovo' or 'ovr'.
    clf = SVC(C=150, kernel='rbf', degree=51, gamma='auto', coef0=0.0,
              shrinking=False, probability=False, tol=0.001, cache_size=300,
              class_weight=None, verbose=False, max_iter=-1,
              decision_function_shape=None, random_state=None)
    clf.fit(traindata, trainlabel)
    predict = clf.predict(testdata)
    '''
    '''
    # 0.5%
    from sklearn.naive_bayes import GaussianNB
    import numpy as np
    gnb = GaussianNB()
    traindata = np.array(traindata)  # TypeError: cannot perform reduce with flexible type
    traindata = traindata.astype(float)
    trainlabel = np.array(trainlabel)
    trainlabel = trainlabel.astype(float)
    testdata = np.array(testdata)
    testdata = testdata.astype(float)
    predict = gnb.fit(traindata, trainlabel).predict(testdata)
    '''
    '''
    # 16%
    from sklearn.naive_bayes import BernoulliNB
    import numpy as np
    gnb = BernoulliNB()
    traindata = np.array(traindata)  # TypeError: cannot perform reduce with flexible type
    traindata = traindata.astype(float)
    trainlabel = np.array(trainlabel)
    trainlabel = trainlabel.astype(float)
    testdata = np.array(testdata)
    testdata = testdata.astype(float)
    predict = gnb.fit(traindata, trainlabel).predict(testdata)
    '''
    from sklearn.ensemble import RandomForestClassifier
    # Build a random forest multi-class classifier
    forest = RandomForestClassifier(n_estimators=500, random_state=5, warm_start=False,
                                    min_impurity_decrease=0.0, min_samples_split=15)
    predict = forest.fit(traindata, trainlabel).predict(testdata)
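Rather than commenting classifiers in and out by hand, the comparison can be sketched as a loop over candidates that share the fit/predict interface. The two toy classifiers below are stand-ins written only so the example runs without extra packages, and the tiny data set is made up; scikit-learn estimators expose the same fit/predict methods, so they could be placed in the same dictionary. Keep in mind that picking the winner by test-set accuracy, as done here and in the article, tends to give an optimistic number; a separate validation split is the more careful choice.

```python
from collections import Counter


class MajorityClassifier:
    """Toy stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class OneNearestNeighbor:
    """Toy stand-in: predicts the label of the closest training row."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            self.y_[min(range(len(self.X_)), key=lambda i: sq_dist(row, self.X_[i]))]
            for row in X
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def compare_classifiers(candidates, X_train, y_train, X_test, y_test):
    """Fit every candidate and return a {name: test accuracy} dictionary."""
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_test, model.predict(X_test))
    return scores


# Hypothetical data: two well-separated groups labelled 1 and 2.
X_train = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.2, 0.1]]
y_train = [1, 1, 2, 2, 1]
X_test = [[0.1, 0.1], [5.1, 5.0], [4.8, 5.3]]
y_test = [1, 2, 2]

scores = compare_classifiers(
    {"majority": MajorityClassifier(), "1-nn": OneNearestNeighbor()},
    X_train, y_train, X_test, y_test,
)
best = max(scores, key=scores.get)
print(scores, "best:", best)
```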

Print the accuracy. I also write the predictions to a TXT file, which makes them easier to analyze.

    s = len(predict)
    f = open('F:/goverment/myfinalcode/predict.txt', 'w')
    for i in range(s):
        f.write(str(predict[i]))
        f.write('\n')
    f.write("it's all written.")
    f.close()
    k = 0
    print(s)
    for i in range(s):
        # testlabel[i] is a one-element list such as [3.0]
        if testlabel[i] == predict[i]:
            k = k + 1
    print("The accuracy is:", k * 1.0 / s)
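The same write-and-score step can be sketched as a small function. The names flatten_labels and save_and_score and the sample values are hypothetical; the sketch unwraps the one-element label rows produced by the loaders first, so the comparison is between plain values, and it writes to a temporary file rather than the path used above.

```python
import os
import tempfile


def flatten_labels(labels):
    """Unwrap one-element rows such as [[3.0], [7.0]] into [3.0, 7.0]."""
    return [row[0] if isinstance(row, (list, tuple)) else row for row in labels]


def save_and_score(predictions, true_labels, path):
    """Write one prediction per line to path and return the accuracy."""
    true_flat = flatten_labels(true_labels)
    with open(path, "w") as out:  # the with-block closes the file
        for value in predictions:
            out.write(f"{value}\n")
    hits = sum(1 for t, p in zip(true_flat, predictions) if t == p)
    return hits / len(predictions)


# Hypothetical predictions and labels: three of the four match.
demo_path = os.path.join(tempfile.gettempdir(), "predict_demo.txt")
acc = save_and_score([1.0, 3.0, 2.0, 2.0], [[1.0], [3.0], [5.0], [2.0]], demo_path)
print("The accuracy is:", acc)
# → The accuracy is: 0.75
```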

The next step is to output the support (the predicted probability) for every label.

    print("I'm going to start outputting the support level.")
    attribute_proba = forest.predict_proba(testdata)
    # print(forest.predict_proba(testdata))  # output the probability of each label
    print(type(attribute_proba))
    import xlwt
    myexcel = xlwt.Workbook()
    sheet = myexcel.add_sheet('sheet')
    si = -1  # row index
    sj = -1  # column index
    for i in attribute_proba:
        si = si + 1
        for j in i:
            sj = sj + 1
            sheet.write(si, sj, str(j))
        sj = -1
    myexcel.save("Attribute_proba_small.xls")
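A grid of bare probabilities is hard to read without knowing which column belongs to which label. As a sketch only, the table can be written with a header row; this version swaps xlwt for the standard csv module so it runs without extra packages, and the function name and sample values are made up. With scikit-learn, the columns of predict_proba follow the order of the fitted estimator's classes_ attribute, so forest.classes_ would be the natural header.

```python
import csv
import io


def write_probability_table(out, classes, probability_rows):
    """Write a header of class labels, then one row of probabilities per record."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(classes)
    for row in probability_rows:
        writer.writerow([f"{p:.4f}" for p in row])


# Hypothetical labels and probabilities for two records.
buffer = io.StringIO()
write_probability_table(buffer, [1, 2, 3], [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]])
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open(path, "w", newline="") instead of the StringIO buffer.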

The results of the operation are as follows:

But that is not enough: I also want to output the label number and the support of the top 3 predictions for each record.
I defined a class Attri, in which key holds the label number and weight holds the support.
The predicted probabilities (supports) of each record are then traversed 3 times. Each pass finds the largest probability, stores its number and probability, and sets the value to 0 so that the next pass finds the next largest. After 3 passes the results are saved and written to Excel.

    '''Next, output the numbers of the three labels with the largest probability in each group'''
    class Attri:
        def __init__(self):
            self.key = 0
            self.weight = 0.0
    label = []
    for i in attribute_proba:
        lis = []
        k = 0
        while k < 3:
            k = k + 1
            p = 1
            mm = 0
            sj = -1
            for j in i:
                sj = sj + 1
                if j > mm:
                    mm = j
                    p = sj
            # Does the index start from 1? I wrote i[p-1] at first, but debugging showed that was wrong.
            # p is a 0-based column index; forest.classes_[p] gives the actual label.
            i[p] = 0
            a = Attri()
            a.key = p
            a.weight = mm
            lis.append(a)
        label.append(lis)
    print('pick a few outputs')
    import xlwt
    myexcel = xlwt.Workbook()
    sheet = myexcel.add_sheet('sheet')
    si = -2
    sj = -1
    for i in label:
        si = si + 2  # each record takes two rows: numbers, then supports
        for j in i:
            sj = sj + 1
            sheet.write(si, sj, str(j.key))
            sheet.write(si + 1, sj, str(j.weight))
        sj = -1
    myexcel.save("Proba_big.xls")
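The three-pass search can also be sketched with a single sort per record. This variant leaves the probabilities untouched instead of overwriting them with 0, and it returns the label itself rather than the column index. The function name top_k_labels and the sample values are made up; with scikit-learn, the classes argument would be the fitted estimator's classes_ attribute, whose order matches the columns of predict_proba.

```python
def top_k_labels(probabilities, classes, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    # Sort column indices by probability, largest first; the input is not modified.
    order = sorted(
        range(len(probabilities)),
        key=lambda idx: probabilities[idx],
        reverse=True,
    )
    return [(classes[idx], probabilities[idx]) for idx in order[:k]]


classes = [1, 2, 3, 4, 5]             # hypothetical label values
row = [0.05, 0.40, 0.10, 0.30, 0.15]  # hypothetical probabilities for one record
print(top_k_labels(row, classes))
# → [(2, 0.4), (4, 0.3), (5, 0.15)]
```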

The results of the operation are as follows:

Self-study is really hard. These are the results of my learning, and the accuracy can still be improved. If this helped you, please give it a like.

This article is an English version of an article originally written in Chinese on aliyun.com.