Examples of random forest samples and classification targets
Notes:
1. The target has three or more classes (logistic classification covers only two).
2. The independent variable X is organized by row: each row is one sample.
3. The dependent variable y is a single column of labels: each value corresponds to one row of X.
4. Nothing else to note; the program follows.
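Items 2 and 3 can be illustrated with a small sketch before the full program. The toy arrays and the `check_shapes` helper below are hypothetical and not part of the original script. They only show the assumed layout: X is 2-D with one sample per row, and y is a flat 1-D array with one label per row of X.

```python
import numpy as np

# Hypothetical toy data: 4 samples (rows) with 2 features each.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
# One label per row of X, stored as a flat 1-D array.
y = np.array([0, 1, 2, 1])


def check_shapes(X, y):
    """Return True if X is 2-D, y is 1-D, and both have the same number of samples."""
    return X.ndim == 2 and y.ndim == 1 and X.shape[0] == y.shape[0]


print(X.shape, y.shape)
print(check_shapes(X, y))

# A column-shaped y, here of shape (4, 1), can be flattened with ravel().
y_column = y.reshape(-1, 1)
y_flat = y_column.ravel()
print(check_shapes(X, y_column), check_shapes(X, y_flat))
```

If the labels arrive as a column, for example from a single DataFrame column converted to a 2-D array, flattening them first is usually the simplest way to match this layout.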
# -*- coding: utf-8 -*-
"""
Created on Tue 17:40:04 2016
@author: administrator
"""
# -*- coding: utf-8 -*-
"""
Created on Tue 16:15:03 2016
@author: administrator
"""
# Random Forest Demo
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import classification_report
from sklearn.pipeline import Pipeline

if __name__ == '__main__':
    # Alternative data source, left disabled: load the samples from a CSV file.
    # df = pd.read_csv('ad.data', header=None)
    # explanatory_variable_columns = set(df.columns.values)
    # response_variable_column = df[len(df.columns.values) - 1]
    # # The last column describes the targets
    # explanatory_variable_columns.remove(len(df.columns.values) - 1)
    # y = [1 if e == 'ad.' else 0 for e in response_variable_column]
    # X = df[list(explanatory_variable_columns)]
    # X.replace(to_replace=r' *\?', value=-1, regex=True, inplace=True)

    # Each row of X is one sample with four binary features.
    X = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1],
                  [0, 1, 0, 0], [0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1],
                  [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1],
                  [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]])
    # y must be a 1-D (row) vector; a multi-row array raises an error.
    y = np.array([0, 1, 1, 0, 2, 1, 0, 0, 0, 2, 1, 0, 2, 1, 0, 0])

    X_train, X_test, y_train, y_test = train_test_split(X, y)

    pipeline = Pipeline([
        ('clf', RandomForestClassifier(criterion='entropy'))
    ])
    parameters = {
        'clf__n_estimators': (5, 10, 20, 50),
        'clf__max_depth': (50, 150, 250),
        # min_samples_split must be at least 2 in current scikit-learn.
        'clf__min_samples_split': (2, 3),
        'clf__min_samples_leaf': (1, 2, 3)
    }

    # y has three classes, so F1 needs an averaging mode.
    grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1, verbose=1,
                               scoring='f1_weighted')
    grid_search.fit(X_train, y_train)

    print('Best score: %0.3f' % grid_search.best_score_)
    print('Best parameters set:')
    best_parameters = grid_search.best_estimator_.get_params()
    for param_name in sorted(parameters.keys()):
        print('\t%s: %r' % (param_name, best_parameters[param_name]))

    predictions = grid_search.predict(X_test)
    print(classification_report(y_test, predictions))
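The grid search above is scored with F1, and with three target classes F1 has to be averaged across classes in some way. The sketch below uses hypothetical labels, not results from the program, to show how `sklearn.metrics.f1_score` behaves with `average='macro'`, `average='weighted'`, and `average=None`. The matching scoring strings for a grid search are `'f1_macro'` and `'f1_weighted'`.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical true and predicted labels for a three-class problem.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 2, 1, 1, 0])

# One F1 value per class, in label order 0, 1, 2.
per_class = f1_score(y_true, y_pred, average=None)
# Unweighted mean of the per-class F1 values.
macro = f1_score(y_true, y_pred, average='macro')
# Mean of the per-class F1 values weighted by each class's support.
weighted = f1_score(y_true, y_pred, average='weighted')

print(per_class)
print(macro, weighted)
```

Macro averaging treats every class equally, while weighted averaging favours the more frequent classes. On a small, unbalanced target like the one in the program, the two can differ noticeably.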