Digit Recognizer with LightGBM

Source: Internet
Author: User
Tags: comments, xgboost

I entered the Kaggle Digit Recognizer competition with LightGBM and XGBoost separately, and used GridSearchCV to tune the parameters, mainly max_depth, learning_rate, and n_estimators. The best score reached was 0.9747.
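
To make the cost of such a search concrete, here is a minimal sketch of what a grid search enumerates. The grid values and cv=3 are the ones used in the LightGBM tuning code in this post; the helper name expand_grid is made up for illustration. GridSearchCV performs an equivalent expansion internally and fits every candidate once per fold.

```python
from itertools import product

# Grid values taken from the LightGBM tuning code in this post.
param_grid = {
    "learning_rate": [0.1, 0.25, 0.3],
    "n_estimators": [75, 80, 85, 90],
    "max_depth": [6, 7, 8, 9],
}


def expand_grid(grid):
    """Return every parameter combination as a list of dicts (hypothetical helper)."""
    names = sorted(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


candidates = expand_grid(param_grid)
cv_folds = 3  # cv=3, as in the post
total_fits = len(candidates) * cv_folds

print(len(candidates), "candidates,", total_fits, "model fits")
# prints: 48 candidates, 144 model fits
```

With the default refit=True, GridSearchCV adds one more fit of the best candidate on the full data, so adding a fourth parameter or more values per parameter multiplies the running time quickly.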

My ability is limited, and I don't yet know how to tune the parameters further.


Also, I could not get GridSearchCV to work with XGBoost. If any expert knows how, please let me know.

The LightGBM code:

#!/usr/bin/python
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split, GridSearchCV

# Specify your configurations as a dict
params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'multiclass',
    'num_class': 10,
    'verbose': 0,
    'metric': 'multi_logloss',
    'max_bin': 255,
    'max_depth': 7,
    'learning_rate': 0.3,
    'nthread': 4,
    # 'n_estimators': value missing in the source; num_boost_round below
    # sets the number of boosting rounds
    # 'feature_fraction': 0.8
}


def train_model(model_file='model/lgb'):
    print("Load data ...")
    dataset = pd.read_csv("data/train.csv", header=0)
    d_x = dataset.iloc[:, 1:].values  # pixel columns
    d_y = dataset.iloc[:, 0].values   # label column

    # Hold out a third of the training data for validation.
    train_x, test_x, train_y, test_y = train_test_split(
        d_x, d_y, test_size=0.33, random_state=42)
    lgb_train = lgb.Dataset(train_x, label=train_y)
    lgb_eval = lgb.Dataset(test_x, label=test_y, reference=lgb_train)

    print("Begin train ...")
    # Stop when the validation loss has not improved for 10 rounds
    # (older LightGBM releases also accept early_stopping_rounds=10 here).
    bst = lgb.train(params, lgb_train,
                    valid_sets=[lgb_eval],
                    num_boost_round=160,
                    callbacks=[lgb.early_stopping(10)])
    print("Train end\nsaving ...")
    bst.save_model(model_file)
    return bst


def create_submission():
    # get model
    bst = train_model()
    # load test data
    test_df = pd.read_csv("data/test.csv", header=0)
    xg_test = test_df.iloc[:, :].values
    print("predicting ...")
    pred = bst.predict(xg_test)  # shape (n_samples, 10): one probability per class
    print("predict end.")

    # create csv file
    print("Create submission file ...")
    # The most probable class is the predicted digit.
    pred = np.argmax(pred, axis=1)
    submission = pd.DataFrame({'ImageId': range(1, len(pred) + 1),
                               'Label': [int(x) for x in pred]})
    # submission.to_csv("submission.csv", index=False)
    np.savetxt('submission.csv',
               np.c_[np.arange(1, len(pred) + 1), pred],
               delimiter=',', header='ImageId,Label', comments='', fmt='%d')
    print("----end----")


def tune_model():
    print("Load data ...")
    dataset = pd.read_csv("data/train.csv", header=0)
    d_x = dataset.iloc[:, 1:].values
    d_y = dataset.iloc[:, 0].values

    print("Create classifier ...")
    param_grid = {
        # "reg_alpha": [0.3, 0.7, 0.9, 1.1],
        "learning_rate": [0.1, 0.25, 0.3],
        'n_estimators': [75, 80, 85, 90],
        'max_depth': [6, 7, 8, 9]
    }
    # Best values found so far (this dict is not passed to the classifier below).
    params = {
        'objective': 'multiclass',
        'metric': 'multi_logloss',
        'max_bin': 255,
        'max_depth': 7,
        'learning_rate': 0.25,
        # 'n_estimators': value garbled in the source
    }
    # max_depth = 7, learning_rate: 0.25
    model = lgb.LGBMClassifier(boosting_type='gbdt', objective="multiclass",
                               nthread=8, seed=42)
    model.n_classes = 10
    print("Run grid search ...")
    searcher = GridSearchCV(estimator=model, param_grid=param_grid, cv=3)
    searcher.fit(d_x, d_y)
    # Mean cross-validated score of every candidate
    # (grid_scores_ in scikit-learn releases before 0.20).
    print(searcher.cv_results_['mean_test_score'])
    print("=" * 40)  # separator width is not legible in the source; 40 is arbitrary
    print(searcher.best_params_)
    print("=" * 40)
    print(searcher.best_score_)
    print("end")


if __name__ == "__main__":
    # create_submission()
    tune_model()
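
With the multiclass objective, predict returns one probability per class for each image, while the submission needs a single digit per image. A minimal sketch of that conversion follows, using made-up probability rows and hypothetical function names. It also shows why a rounding-based conversion such as sum(i * round(p)) is fragile: it yields 0 whenever no class probability exceeds 0.5.

```python
def label_by_argmax(probs):
    """Index of the largest probability, i.e. the most likely digit."""
    return max(range(len(probs)), key=lambda i: probs[i])


def label_by_rounding(probs):
    """Rounding-based conversion: only correct when one class has p > 0.5."""
    return int(sum(i * round(p) for i, p in enumerate(probs)))


# Made-up probability rows for illustration (each sums to 1).
confident = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0]
uncertain = [0.05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.45, 0.1, 0.4]

print(label_by_argmax(confident), label_by_rounding(confident))  # 3 3
print(label_by_argmax(uncertain), label_by_rounding(uncertain))  # 7 0
```

On a NumPy array of predictions the same argmax step is np.argmax(pred, axis=1).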


And the XGBoost code:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
import xgboost as xgb
import sklearn.preprocessing as sp
from sklearn.model_selection import GridSearchCV

params = {
    "objective": "multi:softmax",
    "eta": 0.25,
    'max_depth': 7,
    'silent': 1,  # newer XGBoost releases use 'verbosity': 0 instead
    'nthread': 4,
    'num_class': 10,
}


def train_model():
    print("Load data ...")
    dataset = pd.read_csv("data/train.csv", header=0)
    train_X = dataset.iloc[:, 1:].values  # pixel columns
    train_Y = dataset.iloc[:, 0].values   # label column
    xg_train = xgb.DMatrix(train_X, label=train_Y)
    print("Begin train ...")
    bst = xgb.train(params, xg_train, 10)  # 10 boosting rounds
    print("Train end\nsaving ...")
    bst.save_model("model/bst")
    return bst


def create_submission():
    test_df = pd.read_csv("data/test.csv", header=0)
    xg_test = xgb.DMatrix(test_df.iloc[:, :].values)
    bst = train_model()
    print("predicting ...")
    pred = bst.predict(xg_test)  # multi:softmax returns class labels directly
    print("predict end.")

    # create csv file
    print("Create submission file ...")
    submission = pd.DataFrame({'ImageId': range(1, len(pred) + 1),
                               'Label': [int(x) for x in pred]})
    # submission.to_csv("submission.csv", index=False)
    np.savetxt('submission.csv',
               np.c_[np.arange(1, len(pred) + 1), pred],
               delimiter=',', header='ImageId,Label', comments='', fmt='%d')
    print("----end----")


def tune_parameters():
    # NOTE: this is the search that does not work yet. XGBClassifier expects
    # a 1-D array of class labels, so fitting on the binarized (100, 10)
    # matrix below is the likely cause.
    print("Load data ...")
    dataset = pd.read_csv("data/train.csv", header=0)
    train_X = dataset.iloc[:100, 1:].values
    train_Y = dataset.iloc[:100, :1].values
    xg_train = xgb.DMatrix(train_X, label=train_Y)
    param_grid = {'learning_rate': [0.1, 0.4]}
    print("Create classifier ...")
    model = xgb.XGBClassifier(max_depth=6, learning_rate=0.1, n_estimators=10,
                              silent=True, objective="multi:softmax",
                              seed=36, nthread=8)
    searcher = GridSearchCV(estimator=model, param_grid=param_grid,
                            scoring='roc_auc', cv=3)
    # train_Y = [sum(x) for x in train_Y]
    train_Y = sp.label_binarize(train_Y, classes=list(range(0, 10)))
    # print(train_Y.shape, train_X.shape)
    # print(train_Y[66, 9])
    print("Fitting ...")
    searcher.fit(train_X, train_Y)
    print(searcher.cv_results_['mean_test_score'],
          searcher.best_params_, searcher.best_score_)
    print("End ...")


if __name__ == "__main__":
    tune_parameters()







