I. Installation

pip install hyperopt
II. Description

Hyperopt provides an optimization interface that accepts an evaluation function and a parameter space, and computes the value of the loss function at points within that space. The user also specifies the distribution of each parameter within the space.
Hyperopt has four important inputs: the function to minimize, the space to search, a database to store the sampled points in (a Trials object, optional), and the search algorithm (optional).
First, define an objective function that accepts a variable and returns a loss value, for example minimizing the function q(x, y) = x**2 + y**2.
Next, specify the search algorithm, which is the value of the algo argument of hyperopt's fmin function. Currently supported algorithms are random search (hyperopt.rand.suggest), simulated annealing (hyperopt.anneal.suggest), and the TPE algorithm (hyperopt.tpe.suggest).
For the parameter space, to optimize the function q above you can call fmin(q, space=hp.uniform('a', 0, 1)). The first argument of the hp.uniform function is the label; every parameter must have a unique label in the search space. hp.uniform specifies the distribution of the parameter. Other parameter distributions include:
hp.choice(label, options) returns one of the options, which can be a list or tuple. The options can themselves be nested expressions, which is how conditional parameters are formed.
hp.pchoice(label, p_options) returns one of p_options with a given probability, which lets the search weight the options unevenly.
hp.uniform(label, low, high) draws the parameter uniformly between low and high.
hp.quniform(label, low, high, q) gives the value round(uniform(low, high)/q)*q, suitable for discrete values.
hp.loguniform(label, low, high) draws exp(uniform(low, high)), so the value ranges over [exp(low), exp(high)].
hp.randint(label, upper) returns a random integer in the half-open interval [0, upper).
Search spaces can contain lists and dictionaries.
```python
from hyperopt import hp

list_space = [hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1)]
tuple_space = (hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1))
dict_space = {'a': hp.uniform('a', 0, 1), 'b': hp.loguniform('b', 0, 1)}
```
III. Simple example
```python
from hyperopt import hp, fmin, rand, tpe, space_eval

def q(args):
    x, y = args
    return x**2 - 2*x + 1 + y**2

space = [hp.randint('x', 5), hp.randint('y', 5)]
best = fmin(q, space, algo=rand.suggest, max_evals=10)
print(best)
```
Output:
{'x': 2, 'y': 0}
IV. Xgboost example
Xgboost has many parameters. Wrap the xgboost training code in a function, then pass that function to fmin for parameter optimization, using the cross-validated AUC as the optimization objective. The larger the AUC the better, but since fmin looks for a minimum, we minimize -AUC instead. The dataset used has 202 columns: the first column is the sample ID, the last column is the label, and the middle 200 columns are features.
```python
# coding: utf-8
import numpy as np
import pandas as pd
import xgboost as xgb
from random import shuffle
from functools import partial
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation in old versions
from hyperopt import fmin, tpe, hp

def loadFile(filename="E://zalei//browsetop200pca.csv"):
    data = pd.read_csv(filename, header=None)
    return data.values

data = loadFile()
label = data[:, -1]
attrs = data[:, :-1]
labels = label.reshape((1, -1))
label = labels.tolist()[0]
minmaxscaler = MinMaxScaler()
attrs = minmaxscaler.fit_transform(attrs)

# 70/30 train/test split on shuffled indices
index = list(range(0, len(label)))
shuffle(index)
train_index = index[:int(len(label) * 0.7)]
test_index = index[int(len(label) * 0.7):]
attr_train = attrs[train_index, :]
attr_test = attrs[test_index, :]
label_train = labels[:, train_index].tolist()[0]
label_test = labels[:, test_index].tolist()[0]
print(attr_train.shape, attr_test.shape, len(label_train), len(label_test))

def GBM(argsDict):
    # Map the integer samples from hp.randint onto the real parameter ranges.
    max_depth = argsDict["max_depth"] + 5                    # 5..19
    n_estimators = argsDict["n_estimators"] * 5 + 50         # 50..95
    learning_rate = argsDict["learning_rate"] * 0.02 + 0.05  # 0.05..0.15
    subsample = argsDict["subsample"] * 0.1 + 0.7            # 0.7..1.0
    min_child_weight = argsDict["min_child_weight"] + 1      # 1..5
    print("max_depth:" + str(max_depth))
    print("n_estimators:" + str(n_estimators))
    print("learning_rate:" + str(learning_rate))
    print("subsample:" + str(subsample))
    print("min_child_weight:" + str(min_child_weight))
    global attr_train, label_train
    gbm = xgb.XGBClassifier(nthread=4,                        # number of threads
                            max_depth=max_depth,              # maximum tree depth
                            n_estimators=n_estimators,        # number of trees
                            learning_rate=learning_rate,      # learning rate
                            subsample=subsample,              # row sampling fraction
                            min_child_weight=min_child_weight,
                            max_delta_step=10,                # cap on each leaf's weight update
                            objective="binary:logistic")
    metric = cross_val_score(gbm, attr_train, label_train,
                             cv=5, scoring="roc_auc").mean()
    print(metric)
    return -metric  # fmin minimizes, so return -AUC

space = {"max_depth": hp.randint("max_depth", 15),
         "n_estimators": hp.randint("n_estimators", 10),         # maps to 50,55,...,95
         "learning_rate": hp.randint("learning_rate", 6),        # maps to 0.05,0.07,...,0.15
         "subsample": hp.randint("subsample", 4),                # maps to 0.7,0.8,0.9,1.0
         "min_child_weight": hp.randint("min_child_weight", 5),  # maps to 1..5
         }
algo = partial(tpe.suggest, n_startup_jobs=1)
best = fmin(GBM, space, algo=algo, max_evals=4)
print(best)
print(GBM(best))
```
Detailed reference: http://blog.csdn.net/qq_34139222/article/details/60322995