I. Installation

pip install hyperopt
II. Description

Hyperopt provides an optimization interface that accepts an evaluation function and a parameter space, and computes the value of the loss function at points within that space. The user also specifies the distribution of each parameter within the space.
Hyperopt has four important inputs: the function to minimize, the space to search, a database to store the sampled points in (a Trials object, optional), and the search algorithm (optional).
First, define an objective function that accepts a variable and returns a loss value, for example minimizing the function q(x, y) = x**2 + y**2.
Next, specify the search algorithm, which is the value of the algo argument of hyperopt's fmin function. Currently supported algorithms are random search (hyperopt.rand.suggest), simulated annealing (hyperopt.anneal.suggest), and the TPE algorithm (hyperopt.tpe.suggest).
For the parameter space, to optimize the function q above you can call fmin(q, space=hp.uniform('a', 0, 1)). The first argument of the hp.uniform function is the label; every parameter must have a unique label in the search space. hp.uniform specifies the distribution of the parameter. Other parameter distributions include:
hp.choice(label, options) returns one of the options, which can be a list or tuple. The options can themselves be nested expressions, which is how conditional parameters are formed.
hp.pchoice(label, p_options) returns one of p_options with a given probability, which lets the search weight the options unevenly.
hp.uniform(label, low, high) draws the parameter uniformly between low and high.
hp.quniform(label, low, high, q) gives the value round(uniform(low, high)/q)*q, suitable for discrete values.
hp.loguniform(label, low, high) draws exp(uniform(low, high)), so the value ranges over [exp(low), exp(high)].
hp.randint(label, upper) returns a random integer in the half-open interval [0, upper).
Search spaces can contain lists and dictionaries.
```python
from hyperopt import hp

list_space = [hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1)]
tuple_space = (hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1))
dict_space = {'a': hp.uniform('a', 0, 1), 'b': hp.loguniform('b', 0, 1)}
```
III. Simple example
```python
from hyperopt import hp, fmin, rand, tpe, space_eval

def q(args):
    x, y = args
    return x**2 - 2*x + 1 + y**2

space = [hp.randint('x', 5), hp.randint('y', 5)]
best = fmin(q, space, algo=rand.suggest, max_evals=10)
print(best)
```
Output:
{'x': 2, 'y': 0}
IV. Xgboost example
Xgboost has many parameters. Wrap the xgboost training code in a function, then pass that function to fmin for parameter optimization, using the cross-validated AUC as the optimization objective. The larger the AUC the better, but since fmin looks for a minimum, we minimize -AUC instead. The dataset used has 202 columns: the first column is the sample ID, the last column is the label, and the middle 200 columns are features.
```python
# coding: utf-8
import numpy as np
import pandas as pd
import xgboost as xgb
from random import shuffle
from functools import partial
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation in old versions
from hyperopt import fmin, tpe, hp

def loadFile(filename="E://zalei//browsetop200pca.csv"):
    data = pd.read_csv(filename, header=None)
    return data.values

data = loadFile()
label = data[:, -1]
attrs = data[:, :-1]
labels = label.reshape((1, -1))
label = labels.tolist()[0]
minmaxscaler = MinMaxScaler()
attrs = minmaxscaler.fit_transform(attrs)

# 70/30 train/test split on shuffled indices
index = list(range(0, len(label)))
shuffle(index)
train_index = index[:int(len(label) * 0.7)]
test_index = index[int(len(label) * 0.7):]
attr_train = attrs[train_index, :]
attr_test = attrs[test_index, :]
label_train = labels[:, train_index].tolist()[0]
label_test = labels[:, test_index].tolist()[0]
print(attr_train.shape, attr_test.shape, len(label_train), len(label_test))

def GBM(argsDict):
    # Map the integer samples from hp.randint onto the real parameter ranges.
    max_depth = argsDict["max_depth"] + 5                    # 5..19
    n_estimators = argsDict["n_estimators"] * 5 + 50         # 50..95
    learning_rate = argsDict["learning_rate"] * 0.02 + 0.05  # 0.05..0.15
    subsample = argsDict["subsample"] * 0.1 + 0.7            # 0.7..1.0
    min_child_weight = argsDict["min_child_weight"] + 1      # 1..5
    print("max_depth:" + str(max_depth))
    print("n_estimators:" + str(n_estimators))
    print("learning_rate:" + str(learning_rate))
    print("subsample:" + str(subsample))
    print("min_child_weight:" + str(min_child_weight))
    global attr_train, label_train
    gbm = xgb.XGBClassifier(nthread=4,                        # number of threads
                            max_depth=max_depth,              # maximum tree depth
                            n_estimators=n_estimators,        # number of trees
                            learning_rate=learning_rate,      # learning rate
                            subsample=subsample,              # row sampling fraction
                            min_child_weight=min_child_weight,
                            max_delta_step=10,                # cap on each leaf's weight update
                            objective="binary:logistic")
    metric = cross_val_score(gbm, attr_train, label_train,
                             cv=5, scoring="roc_auc").mean()
    print(metric)
    return -metric  # fmin minimizes, so return -AUC

space = {"max_depth": hp.randint("max_depth", 15),
         "n_estimators": hp.randint("n_estimators", 10),         # maps to 50,55,...,95
         "learning_rate": hp.randint("learning_rate", 6),        # maps to 0.05,0.07,...,0.15
         "subsample": hp.randint("subsample", 4),                # maps to 0.7,0.8,0.9,1.0
         "min_child_weight": hp.randint("min_child_weight", 5),  # maps to 1..5
         }
algo = partial(tpe.suggest, n_startup_jobs=1)
best = fmin(GBM, space, algo=algo, max_evals=4)
print(best)
print(GBM(best))
```
Detailed reference: http://blog.csdn.net/qq_34139222/article/details/60322995