Python Hyperparameter Tuning Tool: Hyperopt


I. Installation

pip install hyperopt

II. Description

Hyperopt provides an optimization interface that accepts an objective function and a parameter space, and evaluates the loss of points within that space. The user also specifies the distribution of each parameter in the space.
Hyperopt has four key components: the function to minimize, the space to search over, a Trials database of sampled results (optional), and the search algorithm (optional).
First, define an objective function that accepts a variable and returns its loss, for example minimizing q(x, y) = x**2 + y**2.

Next, specify the search algorithm, which is the value of the algo parameter of hyperopt's fmin function. The currently supported algorithms are random search (hyperopt.rand.suggest), simulated annealing (hyperopt.anneal.suggest), and the TPE algorithm (hyperopt.tpe.suggest).
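A minimal sketch of switching between these algorithms on the q(x, y) objective above (the search bounds below are chosen arbitrarily for illustration):

from hyperopt import fmin, hp, rand, tpe, anneal

def q(args):
    x, y = args
    return x**2 + y**2    # objective to minimize

space = [hp.uniform('x', -2, 2), hp.uniform('y', -2, 2)]

# same objective and space, three different search algorithms
best_rand = fmin(q, space, algo=rand.suggest, max_evals=50)
best_tpe = fmin(q, space, algo=tpe.suggest, max_evals=50)
best_anneal = fmin(q, space, algo=anneal.suggest, max_evals=50)
print(best_tpe)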

Then set up the parameter space. For example, to optimize the function q, call fmin(q, space=hp.uniform('a', 0, 1)). The first argument of hp.uniform is the label; every parameter must have a unique label within the search space. hp.uniform specifies the distribution of the parameter. Other parameter distributions include the following:


hp.choice(label, options) returns one of the options, which should be a list or tuple; options can contain nested expressions, which is how conditional parameters are built (see the sketch after this list).
hp.pchoice(label, p_options) returns one of the options with a specified probability, so the options are not sampled uniformly during the search.
hp.uniform(label, low, high) draws the parameter uniformly between low and high.
hp.quniform(label, low, high, q) gives values of the form round(uniform(low, high) / q) * q, suitable for discrete parameters.
hp.loguniform(label, low, high) draws exp(uniform(low, high)), so the value ranges over [exp(low), exp(high)].
hp.randint(label, upper) returns a random integer in the half-open interval [0, upper).
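For instance, here is a small sketch of a conditional space built with nested hp.choice expressions; hyperopt.pyll.stochastic.sample is used only to inspect what the space produces, and the classifier-style space below is purely illustrative:

from hyperopt import hp
from hyperopt.pyll.stochastic import sample

# which parameters exist depends on which branch hp.choice selects
space = hp.choice('classifier_type', [
    {'type': 'svm', 'C': hp.loguniform('C', -3, 3)},
    {'type': 'knn', 'n_neighbors': hp.quniform('n_neighbors', 1, 20, 1)},
])

for _ in range(3):
    print(sample(space))    # e.g. {'type': 'knn', 'n_neighbors': 7.0}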

Search spaces can also contain lists and dictionaries:

from hyperopt import hp

list_space = [hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1)]
tuple_space = (hp.uniform('a', 0, 1), hp.loguniform('b', 0, 1))
dict_space = {'a': hp.uniform('a', 0, 1), 'b': hp.loguniform('b', 0, 1)}

III. A simple example

from hyperopt import hp, fmin, rand, tpe, space_eval

def q(args):
    x, y = args
    return x**2 - 2*x + 1 + y**2

space = [hp.randint('x', 5), hp.randint('y', 5)]
best = fmin(q, space, algo=rand.suggest, max_evals=10)
print(best)

Output:

{'x': 2, 'y': 0}
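Note that fmin returns the raw value drawn for each label (for hp.choice it is the index of the chosen option). A short sketch, reusing the q and space above, of mapping the result back into the space with space_eval and recording every evaluation with a Trials object:

from hyperopt import hp, fmin, tpe, space_eval, Trials

def q(args):
    x, y = args
    return x**2 - 2*x + 1 + y**2

space = [hp.randint('x', 5), hp.randint('y', 5)]
trials = Trials()    # keeps a record of every trial
best = fmin(q, space, algo=tpe.suggest, max_evals=10, trials=trials)

print(best)                      # e.g. {'x': 1, 'y': 0}
print(space_eval(space, best))   # the corresponding point in the space
print(trials.losses())           # the loss of each of the 10 trials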

IV. An XGBoost example

XGBoost has many parameters. Wrap the XGBoost training code in a function and pass that function to fmin for parameter optimization, using cross-validated AUC as the objective. A larger AUC is better, but fmin minimizes, so the function returns -AUC and fmin finds its minimum. The dataset used here has 202 columns: the first column is the sample ID, the last column is the label, and the middle 200 columns are features.

# coding: utf-8
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import xgboost as xgb
from random import shuffle
from xgboost.sklearn import XGBClassifier
from sklearn.cross_validation import cross_val_score
import pickle
import time
from hyperopt import fmin, tpe, hp, space_eval, rand, Trials, partial, STATUS_OK

def loadFile(fileName="E://zalei//browsetop200pca.csv"):
    data = pd.read_csv(fileName, header=None)
    data = data.values
    return data

data = loadFile()
label = data[:, -1]
attrs = data[:, :-1]
labels = label.reshape((1, -1))
label = labels.tolist()[0]
minmaxscaler = MinMaxScaler()
attrs = minmaxscaler.fit_transform(attrs)

index = range(0, len(label))
shuffle(index)
trainIndex = index[:int(len(label) * 0.7)]
print len(trainIndex)
testIndex = index[int(len(label) * 0.7):]
print len(testIndex)
attr_train = attrs[trainIndex, :]
print attr_train.shape
attr_test = attrs[testIndex, :]
print attr_test.shape
label_train = labels[:, trainIndex].tolist()[0]
print len(label_train)
label_test = labels[:, testIndex].tolist()[0]
print len(label_test)
print np.mat(label_train).reshape((-1, 1)).shape

def GBM(argsDict):
    max_depth = argsDict["max_depth"] + 5
    n_estimators = argsDict["n_estimators"] * 5 + 50
    learning_rate = argsDict["learning_rate"] * 0.02 + 0.05
    subsample = argsDict["subsample"] * 0.1 + 0.7
    min_child_weight = argsDict["min_child_weight"] + 1
    print "max_depth:" + str(max_depth)
    print "n_estimators:" + str(n_estimators)
    print "learning_rate:" + str(learning_rate)
    print "subsample:" + str(subsample)
    print "min_child_weight:" + str(min_child_weight)
    global attr_train, label_train
    gbm = xgb.XGBClassifier(nthread=4,                          # number of threads
                            max_depth=max_depth,                # maximum tree depth
                            n_estimators=n_estimators,          # number of trees
                            learning_rate=learning_rate,        # learning rate
                            subsample=subsample,                # subsample ratio
                            min_child_weight=min_child_weight,  # minimum child weight
                            max_delta_step=10,
                            objective="binary:logistic")
    metric = cross_val_score(gbm, attr_train, label_train, cv=5, scoring="roc_auc").mean()
    print metric
    return -metric

space = {"max_depth": hp.randint("max_depth", 15),
         "n_estimators": hp.randint("n_estimators", 10),         # [0..9]  -> [50, 55, ..., 95]
         "learning_rate": hp.randint("learning_rate", 6),        # [0..5]  -> [0.05, 0.07, ..., 0.15]
         "subsample": hp.randint("subsample", 4),                # [0..3]  -> [0.7, 0.8, 0.9, 1.0]
         "min_child_weight": hp.randint("min_child_weight", 5),  # [0..4]  -> [1, 2, 3, 4, 5]
         }
algo = partial(tpe.suggest, n_startup_jobs=1)
best = fmin(GBM, space, algo=algo, max_evals=4)
print best
print GBM(best)
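Because the space uses hp.randint, the returned best holds the raw integer draws rather than the final hyperparameter values. A small helper (hypothetical; it simply repeats the mappings used inside GBM above) makes the chosen configuration explicit:

def decode(argsDict):
    # apply the same transformations as GBM to recover the actual values
    return {"max_depth": argsDict["max_depth"] + 5,
            "n_estimators": argsDict["n_estimators"] * 5 + 50,
            "learning_rate": argsDict["learning_rate"] * 0.02 + 0.05,
            "subsample": argsDict["subsample"] * 0.1 + 0.7,
            "min_child_weight": argsDict["min_child_weight"] + 1}

print(decode(best))    # e.g. {'max_depth': 9, 'n_estimators': 70, ...}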

Detailed reference: http://blog.csdn.net/qq_34139222/article/details/60322995
 
