This article introduces how to use the scikit-learn grid search capability with Keras models, and provides a set of code examples that you can copy into your own project as a starting point.
Here is a list of the topics covered below:

How to use Keras models in scikit-learn.
How to use grid search in scikit-learn.
How to tune batch size and training epochs.
How to tune the optimization algorithm.
How to tune the learning rate and momentum.
How to tune network weight initialization.
How to select a neuron activation function.
How to tune dropout regularization.
How to determine the number of neurons in a hidden layer.

How to use Keras models in scikit-learn
Keras models can be used in scikit-learn by wrapping them with the KerasClassifier or KerasRegressor class.

To use these wrappers, you must define a function that creates and returns your Keras sequential model, then pass this function to the build_fn argument when constructing the KerasClassifier class.
For example:
def create_model():
    ...
    return model

model = KerasClassifier(build_fn=create_model)
The constructor of the KerasClassifier class can also take default arguments that are passed on to the model.fit() call, such as the number of epochs and the batch size.
For example:
def create_model():
    ...
    return model

model = KerasClassifier(build_fn=create_model, nb_epoch=10)
The KerasClassifier constructor can also accept new arguments that are passed on to your custom create_model() function. These new arguments must also be given default values in the signature of the create_model() function.
For example:
def create_model(dropout_rate=0.0):
    ...
    return model

model = KerasClassifier(build_fn=create_model, dropout_rate=0.2)
You can learn more about the scikit-learn wrappers in the Keras API documentation.

How to use grid search in scikit-learn
Grid search is a model hyperparameter optimization technique.

In scikit-learn this technique is provided by the GridSearchCV class.

When constructing this class, you must provide a dictionary of hyperparameters to evaluate in the param_grid argument. This is a map of the model parameter names to arrays of values to try.

By default, accuracy is the score that is optimized, but other scores can be specified via the scoring argument of the GridSearchCV constructor.

By default, the grid search uses only one thread. By setting the n_jobs argument of the GridSearchCV constructor to -1, the process will use all cores on your machine. Depending on your Keras backend, this may interfere with the main neural network training process.

GridSearchCV works by constructing and evaluating one model for each combination of parameters. Each individual model is evaluated using cross-validation; 3-fold cross-validation is used by default, although this can be overridden via the cv argument of the GridSearchCV constructor.
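As a minimal sketch of these constructor arguments (assuming model and param_grid have already been defined, as in the example that follows), all three can be set explicitly:

from sklearn.grid_search import GridSearchCV

grid = GridSearchCV(estimator=model,        # the wrapped Keras model
                    param_grid=param_grid,  # map of parameter names to values to try
                    scoring='accuracy',     # the score to optimize (the default)
                    n_jobs=-1,              # use all CPU cores
                    cv=3)                   # 3-fold cross-validation (the default)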
Here is an example of defining a simple grid search:
param_grid = dict(nb_epoch=[10, 20, 30])
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X, Y)
Once completed, you can access the outcome of the grid search in the result object returned by grid.fit(). The best_score_ member provides the best score observed during the optimization procedure, and best_params_ describes the combination of parameters that achieved the best results.
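For example, reusing grid_result from the snippet above, the best result and every evaluated combination can be printed as follows (the same reporting code appears in the complete example later in this article):

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
for params, mean_score, scores in grid_result.grid_scores_:
    print("%f (%f) with: %r" % (scores.mean(), scores.std(), params))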
You can learn more about the GridSearchCV class in the scikit-learn API documentation.

Problem Description
Now that we know how to use Keras models with scikit-learn and how to use grid search in scikit-learn, let's look at a series of examples.
All examples will be demonstrated on a small standard machine learning dataset: the Pima Indians onset of diabetes classification dataset. This small dataset contains only numeric attributes, which are easy to work with.
Download the dataset and place it in your current working directory with the filename pima-indians-diabetes.csv.
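A minimal sketch of loading this file with NumPy, as done in the complete example below (the first eight columns are the input variables and the ninth column is the class label):

import numpy

# load the dataset from the current working directory
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]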
As we work through the examples in this article, we will aggregate the best parameters from each. This is not the best way to grid search, because parameters can interact, but it is good for demonstration purposes.

Note on parallelizing grid search
All of the examples are configured for parallelism (n_jobs=-1).
If you see an error such as the following:
INFO (theano.gof.compilelock): Waiting for existing lock by process '55614' (I am process '55613')
INFO (theano.gof.compilelock): To manually release the lock, delete ...
Kill the process and modify the code so that the grid search is not performed in parallel; that is, set n_jobs=1.

How to tune batch size and training epochs
In this first simple example, we focus on tuning the batch size and the number of training epochs.
The batch size in iterative gradient descent is the number of patterns shown to the network before the weights are updated. It is also an optimization in network training, defining how many patterns are read at a time and kept in memory.

The number of training epochs is the number of times the entire training dataset is shown to the network during training. Some networks are sensitive to the batch size, such as LSTM recurrent neural networks and convolutional neural networks.

Here we will evaluate a suite of different mini-batch sizes from 10 to 100 in steps of 20.
The complete code is as follows:
# Use scikit-learn to grid search the batch size and epochs
import numpy
from sklearn.grid_search import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier

# Function to create model, required for KerasClassifier
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# create model
model = KerasClassifier(build_fn=create_model, verbose=0)
# define the grid search parameters
batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
param_grid = dict(batch_size=batch_size, nb_epoch=epochs)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X, Y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
for params, mean_score, scores in grid_result.grid_scores_:
    print("%f (%f) with: %r" % (scores.mean(), scores.std(), params))
After running, the output is as follows:
Best: 0.686198 using {'nb_epoch': 100, 'batch_size': 20}
0.348958 (0.024774) with: {'nb_epoch': 10, 'batch_size': 10}
0.348958 (0.024774) with: {'nb_epoch': 50, 'batch_size': 10}
0.466146 (0.149269) with: {'nb_epoch': 100, 'batch_size': 10}
0.647135 (0.021236) with: {'nb_epoch': 10, 'batch_size': 20}
0.660156 (0.014616) with: {'nb_epoch': 50, 'batch_size': 20}
0.686198 (0.024774) with: {'nb_epoch': 100, 'batch_size': 20}
0.489583 (0.075566) with: {'nb_epoch': 10, 'batch_size': 40}
0.652344 (0.019918) with: {'nb_epoch': 50, 'batch_size': 40}
0.654948 (0.027866) with: {'nb_epoch': 100, 'batch_size': 40}
0.518229 (0.032264) with: {'nb_epoch': 10, 'batch_size': 60}
0.605469 (0.052213) with: {'nb_epoch': 50, 'batch_size': 60}
0.665365 (0.004872) with: {'nb_epoch': 100, 'batch_size': 60}
0.537760 (...