After covering the fundamentals of convolutional neural networks (CNNs) in the previous post, in this post we discuss how to implement the algorithm with Theano. We will use MNIST handwritten digit recognition as an example, building and training a convolutional neural network until the recognition error falls below 1%.
We first need to read the MNIST handwritten digit training samples, for which we define a loader class:
from __future__ import print_function

__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T


class MnistLoader(object):
    def load_data(self, dataset):
        data_dir, data_file = os.path.split(dataset)
        if data_dir == "" and not os.path.isfile(dataset):
            # Check if the dataset is in the data directory.
            new_path = os.path.join(
                os.path.split(__file__)[0],
                "..",
                "data",
                dataset
            )
            if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
                dataset = new_path

        if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
            from six.moves import urllib
            origin = (
                'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
            )
            print('Downloading data from %s' % origin)
            urllib.request.urlretrieve(origin, dataset)

        print('... loading data')

        # Load the dataset.
        with gzip.open(dataset, 'rb') as f:
            try:
                train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
            except:
                train_set, valid_set, test_set = pickle.load(f)

        def shared_dataset(data_xy, borrow=True):
            data_x, data_y = data_xy
            shared_x = theano.shared(numpy.asarray(data_x,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            shared_y = theano.shared(numpy.asarray(data_y,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            # The labels are used as indices, so cast them to int32.
            return shared_x, T.cast(shared_y, 'int32')

        test_set_x, test_set_y = shared_dataset(test_set)
        valid_set_x, valid_set_y = shared_dataset(valid_set)
        train_set_x, train_set_y = shared_dataset(train_set)

        rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
                (test_set_x, test_set_y)]
        return rval
This class has been used before, so it is not explained in detail here. It is defined separately because, if we switch to a different problem, we only need to modify this class to load the new training data, which simplifies program maintenance.
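For reference, here is a minimal usage sketch (not from the original post; it assumes the class above is saved as mnist_loader.py, and the shapes shown are the standard MNIST splits):

# Minimal usage sketch, assuming the class above lives in mnist_loader.py.
from mnist_loader import MnistLoader

loader = MnistLoader()
datasets = loader.load_data('mnist.pkl.gz')
train_set_x, train_set_y = datasets[0]   # 50000 training examples
valid_set_x, valid_set_y = datasets[1]   # 10000 validation examples
test_set_x, test_set_y = datasets[2]     # 10000 test examples

# Each x is a Theano shared variable holding an (n_examples, 784) matrix of
# flattened 28x28 images; each y is an int32 vector of digit labels.
print(train_set_x.get_value(borrow=True).shape)  # (50000, 784)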
Our approach feeds the image into a convolutional network, then into the hidden layer of a BP network, and finally into a logistic regression output layer. We therefore first need to define the hidden layer and the logistic regression output layer of a multilayer feed-forward network. The hidden layer is defined as follows:
from __future__ import print_function

__docformat__ = 'restructedtext en'

import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T

from logistic_regression import LogisticRegression


# start-snippet-1
class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        self.input = input
        if W is None:
            # Initialize weights uniformly from a range that scales with
            # the fan-in and fan-out of the layer.
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4
            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )
        # Parameters of the model.
        self.params = [self.W, self.b]
Next we define the logistic regression algorithm class:
from __future__ import print_function

__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T


class LogisticRegression(object):
    def __init__(self, input, n_in, n_out):
        # Initialize weights and biases to zero; softmax regression is
        # convex, so zero initialization is safe here.
        self.W = theano.shared(
            value=numpy.zeros((n_in, n_out), dtype=theano.config.floatX),
            name='W',
            borrow=True
        )
        self.b = theano.shared(
            value=numpy.zeros((n_out,), dtype=theano.config.floatX),
            name='b',
            borrow=True
        )
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        self.params = [self.W, self.b]
        self.input = input

    def negative_log_likelihood(self, y):
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        if y.dtype.startswith('int'):
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()
This code was discussed in detail in the logistic regression blog post and is not repeated here; interested readers can refer to that post (the logistic regression algorithm implementation).
Having done the preparatory work, we can begin the convolutional neural network (CNN) implementation itself.
Let's first define a convolution-pooling layer based on a simplified version of LeNet-5, as shown in the code below:
from __future__ import print_function

import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d


class LeNetConvPoolLayer(object):
    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        # The number of input feature maps must match between the image
        # and the filters.
        assert image_shape[1] == filter_shape[1]
        self.input = input

        # Each unit receives input from
        # num_input_feature_maps * filter_height * filter_width inputs.
        fan_in = numpy.prod(filter_shape[1:])
        # Each unit in the lower layer receives gradients from
        # num_output_feature_maps * filter_height * filter_width / pooling size.
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) //
                   numpy.prod(poolsize))
        # Initialize weights with random values drawn uniformly from
        # [-W_bound, W_bound].
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # One bias per output feature map.
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # Convolve the input feature maps with the filters.
        conv_out = conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape
        )

        # Downsample each feature map with max pooling.
        pooled_out = pool.pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # Add the bias (broadcast across the batch and spatial dimensions)
        # and apply the tanh nonlinearity.
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        self.params = [self.W, self.b]
        self.input = input
The above code convolves the input signal with the layer's filters and applies max pooling to the result.
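To make the pooling step concrete, here is a small stand-alone sketch (not from the original post) that applies the same pool.pool_2d call to a single 4x4 feature map and prints the 2x2 result:

import numpy
import theano
import theano.tensor as T
from theano.tensor.signal import pool

# Toy example: max-pool one 4x4 feature map with a 2x2 window.
x = T.tensor4('x')
pooled = pool.pool_2d(input=x, ds=(2, 2), ignore_border=True)
f = theano.function([x], pooled)

img = numpy.arange(16, dtype=theano.config.floatX).reshape(1, 1, 4, 4)
print(f(img))  # [[[[ 5. 7.] [13. 15.]]]], the max of each 2x2 block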
Here we see how to initialize the first LeNet layer; afterwards we will convert the LeNet layer's output signal into the input signal of the MLP hidden layer. The specific code is as follows:
layer0 = LeNetConvPoolLayer(
    rng,
    input=layer0_input,
    image_shape=(batch_size, 1, 28, 28),
    filter_shape=(nkerns[0], 1, 5, 5),
    poolsize=(2, 2)
)
As shown above, our input signal is a 28*28 grayscale image, and since we use mini-batch learning the input is defined as (batch_size, 1, 28, 28). We convolve the image with 5*5 kernels; by the definition of the convolution operation, the resulting feature map is a (28-5+1, 28-5+1) = (24, 24) "image". We then apply a 2*2 max pooling operation, that is, we take the maximum pixel value of each 2*2 region as the new pixel value, so the layer finally outputs a 12*12 signal.
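This shape arithmetic can be checked with a small helper (my sketch, not part of the original code): a "valid" convolution shrinks each side to in_size - filter_size + 1, and non-overlapping pooling divides it by the pool size.

def conv_pool_output_size(in_size, filter_size, pool_size):
    """Output side length of a 'valid' convolution followed by max pooling."""
    conv_size = in_size - filter_size + 1   # valid convolution
    return conv_size // pool_size           # non-overlapping pooling

print(conv_pool_output_size(28, 5, 2))  # 12, as computed above
print(conv_pool_output_size(12, 5, 2))  # 4, used by the next layer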
Next, we feed this output signal into another LeNet convolution-pooling layer. The code is as follows:
layer1 = LeNetConvPoolLayer(
    rng,
    input=layer0.output,
    image_shape=(batch_size, nkerns[0], 12, 12),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)
As shown above, the input signal is now a 12*12 image, and we again use 5*5 convolution kernels, which yields a (12-5+1, 12-5+1) = (8, 8) image; the 2*2 max pooling operation then produces a 4*4 image. Calling layer1.output.flatten(2) turns it into a one-dimensional signal per example, which can then enter the hidden layer of the MLP.
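The flatten(2) call keeps the first (batch) dimension and collapses all the remaining ones. A small stand-alone sketch (not from the original post, using the post's batch_size = 500 and nkerns[1] = 50) illustrates the shape change:

import numpy
import theano
import theano.tensor as T

# flatten(2) keeps dim 0 (the batch) and collapses dims 1..3.
t = T.tensor4('t')
flat = t.flatten(2)

sample = numpy.zeros((500, 50, 4, 4), dtype=theano.config.floatX)
print(flat.eval({t: sample}).shape)  # (500, 800), i.e. nkerns[1] * 4 * 4 = 800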
Below we define the LeNet engine, which loads the data, defines the network model, and trains the network. The code looks like this:
from __future__ import print_function

import os
import sys
import timeit

import numpy
import six.moves.cPickle as pickle
import theano
import theano.tensor as T
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d

from mnist_loader import MnistLoader
from logistic_regression import LogisticRegression
from hidden_layer import HiddenLayer
from lenet_conv_pool_layer import LeNetConvPoolLayer


class LeNetMnistEngine(object):
    def __init__(self):
        print("create LeNetMnistEngine")

    def train_model(self):
        learning_rate = 0.1
        n_epochs = 200
        dataset = 'mnist.pkl.gz'
        nkerns = [20, 50]
        batch_size = 500
        (n_train_batches, n_test_batches, n_valid_batches,
         train_model, test_model, validate_model) = self.build_model(
            learning_rate, n_epochs, dataset, nkerns, batch_size)
        self.train(n_epochs, n_train_batches, n_test_batches,
                   n_valid_batches, train_model, test_model, validate_model)

    def run(self):
        print("run the model")
        # Load the input variable and layer stack saved during training.
        x, layers = pickle.load(open('best_model.pkl', 'rb'))
        predict_model = theano.function(
            inputs=[x],
            outputs=layers[-1].y_pred
        )
        dataset = 'mnist.pkl.gz'
        loader = MnistLoader()
        datasets = loader.load_data(dataset)
        test_set_x, test_set_y = datasets[2]
        test_set_x = test_set_x.get_value()
        # The graph was compiled for mini-batches of batch_size examples,
        # so predict one full batch and print the first 10 results.
        predicted_values = predict_model(test_set_x[:500])
        print("Predicted values for the first 10 examples in test set:")
        print(predicted_values[:10])

    def build_model(self, learning_rate=0.1, n_epochs=200,
                    dataset='mnist.pkl.gz', nkerns=[20, 50], batch_size=500):
        rng = numpy.random.RandomState(23455)
        loader = MnistLoader()
        datasets = loader.load_data(dataset)
        train_set_x, train_set_y = datasets[0]
        valid_set_x, valid_set_y = datasets[1]
        test_set_x, test_set_y = datasets[2]
        n_train_batches = train_set_x.get_value(borrow=True).shape[0]
        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
        n_test_batches = test_set_x.get_value(borrow=True).shape[0]
        n_train_batches //= batch_size
        n_valid_batches //= batch_size
        n_test_batches //= batch_size

        index = T.lscalar()
        x = T.matrix('x')
        y = T.ivector('y')

        print('... building the model')

        layer0_input = x.reshape((batch_size, 1, 28, 28))
        layer0 = LeNetConvPoolLayer(
            rng,
            input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5),
            poolsize=(2, 2)
        )
        layer1 = LeNetConvPoolLayer(
            rng,
            input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5),
            poolsize=(2, 2)
        )
        layer2_input = layer1.output.flatten(2)
        layer2 = HiddenLayer(
            rng,
            input=layer2_input,
            n_in=nkerns[1] * 4 * 4,
            n_out=500,
            activation=T.tanh
        )
        layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
        cost = layer3.negative_log_likelihood(y)

        test_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )
        validate_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        params = layer3.params + layer2.params + layer1.params + layer0.params
        grads = T.grad(cost, params)
        updates = [
            (param_i, param_i - learning_rate * grad_i)
            for param_i, grad_i in zip(params, grads)
        ]
        train_model = theano.function(
            [index],
            cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size: (index + 1) * batch_size],
                y: train_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        # Keep the input variable and the layer stack so that train() can
        # save the best model for later prediction in run().
        self.x = x
        self.layers = [layer0, layer1, layer2, layer3]

        return (n_train_batches, n_test_batches, n_valid_batches,
                train_model, test_model, validate_model)

    def train(self, n_epochs, n_train_batches, n_test_batches,
              n_valid_batches, train_model, test_model, validate_model):
        print('... training')
        patience = 10000
        patience_increase = 2
        improvement_threshold = 0.995
        validation_frequency = min(n_train_batches, patience // 2)
        best_validation_loss = numpy.inf
        best_iter = 0
        test_score = 0.
        start_time = timeit.default_timer()
        epoch = 0
        done_looping = False
        while (epoch < n_epochs) and (not done_looping):
            epoch = epoch + 1
            for minibatch_index in range(n_train_batches):
                iter = (epoch - 1) * n_train_batches + minibatch_index
                if iter % 100 == 0:
                    print('training @ iter = ', iter)
                cost_ij = train_model(minibatch_index)
                if (iter + 1) % validation_frequency == 0:
                    validation_losses = [validate_model(i)
                                         for i in range(n_valid_batches)]
                    this_validation_loss = numpy.mean(validation_losses)
                    print('epoch %i, minibatch %i/%i, validation error %f %%' %
                          (epoch, minibatch_index + 1, n_train_batches,
                           this_validation_loss * 100.))
                    if this_validation_loss < best_validation_loss:
                        # Increase patience if the improvement is large enough.
                        if this_validation_loss < best_validation_loss * \
                                improvement_threshold:
                            patience = max(patience, iter * patience_increase)
                        best_validation_loss = this_validation_loss
                        best_iter = iter
                        test_losses = [test_model(i)
                                       for i in range(n_test_batches)]
                        test_score = numpy.mean(test_losses)
                        # Save the best model (input variable plus layers).
                        with open('best_model.pkl', 'wb') as f:
                            pickle.dump((self.x, self.layers), f)
                        print(('epoch %i, minibatch %i/%i, test error of '
                               'best model %f %%') %
                              (epoch, minibatch_index + 1, n_train_batches,
                               test_score * 100.))
                if patience <= iter:
                    done_looping = True
                    break
        end_time = timeit.default_timer()
        print('Optimization complete.')
        print('Best validation score of %f %% obtained at iteration %i, '
              'with test performance %f %%' %
              (best_validation_loss * 100., best_iter + 1, test_score * 100.))
        print(('The code for file ' + os.path.split(__file__)[1] +
               ' ran for %.2fm') % ((end_time - start_time) / 60.),
              file=sys.stderr)
The code above is similar to the training code of the previous MLP post and is not discussed further here. On my Mac laptop it runs for about six hours and yields an error rate below 1%.
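Training and prediction can be driven with a few lines like the following. This is my sketch, not part of the original post: the file name lenet_mnist_engine.py and the __main__ guard are assumptions, while the other module names match the imports used above.

# Hypothetical driver script, assuming the engine class is saved as
# lenet_mnist_engine.py alongside the other modules defined in this post.
from lenet_mnist_engine import LeNetMnistEngine

if __name__ == '__main__':
    engine = LeNetMnistEngine()
    engine.train_model()   # trains the network and saves best_model.pkl
    engine.run()           # reloads the saved model and predicts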