Practice of Deep Learning Algorithms: Convolutional Neural Network (CNN) Implementation


Having covered the fundamentals of convolutional neural networks (CNNs), in this post we discuss implementation techniques based on Theano. Using MNIST handwritten digit recognition as an example, we build a convolutional neural network and train it until the recognition error falls below 1%.

We first need to load the MNIST handwritten digit training set, for which we define a utility class:

from __future__ import print_function

__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T


class MnistLoader(object):
    def load_data(self, dataset):
        data_dir, data_file = os.path.split(dataset)
        if data_dir == '' and not os.path.isfile(dataset):
            # Look for the file in ../data relative to this script.
            new_path = os.path.join(
                os.path.split(__file__)[0],
                "..",
                "data",
                dataset
            )
            if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
                dataset = new_path

        if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
            from six.moves import urllib
            origin = (
                'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
            )
            print('Downloading data from %s' % origin)
            urllib.request.urlretrieve(origin, dataset)

        print('... loading data')

        # Load the dataset
        with gzip.open(dataset, 'rb') as f:
            try:
                train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
            except:
                train_set, valid_set, test_set = pickle.load(f)

        def shared_dataset(data_xy, borrow=True):
            # Store the data in shared variables so Theano can copy it to
            # GPU memory in one transfer instead of one minibatch at a time.
            data_x, data_y = data_xy
            shared_x = theano.shared(numpy.asarray(data_x,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            shared_y = theano.shared(numpy.asarray(data_y,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            # The labels are used as indices, so cast them to int32.
            return shared_x, T.cast(shared_y, 'int32')

        test_set_x, test_set_y = shared_dataset(test_set)
        valid_set_x, valid_set_y = shared_dataset(valid_set)
        train_set_x, train_set_y = shared_dataset(train_set)

        rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
                (test_set_x, test_set_y)]
        return rval
This class has appeared in earlier posts, so it is not explained in detail here. It is defined separately because, if we switch to a different problem, we only need to modify this class to load the new training data, which keeps the rest of the program unchanged.
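As a quick sanity check, the loader can be exercised on its own. A minimal sketch, assuming the class above is saved as mnist_loader.py (the module name used by the engine code later in this post):

from mnist_loader import MnistLoader

loader = MnistLoader()
datasets = loader.load_data('mnist.pkl.gz')
train_set_x, train_set_y = datasets[0]  # Theano shared variables
valid_set_x, valid_set_y = datasets[1]
test_set_x, test_set_y = datasets[2]

# Each input matrix stores one flattened 28*28 image per row.
print(train_set_x.get_value(borrow=True).shape)  # (50000, 784)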

The approach is to connect the image to the convolutional layers, then to the hidden layer of a BP (backpropagation) network, and finally to a logistic regression output layer. We therefore first need to define the hidden layer and the logistic regression output layer of the multilayer feedforward network. The hidden layer is defined as follows:

from __future__ import print_function

__docformat__ = 'restructedtext en'

import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T


class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        self.input = input
        if W is None:
            # Initialize weights uniformly in +/- sqrt(6 / (n_in + n_out)),
            # the usual heuristic for tanh units.
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4
            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )
        # Parameters of the model
        self.params = [self.W, self.b]
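To see how the class is meant to be wired up, here is a small hypothetical instantiation (the variable names are illustrative, not from the post): a layer mapping 784 pixel inputs to 500 tanh units.

import numpy
import theano.tensor as T
from hidden_layer import HiddenLayer  # assumed module name, as imported later

rng = numpy.random.RandomState(1234)
x = T.matrix('x')
hidden = HiddenLayer(rng, input=x, n_in=28 * 28, n_out=500,
                     activation=T.tanh)
# hidden.output is the symbolic expression tanh(x.W + b);
# hidden.params is [W, b], ready to be collected for gradient descent.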
Next we define the logistic regression algorithm class:

from __future__ import print_function

__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T


class LogisticRegression(object):
    def __init__(self, input, n_in, n_out):
        self.W = theano.shared(
            value=numpy.zeros((n_in, n_out), dtype=theano.config.floatX),
            name='W',
            borrow=True
        )
        self.b = theano.shared(
            value=numpy.zeros((n_out,), dtype=theano.config.floatX),
            name='b',
            borrow=True
        )
        # Class-membership probabilities and the most probable class.
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        self.params = [self.W, self.b]
        self.input = input

    def negative_log_likelihood(self, y):
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        if y.dtype.startswith('int'):
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()

This code was discussed in detail in the logistic regression blog post and is not repeated here; interested readers can refer to that post (the logistic regression algorithm implementation).
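As a brief reminder of how the class is used, the following hypothetical snippet wires it to a cost expression (names are illustrative):

import theano.tensor as T
from logistic_regression import LogisticRegression

x = T.matrix('x')
y = T.ivector('y')
classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

cost = classifier.negative_log_likelihood(y)  # scalar cost to minimize
error_rate = classifier.errors(y)             # mean misclassification rate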

Having done the preparatory work, we can begin the convolutional neural network (CNN) implementation.

Let's first define the convolution-pooling layer, based on a simplified version of LeNet-5, as shown in the code below:

from __future__ import print_function

import os
import sys
import timeit

import numpy
import theano
import theano.tensor as T
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d


class LeNetConvPoolLayer(object):
    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        assert image_shape[1] == filter_shape[1]
        self.input = input

        # Each hidden unit receives
        # fan_in = (number of input feature maps) * (filter height * width) inputs.
        fan_in = numpy.prod(filter_shape[1:])
        # Each unit in the lower layer receives gradients from
        # (number of output feature maps) * (filter size) / (pooling size) units.
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) //
                   numpy.prod(poolsize))
        # Initialize the filters with random values in +/- W_bound.
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # One bias per output feature map.
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # Convolve the input feature maps with the filters.
        conv_out = conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape
        )

        # Downsample each feature map with max pooling.
        pooled_out = pool.pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # Add the bias (broadcast over the batch and spatial dimensions)
        # and apply the tanh nonlinearity.
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        self.params = [self.W, self.b]
        self.input = input
The above code convolves the input signal with the filter bank and then max-pools the result.

Now let's see how to initialize the LeNet layer and how to convert its output signal into the input signal of the MLP hidden layer. The code is as follows:

layer0 = LeNetConvPoolLayer(
    rng,
    input=layer0_input,
    image_shape=(batch_size, 1, 28, 28),
    filter_shape=(nkerns[0], 1, 5, 5),
    poolsize=(2, 2)
)
As shown above, our input signal is a 28*28 black-and-white image, and since we use minibatch learning, the input is defined as (batch_size, 1, 28, 28). We apply a 5*5 convolution to the image; by the definition of the convolution operation, the resulting output is a (28-5+1, 28-5+1) = (24, 24) "image". We then apply 2*2 max pooling, i.e. we take the maximum pixel value of each 2*2 region as a new pixel, so the final output of the layer is a 12*12 signal.
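The shape arithmetic is worth checking mechanically. A small plain-Python helper (illustrative only) reproduces the numbers above:

def conv_pool_output(size, filter_size=5, pool=2):
    conv_size = size - filter_size + 1  # 'valid' convolution: 28 -> 24
    return conv_size // pool            # 2*2 max pooling: 24 -> 12

print(conv_pool_output(28))  # prints 12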

Next, we feed this output signal into a second LeNet convolution-pooling layer. The code is as follows:

layer1 = LeNetConvPoolLayer(
    rng,
    input=layer0.output,
    image_shape=(batch_size, nkerns[0], 12, 12),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)
As shown above, the input signal is now a 12*12 image. Applying a 5*5 convolution kernel again yields a (12-5+1, 12-5+1) = (8, 8) image, and the 2*2 max pooling operation then produces a 4*4 image. It can be fed into the MLP hidden layer by calling layer1.output.flatten(2) to turn each example into a one-dimensional signal.
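Since the flatten step is easy to get wrong, note what flatten(2) does: it keeps the first (batch) dimension and collapses all remaining dimensions into one, so each example becomes a single row vector. A sketch of the shapes, matching the engine code below:

layer2_input = layer1.output.flatten(2)
# (batch_size, nkerns[1], 4, 4) -> (batch_size, nkerns[1] * 4 * 4),
# which is why the hidden layer is built with n_in = nkerns[1] * 4 * 4.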

Below we define the LeNet engine, which loads the data, defines the network model, and trains the network. The code looks like this:

from __future__ import print_function

import os
import sys
import timeit

import six.moves.cPickle as pickle
import numpy
import theano
import theano.tensor as T
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d

from mnist_loader import MnistLoader
from logistic_regression import LogisticRegression
from hidden_layer import HiddenLayer
from lenet_conv_pool_layer import LeNetConvPoolLayer


class LeNetMnistEngine(object):
    def __init__(self):
        print("create LeNetMnistEngine")

    def train_model(self):
        learning_rate = 0.1
        n_epochs = 200
        dataset = 'mnist.pkl.gz'
        nkerns = [20, 50]
        batch_size = 500
        (n_train_batches, n_test_batches, n_valid_batches,
         train_model, test_model, validate_model) = self.build_model(
            learning_rate, n_epochs, dataset, nkerns, batch_size)
        self.train(n_epochs, n_train_batches, n_test_batches,
                   n_valid_batches, train_model, test_model,
                   validate_model)

    def run(self):
        print("run the model")
        # The training loop pickles the compiled prediction function
        # (parameter values included) to best_model.pkl; reload it here.
        with open('best_model.pkl', 'rb') as f:
            predict_model = pickle.load(f)
        dataset = 'mnist.pkl.gz'
        loader = MnistLoader()
        datasets = loader.load_data(dataset)
        test_set_x, test_set_y = datasets[2]
        test_set_x = test_set_x.get_value()
        # The network was compiled for a fixed batch size of 500, so feed
        # one full batch and look at the first ten predictions.
        predicted_values = predict_model(test_set_x[:500])[:10]
        print("Predicted values for the first 10 examples in test set:")
        print(predicted_values)

    def build_model(self, learning_rate=0.1, n_epochs=200,
                    dataset='mnist.pkl.gz', nkerns=[20, 50],
                    batch_size=500):
        rng = numpy.random.RandomState(23455)
        loader = MnistLoader()
        datasets = loader.load_data(dataset)
        train_set_x, train_set_y = datasets[0]
        valid_set_x, valid_set_y = datasets[1]
        test_set_x, test_set_y = datasets[2]
        n_train_batches = train_set_x.get_value(borrow=True).shape[0]
        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
        n_test_batches = test_set_x.get_value(borrow=True).shape[0]
        n_train_batches //= batch_size
        n_valid_batches //= batch_size
        n_test_batches //= batch_size

        index = T.lscalar()
        x = T.matrix('x')
        y = T.ivector('y')

        print('... building the model')

        layer0_input = x.reshape((batch_size, 1, 28, 28))
        layer0 = LeNetConvPoolLayer(
            rng,
            input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5),
            poolsize=(2, 2)
        )
        layer1 = LeNetConvPoolLayer(
            rng,
            input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5),
            poolsize=(2, 2)
        )
        layer2_input = layer1.output.flatten(2)
        layer2 = HiddenLayer(
            rng,
            input=layer2_input,
            n_in=nkerns[1] * 4 * 4,
            n_out=500,
            activation=T.tanh
        )
        layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

        cost = layer3.negative_log_likelihood(y)

        test_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )
        validate_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        params = layer3.params + layer2.params + layer1.params + layer0.params
        grads = T.grad(cost, params)
        updates = [
            (param_i, param_i - learning_rate * grad_i)
            for param_i, grad_i in zip(params, grads)
        ]
        train_model = theano.function(
            [index],
            cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size: (index + 1) * batch_size],
                y: train_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        # Compiled prediction function; train() pickles it whenever the
        # validation error improves, so run() can reload it later.
        self.predict_model = theano.function([x], layer3.y_pred)

        return (n_train_batches, n_test_batches, n_valid_batches,
                train_model, test_model, validate_model)

    def train(self, n_epochs, n_train_batches, n_test_batches,
              n_valid_batches, train_model, test_model, validate_model):
        print('... training')
        patience = 10000
        patience_increase = 2
        improvement_threshold = 0.995
        validation_frequency = min(n_train_batches, patience // 2)
        best_validation_loss = numpy.inf
        best_iter = 0
        test_score = 0.
        start_time = timeit.default_timer()
        epoch = 0
        done_looping = False
        while (epoch < n_epochs) and (not done_looping):
            epoch = epoch + 1
            for minibatch_index in range(n_train_batches):
                iter = (epoch - 1) * n_train_batches + minibatch_index
                if iter % 100 == 0:
                    print('training @ iter = ', iter)
                cost_ij = train_model(minibatch_index)
                if (iter + 1) % validation_frequency == 0:
                    validation_losses = [validate_model(i)
                                         for i in range(n_valid_batches)]
                    this_validation_loss = numpy.mean(validation_losses)
                    print('epoch %i, minibatch %i/%i, validation error %f %%' %
                          (epoch, minibatch_index + 1, n_train_batches,
                           this_validation_loss * 100.))
                    if this_validation_loss < best_validation_loss:
                        if this_validation_loss < best_validation_loss * \
                                improvement_threshold:
                            patience = max(patience, iter * patience_increase)
                        best_validation_loss = this_validation_loss
                        best_iter = iter
                        test_losses = [test_model(i)
                                       for i in range(n_test_batches)]
                        test_score = numpy.mean(test_losses)
                        # Save the compiled prediction function; its shared
                        # parameters carry the current best values.
                        with open('best_model.pkl', 'wb') as f:
                            pickle.dump(self.predict_model, f)
                        print(('epoch %i, minibatch %i/%i, test error of '
                               'best model %f %%') %
                              (epoch, minibatch_index + 1, n_train_batches,
                               test_score * 100.))
                if patience <= iter:
                    done_looping = True
                    break
        end_time = timeit.default_timer()
        print('Optimization complete.')
        print('Best validation score of %f %% obtained at iteration %i, '
              'with test performance %f %%' %
              (best_validation_loss * 100., best_iter + 1, test_score * 100.))
        print(('The code for file ' + os.path.split(__file__)[1] +
               ' ran for %.2fm' % ((end_time - start_time) / 60.)),
              file=sys.stderr)
The training loop above is similar to the MLP training code from the previous post and is not discussed again here. On my Mac notebook, after roughly six hours of training, the error rate drops below 1%.
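For completeness, a hypothetical driver script, assuming the classes above are saved in the module files named in the engine's imports:

from lenet_mnist_engine import LeNetMnistEngine  # assumed module name

if __name__ == '__main__':
    engine = LeNetMnistEngine()
    engine.train_model()  # builds the network, trains, saves best_model.pkl
    engine.run()          # reloads best_model.pkl and predicts on test data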
