Convolutional Neural Network (CNN): principle and implementation

Tags: theano

This post walks through some basic applications of convolutional neural networks (CNNs) in deep learning, expanding on parts of LeCun's tutorial (document 0.1), with the results displayed in Python.

The post is divided into the following parts:

1. Convolution

2. Pooling (down-sampling)

3. CNN Structure

4. Run the experiment

Each part is described below.


PS: This post serves as reference material for the ESE machine learning short course (lecture of 2014-05-16). It only sketches the simplest, most naive ideas and focuses on the practical side; the principles are covered in detail in the lecture itself.


1. Convolution

Much like Gaussian filtering, convolution is applied to every image in the image batch. For a single picture, each of its input feature maps is convolved with the corresponding slice of a filter and the results are summed to produce one output feature map. In the code below, the image batch contains two images; each image starts with 3 feature maps (the R, G, B channels), and convolving with two 9*9 filters gives each image two output feature maps.
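To make the channel-summing concrete, here is a small NumPy sketch (my own, not part of the original post) that reproduces the shape arithmetic of the example above: a batch of 2 RGB images convolved with two 9*9 filters, where each output map is the sum of the per-channel convolutions. The 64*64 image size is an arbitrary assumption.

import numpy as np
from scipy.signal import convolve2d  # assumes SciPy is available

rng = np.random.RandomState(0)
batch = rng.rand(2, 3, 64, 64).astype('float32')   # (n_images, channels, height, width)
filters = rng.rand(2, 3, 9, 9).astype('float32')   # (n_filters, channels, 9, 9)

out_h, out_w = 64 - 9 + 1, 64 - 9 + 1              # a 'valid' convolution shrinks the map
out = np.zeros((2, 2, out_h, out_w), dtype='float32')
for i in range(2):            # image in the batch
    for k in range(2):        # filter
        for c in range(3):    # channel (R, G, B)
            out[i, k] += convolve2d(batch[i, c], filters[k, c], mode='valid')

print(out.shape)  # (2, 2, 56, 56): every image now has two feature maps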

The convolution itself is done with Theano's conv.conv2d; here W and b are simply random parameters. The result looks a bit like an edge detector, doesn't it?

Code (see the inline comments):


# -*- coding: utf-8 -*-
"""
Created on Sat May 18:55:26 2014
@author: rachel
function: convolution of pictures with the same size (width, height)
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic variable
input = T.tensor4(name='input')

# initial weights
w_shape = (2, 3, 9, 9)  # 2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3 * 9 * 9)
W = theano.shared(numpy.asarray(rng.uniform(low=-1.0 / w_bound, high=1.0 / w_bound, size=w_shape),
                                dtype=input.dtype), name='W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low=-.5, high=.5, size=b_shape),
                                dtype=input.dtype), name='b')

conv_out = conv.conv2d(input, W)

# T.TensorVariable.dimshuffle() can reshape or broadcast (add dimensions)
# dimshuffle(self, *pattern)
# >>> b1 = b.dimshuffle('x', 0, 'x', 'x')
# >>> b1.shape.eval()
# array([1, 2, 1, 1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
f = theano.function([input], output)

# demo
import pylab
from PIL import Image
# minibatch_img = T.tensor4(name='minibatch_img')

# ------------- img1 ---------------
img1 = Image.open(open('//home//rachel//documents//zju_projects//dl//dataset//rachel.jpg'))
width1, height1 = img1.size
img1 = numpy.asarray(img1, dtype='float32') / 256.  # (height, width, 3)

# put the image into a 4D tensor of shape (1, 3, height, width)
img1_rgb = img1.swapaxes(0, 2).swapaxes(1, 2).reshape(1, 3, height1, width1)

# ------------- img2 ---------------
img2 = Image.open(open('//home//rachel//documents//zju_projects//dl//dataset//rachel1.jpg'))
width2, height2 = img2.size
img2 = numpy.asarray(img2, dtype='float32') / 256.
img2_rgb = img2.swapaxes(0, 2).swapaxes(1, 2).reshape(1, 3, height2, width2)

# minibatch_img = T.join(0, img1_rgb, img2_rgb)
minibatch_img = numpy.concatenate((img1_rgb, img2_rgb), axis=0)
filtered_img = f(minibatch_img)

# plot the original images and both convolved results
pylab.subplot(2, 3, 1); pylab.axis('off'); pylab.imshow(img1)
pylab.subplot(2, 3, 4); pylab.axis('off'); pylab.imshow(img2)
pylab.gray()
pylab.subplot(2, 3, 2); pylab.axis('off')
pylab.imshow(filtered_img[0, 0, :, :])  # 0: minibatch index; 0: 1st filter
pylab.subplot(2, 3, 3); pylab.axis('off')
pylab.imshow(filtered_img[0, 1, :, :])  # 0: minibatch index; 1: 2nd filter
pylab.subplot(2, 3, 5); pylab.axis('off')
pylab.imshow(filtered_img[1, 0, :, :])  # 1: minibatch index; 0: 1st filter
pylab.subplot(2, 3, 6); pylab.axis('off')
pylab.imshow(filtered_img[1, 1, :, :])  # 1: minibatch index; 1: 2nd filter
pylab.show()






2. Pooling (down-sampling)


The most commonly used pooling operation is max-pooling. It addresses two issues:

1. It reduces the amount of computation.

2. It provides a degree of rotational invariance (think about the reason yourself).

PS: For rotational invariance, recall how SIFT and LBP handle it by choosing a dominant orientation, and how HOG uses templates in different directions.

The max-pooling (down-sampling) step halves the height and width of the feature map. (This is not visible in the result figure below, because the plotting library automatically stretches the images to the same display size, but the number of pixels really is halved; see the small sketch after this paragraph.)
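As a tiny illustration of the halving (my own sketch, not from the original post), here is 2*2 max-pooling applied with plain NumPy to a single 4*4 feature map: every non-overlapping 2*2 block is replaced by its maximum.

import numpy as np

fmap = np.array([[1, 2, 0, 1],
                 [3, 4, 1, 0],
                 [5, 0, 2, 2],
                 [1, 1, 3, 7]], dtype='float32')

# split into 2*2 blocks and take the maximum of each block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4. 1.]
#  [5. 7.]]   -> the 4*4 map has become 2*2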


Code (see the inline comments):


# -*- coding: utf-8 -*-
"""
Created on Sat May 18:55:26 2014
@author: rachel
function: convolution followed by max-pooling
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic variable
input = T.tensor4(name='input')

# initial weights
w_shape = (2, 3, 9, 9)  # 2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3 * 9 * 9)
W = theano.shared(numpy.asarray(rng.uniform(low=-1.0 / w_bound, high=1.0 / w_bound, size=w_shape),
                                dtype=input.dtype), name='W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low=-.5, high=.5, size=b_shape),
                                dtype=input.dtype), name='b')

conv_out = conv.conv2d(input, W)

# T.TensorVariable.dimshuffle() can reshape or broadcast (add dimensions)
# dimshuffle(self, *pattern)
# >>> b1 = b.dimshuffle('x', 0, 'x', 'x')
# >>> b1.shape.eval()
# array([1, 2, 1, 1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
f = theano.function([input], output)

# demo
import pylab
from PIL import Image
from matplotlib.pyplot import *

# open an image
img = Image.open('//home//rachel//documents//zju_projects//dl//dataset//rachel.jpg')
width, height = img.size
img = numpy.asarray(img, dtype='float32') / 256.  # (height, width, 3)

# put the image into a 4D tensor of shape (1, 3, height, width)
img_rgb = img.swapaxes(0, 2).swapaxes(1, 2)  # (3, height, width)
minibatch_img = img_rgb.reshape(1, 3, height, width)
filtered_img = f(minibatch_img)

# plot the original image and both convolved results
pylab.figure(1)
pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(img)
title('origin image')
pylab.gray()
pylab.subplot(2, 3, 2); pylab.axis('off')
pylab.imshow(filtered_img[0, 0, :, :])  # 0: minibatch index; 0: 1st filter
title('convolution 1')
pylab.subplot(2, 3, 3); pylab.axis('off')
pylab.imshow(filtered_img[0, 1, :, :])  # 0: minibatch index; 1: 2nd filter
title('convolution 2')
# pylab.show()

# maxpooling
from theano.tensor.signal import downsample

input = T.tensor4('input')
maxpool_shape = (2, 2)
pooled_img = downsample.max_pool_2d(input, maxpool_shape, ignore_border=False)

maxpool = theano.function(inputs=[input], outputs=[pooled_img])
pooled_res = numpy.squeeze(maxpool(filtered_img))
# pylab.figure(2)
pylab.subplot(235); pylab.axis('off'); pylab.imshow(pooled_res[0, :, :])
title('down sampled 1')
pylab.subplot(236); pylab.axis('off'); pylab.imshow(pooled_res[1, :, :])
title('down sampled 2')
pylab.show()





3. CNN Structure

Everyone has probably seen CNN diagrams all over Google already, so here I have dragged out a picture from when I was learning CNNs myself; I think that, together with the annotations, it is fairly easy to understand.

Without further ado, here is the LeNet structure diagram (follow the arrows from the bottom up; the bottom layer is the raw input):
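The diagram itself is an image, so as a rough textual complement here is a sketch of how the tensor shapes flow through such a network. This is my own sketch and assumes the classic LeNet-5 layer sizes (32*32 input, 5*5 filters, 2*2 pooling); the diagram may show a slightly different variant.

# assumption: classic LeNet-5 sizes, alternating 'valid' convolution and 2*2 pooling
def conv_shape(h, w, k):        # a 'valid' convolution with a k*k filter shrinks the map
    return h - k + 1, w - k + 1

def pool_shape(h, w, p=2):      # non-overlapping p*p pooling halves the map (for p = 2)
    return h // p, w // p

h, w = 32, 32                                          # raw input image
h, w = conv_shape(h, w, 5); print('C1:', (6, h, w))    # 6 maps of 28*28
h, w = pool_shape(h, w);    print('S2:', (6, h, w))    # 6 maps of 14*14
h, w = conv_shape(h, w, 5); print('C3:', (16, h, w))   # 16 maps of 10*10
h, w = pool_shape(h, w);    print('S4:', (16, h, w))   # 16 maps of 5*5
print('flatten:', 16 * h * w)                          # 400 values feed the fully connected layers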





4. CNN Code


Please download it from the resources section; I have uploaded it there (in Python).
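In the meantime, here is a minimal sketch of a single convolution + max-pooling layer in Theano. This is my own sketch, written in the spirit of the code above and of the tutorial referenced at the top; it is not the downloadable code itself.

import numpy, theano
import theano.tensor as T
from theano.tensor.nnet import conv
from theano.tensor.signal import downsample

def conv_pool_layer(rng, input, filter_shape, poolsize=(2, 2)):
    # filter_shape: (n_filters, n_input_maps, filter_height, filter_width)
    fan_in = numpy.prod(filter_shape[1:])
    W = theano.shared(numpy.asarray(
        rng.uniform(low=-1.0 / numpy.sqrt(fan_in),
                    high=1.0 / numpy.sqrt(fan_in),
                    size=filter_shape), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros((filter_shape[0],), dtype=theano.config.floatX), name='b')
    conv_out = conv.conv2d(input, W)                                         # convolution
    pooled = downsample.max_pool_2d(conv_out, poolsize, ignore_border=True)  # 2*2 max-pooling
    return T.nnet.sigmoid(pooled + b.dimshuffle('x', 0, 'x', 'x')), [W, b]

x = T.tensor4('x')                      # (batch, channels, height, width)
rng = numpy.random.RandomState(23455)
layer0_out, layer0_params = conv_pool_layer(rng, x, (2, 3, 9, 9))
# stacking several such layers and feeding the last output into a small MLP
# gives the LeNet-style network described in section 3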








Join the discussion, and follow this blog and my Weibo Rachel____zhang; follow-up content will continue to be updated.







