Deep Learning: convolutional neural networks and basic concepts of image recognition

Last Update:2018-07-26 Source: Internet

Author: User


1 The composition of a convolutional neural network

Image classification can be described as follows: given a test image $I \in \mathbb{R}^{W \times H \times C}$ as input, output the category to which the image belongs. Here $W$ is the width of the image, $H$ is the height, and $C$ is the number of channels: $C = 3$ for a color image and $C = 1$ for a grayscale image. The total number of categories is fixed in advance, for example 1000 in the ImageNet contest and 10 in CIFAR10.

A convolutional neural network can be viewed as a black box for this task. The input is the original image $I$, and the output is an $L$-dimensional vector $v \in \mathbb{R}^{L}$, where $L$ is the preset number of categories. Each dimension of $v$ represents how likely the image is to belong to the corresponding category. In a single-label recognition problem, where each image is assigned exactly one of the $L$ labels, the elements of $v$ are compared and the label corresponding to the maximum value is taken as the classification result. $v$ can take the form of a probability distribution, that is, $0 \le v_i \le 1$ for each element and $\sum_i v_i = 1$, where $v_i$ denotes the $i$-th element of $v$. Its elements can also be arbitrary real numbers from negative infinity to positive infinity, where a larger value means a greater likelihood of belonging to the corresponding category.

Internally, a convolutional neural network is composed of many layers. Each layer can be regarded as a function whose input is a signal $x$ and whose output is a signal $y = f(x)$. The output $y$ can in turn serve as the input to other layers. The following surveys the definitions of commonly used layers from the perspective of the front, middle, and end of the network. The front end mainly concerns image processing, the middle consists of various kinds of neurons, and the end mainly concerns the loss functions used to train the network.

2 The front end of the network
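The single-label decision rule described above (compare the elements of the output vector and take the label with the maximum value) can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the softmax function is one common way to turn real-valued scores into a probability distribution and is an assumption here, and the names `softmax`, `predict_label`, and the example scores are hypothetical.

```python
import math

def softmax(scores):
    """Map real-valued scores to values in [0, 1] that sum to 1.

    Softmax is an assumed choice; the article only says the output
    may be a probability distribution or unbounded real numbers.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(v):
    """Return the index of the largest element of v."""
    return max(range(len(v)), key=lambda i: v[i])

# Hypothetical raw network outputs for L = 3 categories.
scores = [2.0, -1.0, 0.5]
probs = softmax(scores)
label = predict_label(probs)
```

Because softmax preserves the ordering of the scores, `predict_label` returns the same index whether it is applied to the raw scores or to the probabilities.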

The front end refers to the processing of image data and can be called the data layer.

2.1 Data cropping

The size of the input images may vary: some have a higher resolution and some a lower one, and the aspect ratios are not necessarily the same. In theory, such inconsistencies can be left unhandled, but this requires the other layers of the network to support such input. In most cases, the images are cropped so that the output has a fixed resolution. During network training, the crop position is selected at random within the original image; the only requirement is that the cropped sub-image falls completely inside the image. The position is chosen at random because this is equivalent to adding extra training data, which can alleviate overfitting.

2.2 Color perturbation
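The random cropping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is represented as a plain list of rows (each row a list of pixels), and the function name `random_crop` and the toy image are hypothetical rather than taken from the article.

```python
import random

def random_crop(image, crop_h, crop_w, rng=random):
    """Cut a crop_h x crop_w window at a random position inside image.

    image is a list of rows; each row is a list of pixels. The window
    is always placed so that it lies completely within the image.
    """
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop window is larger than the image")
    # randint includes both endpoints, so the last valid offset is allowed.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

# Toy 4 x 6 "image" whose pixels record their own (row, column) position.
image = [[(y, x) for x in range(6)] for y in range(4)]
patch = random_crop(image, 3, 3)
```

Calling `random_crop` again on the same image generally yields a different patch, which is what makes the procedure act like extra training data.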

After the original image is cropped, each pixel value lies in the fixed range 0 to 255. Further processing, including subtracting the mean and scaling the pixel values proportionally, brings the values into the range [−1, 1]. In addition to these routine operations, the image can also be normalized, which amounts to a form of image enhancement, as in the CIFAR10 data preprocessing of [9, 18, 17]. For example, for each pixel, randomly select one of the three RGB channels and add a random value from [−20, 20] to the original pixel value.

3 The middle of the network
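The pixel-level preprocessing described above (a random offset on one channel, then mean subtraction and scaling) can be sketched as follows. This is a minimal illustration with several assumptions: a fixed mean and scale of 127.5 stand in for whatever statistics a real pipeline would use, clamping the perturbed value to 0..255 is a choice the article does not specify, and the names `jitter_pixel` and `scale_pixel` are hypothetical.

```python
import random

def jitter_pixel(pixel, rng=random, max_shift=20):
    """Add a random offset in [-max_shift, max_shift] to one random RGB channel."""
    channel = rng.randrange(3)
    shift = rng.randint(-max_shift, max_shift)
    out = list(pixel)
    # Clamping to 0..255 is an assumption; the article does not say
    # how values that leave the valid range are handled.
    out[channel] = min(255, max(0, out[channel] + shift))
    return tuple(out)

def scale_pixel(pixel, mean=127.5, scale=127.5):
    """Subtract a mean and rescale so that 0..255 maps to -1..1.

    The fixed mean and scale of 127.5 are assumed values for illustration.
    """
    return tuple((c - mean) / scale for c in pixel)

pixel = (0, 128, 255)           # one RGB pixel after cropping
jittered = jitter_pixel(pixel)  # perturb a single channel
scaled = scale_pixel(jittered)  # bring values into [-1, 1]
```

Applying the perturbation before scaling keeps the offset range of [−20, 20] expressed in the original 0 to 255 units used in the text.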

The following defines the layers commonly used in convolutional neural networks: the dimensions of the input data $x$, the dimensions of the output $y$, and how the output is obtained from the input.

3.1 Basic components of convolutional neural networks

The following figure:

3.2 Convolution layer

The convolution layer input is represented as xϵrwx

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
