Deep Learning Model: CNN Convolutional Neural Network (I) In-Depth Analysis of CNN


http://m.blog.csdn.net/blog/wu010555688/24487301

This article compiles material from several expert blogs online and explains CNN's basic structure and core ideas in detail. Comments and exchanges are welcome.

[1] Deep Learning Introduction

[2] Deep Learning Training Process

[3] Deep Learning Model: the derivation and implementation of the CNN convolutional neural network

[4] Deep Learning Model: the reverse derivation and practice of CNN

[5] Deep Learning Model: CNN Convolutional Neural Network (I) In-Depth Analysis of CNN

[6] Deep Learning Model: CNN Convolutional Neural Network (II) Word Recognition System LeNet-5

[7] Deep Learning Model: CNN Convolutional Neural Network (III) Summary of CNN FAQs

1. Overview

A convolutional neural network is a special kind of deep neural network model. Its particularity lies in two aspects: the connections between its neurons are not fully connected, and the weights of the connections between some neurons in the same layer are shared (i.e., identical). This non-fully-connected, weight-sharing structure makes it more similar to a biological neural network, reducing the complexity of the network model (which matters for deep structures that are otherwise hard to train) and reducing the number of weights.

Think back to the BP neural network. In a BP network, the nodes of each layer form a linear one-dimensional arrangement, and the nodes of adjacent layers are fully connected. Now suppose the connections between adjacent layers of a BP network are no longer full but local: this is the simplest one-dimensional convolutional network. Extending this idea to two dimensions gives the convolutional neural network we see in most of the literature. See the figure:

Left: a fully connected network. If we have a 1000x1000-pixel image and 1 million hidden-layer neurons, and each hidden neuron is connected to every pixel of the image, there are 1000x1000x1,000,000 = 10^12 connections, that is, 10^12 weight parameters.

Right: a locally connected network. Each node is connected only to a 10x10 window at the corresponding location in the layer above, so 1 million hidden-layer neurons need only 1,000,000 x 100 = 10^8 parameters. The number of weight connections is reduced by four orders of magnitude.

Following the forward signal propagation of a BP network, we can easily compute the output of a network node. For example, the net input of the node marked red in the figure equals the sum, over all red connections, of the previous-layer neuron value multiplied by the weight on that connection. Many books call this computation convolution.

In fact, in digital filtering the filter coefficients are usually symmetric. Otherwise, a true convolution requires flipping the kernel before the multiply-accumulate. Do the neural-network weights above satisfy such symmetry? In general, no. So calling the above operation "convolution" is, strictly speaking, imprecise. It does not really matter, since it is only a name, but it has caused some misunderstanding among people from a signal-processing background when they first encounter convolutional neural networks.
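
To make the distinction concrete, here is a minimal sketch (mine, not from the original post) of the sum-of-products computed at one output node. The names crossCorrelate and convolve are illustrative; the only difference between the two operations is whether the kernel is flipped before the multiply-accumulate:

    #include <vector>
    using Image = std::vector<std::vector<double>>;

    // Sum of products at output position (y, x): what "convolutional" networks
    // actually compute. Each output node multiplies a window of the input by
    // the kernel weights and accumulates.
    double crossCorrelate(const Image& in, const Image& k, int y, int x) {
        double sum = 0.0;
        int kh = (int)k.size(), kw = (int)k[0].size();
        for (int i = 0; i < kh; ++i)
            for (int j = 0; j < kw; ++j)
                sum += in[y + i][x + j] * k[i][j];
        return sum;
    }

    // True convolution flips the kernel in both dimensions first.
    // If the kernel is symmetric, the two operations coincide.
    double convolve(const Image& in, const Image& k, int y, int x) {
        double sum = 0.0;
        int kh = (int)k.size(), kw = (int)k[0].size();
        for (int i = 0; i < kh; ++i)
            for (int j = 0; j < kw; ++j)
                sum += in[y + i][x + j] * k[kh - 1 - i][kw - 1 - j];
        return sum;
    }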

Another feature of convolutional neural networks is weight sharing. For example, in the right-hand figure the weights are shared, meaning that all connections marked with red lines carry the same weight. This point is easy for beginners to misunderstand.

What is described above is only a single-layer structure. Yann LeCun and colleagues at the former AT&T Shannon Lab built LeNet-5, a character recognition system based on convolutional neural networks; in the 1990s it was used to recognize handwritten digits for banks.

2. The structure of CNN

A convolutional network is a multilayer perceptron specially designed to recognize two-dimensional shapes; it is highly invariant to translation, scaling, tilting, and other forms of deformation. These good properties are learned by the network in supervised mode. The structure of the network has two features, sparse connectivity and weight sharing, and includes the following forms of constraint:
1. Feature extraction. Each neuron receives synaptic inputs from a local receptive field in the previous layer, forcing it to extract local features. Once a feature has been extracted, its exact position becomes less important, as long as its position relative to other features is approximately preserved.
2. Feature mapping. Each computational layer of the network is composed of multiple feature maps, each of which is a plane. The neurons in a plane are constrained to share the same set of synaptic weights, which has two beneficial effects: a) translation invariance, and b) a reduction in the number of free parameters (achieved through weight sharing).
3. Sub-sampling. Each convolutional layer is followed by a computational layer that performs local averaging and sub-sampling, reducing the resolution of the feature maps. This reduces the sensitivity of the feature-map outputs to translation and other forms of deformation.

A convolutional neural network is a multilayer neural network; each layer is composed of several two-dimensional planes, and each plane is composed of several independent neurons.

Figure: conceptual demonstration of a convolutional neural network. The input image is convolved with three trainable filters, with biases added; the convolution produces three feature maps in layer C1. Then, in each feature map, each group of four pixels is summed, weighted, and biased, and passed through a sigmoid function to obtain the three feature maps of layer S2. These maps are filtered again to produce layer C3, and another sub-sampling stage produces S4 in the same way as S2. Finally, the pixel values are rasterized into a vector and fed into a traditional neural network, which produces the output.

In general, a C layer is a feature-extraction layer: each neuron's input is connected to a local receptive field in the previous layer and extracts local features; once a local feature has been extracted, its positional relation to other features is also fixed. An S layer is a feature-mapping layer: each computational layer of the network is composed of multiple feature maps, each map a plane in which all neurons have equal weights. The feature-mapping structure uses a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, which gives the feature maps shift invariance.

In addition, because the neurons on one mapping plane share weights, the number of free parameters is reduced, lowering the complexity of network parameter selection. Each feature-extraction layer (C layer) in the convolutional neural network is followed by a computational layer (S layer) that performs local averaging and secondary extraction; this unique two-stage feature-extraction structure gives the network high tolerance to distortion in the input samples during recognition.

2.1 Sparse Connectivity

Convolutional networks exploit the spatially local structure of images by enforcing a local connection pattern between adjacent layers: a hidden unit in layer m is connected only to a local region of the input units in layer m-1, and these local regions of layer m-1 are called spatially contiguous receptive fields. We can describe this structure as follows:
layer m-1 is the retinal input layer; the receptive-field width of layer m is 3, i.e., each unit of layer m connects to only 3 adjacent neurons of the input layer; layer m and layer m+1 follow similar connection rules, as shown in the figure.

Notice that a unit in layer m+1 has a receptive field of width 3 with respect to layer m, but of width 5 with respect to the input layer. This structure constrains the learned filters (the patterns to which a unit responds most strongly) to local spatial patterns, since each unit does not respond to variation outside its receptive field. It can also be seen that stacking several such layers makes the filters (which are no longer linear) progressively more global, i.e., covering a larger region of the visual field. For example, a unit in layer m+1 can encode a nonlinear feature of an input region of width 5.
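
As a quick check of this growing-receptive-field claim, the effective width can be computed with a one-line recurrence. A minimal sketch (my own illustration, assuming stride-1 layers with no pooling): each layer with kernel width k grows the receptive field by k - 1.

    #include <cstdio>

    int main() {
        // Receptive field with respect to the input, assuming stride-1 layers.
        // With kernel width 3 per layer: layer m sees 3 input units,
        // layer m+1 sees 3 + (3 - 1) = 5, matching the text above.
        int rf = 1;
        const int kernelWidths[] = {3, 3};   // layers m and m+1
        for (int k : kernelWidths) {
            rf += k - 1;
            std::printf("receptive field w.r.t. input: %d\n", rf);
        }
        return 0;
    }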

2.2 Weight Sharing (Shared Weights)

In a convolutional network, each sparse filter h_i is replicated across the entire visual field through shared weights, and the units sharing these weights form a feature map, as shown in the figure.

In the figure, the 3 hidden units belong to the same feature map, and links of the same color have the same weight. We can still learn these weights by gradient descent, with only a small change to the original algorithm: the gradient of a shared weight is the sum of the gradients of all the parameters being shared. Why share weights? On the one hand, a replicated unit can recognize a feature regardless of its position in the visual field. On the other hand, weight sharing lets us extract features more efficiently, because it greatly reduces the number of free parameters to be learned. By controlling the scale of the model in this way, convolutional networks can generalize well on vision problems.
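
The "small change" to gradient descent mentioned above is just accumulation: the gradient of a shared weight is the sum of the gradients at every position where the weight is used. A minimal 1-D sketch (illustrative names, not from the original post):

    #include <vector>

    // Gradient of a shared 1-D kernel: accumulate over every output position.
    // 'input' is one feature map; 'outputGrad' is its error signal.
    std::vector<double> sharedKernelGrad(const std::vector<double>& input,
                                         const std::vector<double>& outputGrad,
                                         int kernelSize) {
        std::vector<double> kGrad(kernelSize, 0.0);
        for (size_t pos = 0; pos < outputGrad.size(); ++pos)   // each use of the kernel
            for (int j = 0; j < kernelSize; ++j)               // each shared weight
                kGrad[j] += outputGrad[pos] * input[pos + j];  // sum, not overwrite
        return kGrad;
    }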

For example:

This is exactly one of CNN's attractions: through local receptive fields and weight sharing, it reduces the number of parameters the neural network has to train. What does that mean, concretely?

Left: with a 1000x1000-pixel image and 1 million hidden neurons, full connection (each hidden neuron connected to every pixel) gives 1000x1000x1,000,000 = 10^12 connections, that is, 10^12 weight parameters. But spatial correlation in images is local: just as a person perceives the outside world through local receptive fields, each neuron need not see the whole image, only a local region; at higher levels, the information from neurons with different local views can be combined to obtain global information. This reduces the number of connections, i.e., the number of weights the network must train. Right: if the local receptive field is 10x10, each hidden neuron connects only to a 10x10 image patch, so 1 million hidden neurons need only 10^8 connections, that is, 10^8 parameters, four orders of magnitude fewer than before. Training is far less laborious, but that still feels like a lot. Can we do better?

We know that each hidden-layer neuron connects to a 10x10 image region, so each neuron has 10x10 = 100 connection weights. What if the 100 parameters of every neuron were the same? That is, every neuron applies the same convolution kernel to its patch of the image. How many parameters do we have then? Only 100! No matter how many neurons the hidden layer contains, the connection between the two layers needs only 100 parameters. This is weight sharing, and it is the main selling point of convolutional neural networks. You may ask: is this reliable? Why does it work? Let us find out together.

Well, you might think: extracting features this way seems limited, since you have extracted only one feature. Exactly right: we need to extract many features. If one filter, i.e., one convolution kernel, extracts one kind of image feature, such as an edge in a certain orientation, then to extract different features we simply add more filters. Suppose we add 100 filters, each with different parameters, each representing a different feature of the input image, such as edges of different orientations. Convolving each filter with the image yields a different projection of the image, which we call a feature map. So 100 convolution kernels produce 100 feature maps, and these 100 feature maps form one layer of neurons. Now it is clear: how many parameters does this layer have? 100 kernels x 100 shared parameters per kernel = 10,000 parameters. Only 10,000! See the right-hand figure: different colors denote different filters.

One more question. We said that the number of parameters in the hidden layer is independent of the number of neurons in it; it depends only on the filter size and the number of filters. Then what determines the number of neurons in the hidden layer? It depends on the input size (the number of input neurons), the filter size, and the filter's sliding stride over the image. For example, with a 1000x1000 image and a 10x10 filter, if the filter windows do not overlap, i.e., the stride is 10, the hidden layer has (1000x1000)/(10x10) = 100x100 neurons. With a stride of 8, adjacent windows overlap by two pixels, and the count changes accordingly; the idea is what matters. Note that this is the neuron count for a single filter, i.e., one feature map; with 100 feature maps it is 100 times larger. Thus the larger the image, the wider the gap between the number of neurons and the number of weights to be trained.

It is important to note that the discussion above ignores each neuron's bias. With the bias included, each filter has one more parameter, which is likewise shared across its feature map.
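
The arithmetic in this section fits in a few lines. A minimal sketch (my own illustration) that reproduces the numbers above, with the shared bias included:

    #include <cstdio>

    int main() {
        const long img = 1000, k = 10, stride = 10, nFilters = 100;

        // Neurons per feature map: one output per stride step in each dimension.
        long side = (img - k) / stride + 1;               // 100 for stride 10
        long neuronsPerMap = side * side;                 // 100x100 = 10,000

        // Parameters: each filter shares k*k weights + 1 bias across its map,
        // so the total is 100 x 101 = 10,100 (10,000 if the bias is ignored).
        long paramsPerFilter = k * k + 1;                 // 101
        long totalParams = nFilters * paramsPerFilter;    // 10,100

        std::printf("neurons per feature map: %ld\n", neuronsPerMap);
        std::printf("total shared parameters: %ld\n", totalParams);
        return 0;
    }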

In short, the core idea of convolutional networks is to combine three structural ideas, local receptive fields, weight sharing (weight replication), and temporal or spatial sub-sampling, to obtain some degree of invariance to shift, scale, and deformation.

2.3 The Full Model

A convolutional neural network is a multilayer neural network; each layer is composed of several two-dimensional planes, and each plane is composed of several independent neurons. The network contains simple units and complex units, denoted S-elements and C-elements respectively. S-elements aggregate into S-planes, and S-planes aggregate into an S-layer, denoted Us; the C-elements, C-planes, and C-layer (Uc) are related analogously. Every intermediate stage of the network consists of an S-layer and a C-layer, while the input stage contains only one layer, which directly receives the two-dimensional visual pattern. The feature-extraction steps for a sample are thus embedded in the interconnection structure of the convolutional network model.

In general, Us is a feature-extraction layer: each neuron's input is connected to a local receptive field in the previous layer and extracts local features; once a local feature has been extracted, its positional relation to other features is also fixed.

Uc is a feature-mapping layer: each computational layer of the network is composed of multiple feature maps, each map a plane in which all neurons have equal weights. The feature-mapping structure uses a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, which gives the feature maps shift invariance. In addition, because the neurons on one mapping plane share weights, the number of free parameters is reduced, lowering the complexity of network parameter selection. Each feature-extraction layer (S-layer) in the convolutional neural network is followed by a computational layer (C-layer) that performs local averaging and secondary extraction; this unique two-stage feature-extraction structure gives the network high tolerance to distortion in the input samples during recognition.

The figure shows an example of a convolutional network, described in detail in the post "Deep Learning Model: CNN Convolutional Neural Network (II) Word Recognition System LeNet-5":



The convolutional network works as follows. The input layer consists of 32x32 sensing nodes that receive the original image. Computation then alternates between convolution and sub-sampling, as described below:

The first hidden layer performs convolution. It consists of 8 feature maps, each composed of 28x28 neurons, each neuron having a 5x5 receptive field;

The second hidden layer performs sub-sampling and local averaging. It also consists of 8 feature maps, but each is composed of 14x14 neurons. Each neuron has a 2x2 receptive field, a trainable coefficient, a trainable bias, and a sigmoid activation function; the trainable coefficient and bias control the operating point of the neuron.

The third hidden layer performs a second convolution. It consists of 20 feature maps, each composed of 10x10 neurons. Each neuron in this hidden layer may have synaptic connections to several feature maps of the previous hidden layer; otherwise it operates in the same way as the first convolutional layer.

The fourth hidden layer performs a second sub-sampling and local averaging. It consists of 20 feature maps, but each is composed of 5x5 neurons; it operates in the same way as the first sub-sampling layer.

The fifth hidden layer performs the final stage of convolution. It consists of 120 neurons, each with a 5x5 receptive field.

Finally, there is a fully connected layer, which produces the output vector.

As computation alternates between convolution and sub-sampling across successive layers, we get a "bipyramidal" effect: at each convolutional or sub-sampling layer, the number of feature maps increases while the spatial resolution decreases, relative to the corresponding previous layer. The idea of sub-sampling after convolution is inspired by the "simple" cells followed by "complex" cells in the animal visual system.

The multilayer perceptron shown in the figure contains roughly 100,000 synaptic connections but only about 2,600 free parameters (each feature map is a plane, and all neurons in the plane share equal weights). This significant reduction in free parameters is obtained through weight sharing: the capacity of the learning machine (measured by VC dimension) decreases, which improves its generalization ability. The adjustment of the free parameters is realized by a stochastic form of back-propagation learning. Another notable feature is that weight sharing makes a parallel implementation of the convolutional network possible, another advantage of convolutional networks over fully connected multilayer perceptrons.

3. CNN Training

The mainstream of neural networks for pattern recognition is supervised learning; unsupervised networks are used more for cluster analysis. For supervised pattern recognition, the class of every sample is known, so the partition of sample space is no longer driven by the samples' natural distribution, but by the within-class spatial distribution and the between-class separation: we seek an appropriate partition of the space, or a classification boundary, so that samples of different classes fall in different regions. This requires a long, complex learning process that continually adjusts the classification boundary so that as few samples as possible fall in regions of the wrong class.

A convolutional network is essentially an input-to-output mapping. It can learn a large number of input-output relationships without any precise mathematical expression between input and output; as long as the network is trained on known patterns, it acquires the mapping between input-output pairs. Convolutional networks are trained with a teacher, so the sample set consists of vector pairs of the form (input vector, ideal output vector). All such vectors should be actual "running" results of the system the network is to emulate; they can be collected from the running system. Before training begins, all weights should be initialized with distinct small random numbers. "Small" ensures the network does not saturate because of oversized weights, which would cause training to fail; "distinct" ensures the network can learn at all. In fact, if the weight matrix is initialized with identical values, the network is incapable of learning.
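
A minimal sketch of the "distinct small random numbers" rule (my own illustration; the scale 0.05 is an arbitrary assumption, not a value from the original post):

    #include <random>
    #include <vector>

    // Fill a weight matrix with distinct small random values, e.g. U(-0.05, 0.05).
    // Identical initial weights would make all updates to shared units identical,
    // so the network could never break symmetry and learn.
    void initWeights(std::vector<double>& w, double scale = 0.05) {
        std::mt19937 rng(std::random_device{}());
        std::uniform_real_distribution<double> dist(-scale, scale);
        for (double& x : w) x = dist(rng);
    }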

The training algorithm is similar to the traditional BP algorithm. It consists of 4 steps, divided into two phases:

The first stage, the forward propagation phase:

a) Take a sample (Xp, Yp) from the sample set and feed Xp into the network;

b) Compute the corresponding actual output Op.

In this stage, information is transformed step by step from the input layer to the output layer. This is also the process the network executes during normal operation after training is complete. In this process the network computes (in effect, the input is multiplied by the weight matrix of each layer in turn to produce the final output):

Op = Fn( ... ( F2( F1( Xp W(1) ) W(2) ) ... ) W(n) )

The second stage, the backward propagation phase:

a) Compute the difference between the actual output Op and the corresponding ideal output Yp;

b) Back-propagate and adjust the weight matrices so as to minimize the error.
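
Putting the two phases together: the sketch below shows one training step for a single sigmoid neuron, a miniature of the full procedure (my own illustration; a real convolutional network repeats phase 2 layer by layer):

    #include <cmath>
    #include <vector>

    double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

    // One supervised training step: forward pass, then weight adjustment
    // in the direction that reduces the squared error E = (Op - Yp)^2 / 2.
    void trainStep(std::vector<double>& w, double& bias,
                   const std::vector<double>& x, double y, double lr) {
        // Phase 1: forward propagation, compute the actual output Op.
        double z = bias;
        for (size_t i = 0; i < w.size(); ++i) z += w[i] * x[i];
        double op = sigmoid(z);

        // Phase 2: backward propagation, difference from the ideal output Yp.
        double delta = (op - y) * op * (1.0 - op);   // dE/dz through the sigmoid
        for (size_t i = 0; i < w.size(); ++i) w[i] -= lr * delta * x[i];
        bias -= lr * delta;
    }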

4. CNN Learning

In general, a convolutional network can be simplified to the model shown in the figure:

Here, input-to-C1, S4-to-C5, and C5-to-output are full connections, and C1-to-S2 and C3-to-S4 are one-to-one connections. S2-to-C3 deliberately omits part of the connections in order to break the network's symmetry, which makes the feature maps more diverse. Note that the C5 convolution kernel has the same size as the output of S4, so each C5 map is 1x1 (since 5 - 5 + 1 = 1) and the output of C5 is a one-dimensional vector.

4.1 Learning in the Convolutional Layer

The typical structure of the convolution layer is as follows:


The feedforward operation of the convolutional layer is implemented as follows:

output of the convolutional layer = Sigmoid( Sum(convolutions) + bias )

Both the convolution kernels and the biases are trainable. Here is the core code:

    ConvolutionLayer::fprop(input, output) {
        // Get the number of convolution kernels
        int n = kernel.GetDim(0);
        for (int i = 0; i < n; i++) {
            // The i-th kernel connects feature map a of the input layer to
            // feature map b of the output layer; it can be viewed as a link
            // from input map a to output map b
            int a = table[i][0], b = table[i][1];
            // Convolve the i-th kernel with input feature map a
            convolution = Conv(input[a], kernel[i]);
            // Accumulate the convolution result into output map b
            sum[b] += convolution;
        }
        for (i = 0; i < (int)bias.size(); i++) {
            // Add the bias
            sum[i] += bias[i];
        }
        // Apply the sigmoid function
        output = Sigmoid(sum);
    }

Here input is an n1 x n2 x n3 matrix: n1 is the number of input-layer feature maps, n2 their width, and n3 their height. output, sum, convolution, and bias are matrices of size n1 x (n2-kw+1) x (n3-kh+1), where kw and kh are the width and height of the convolution kernel (5x5 in the figure). kernel is the matrix of convolution kernels. table is the connection table: if there is a connection from input map a to output map b, table contains the pair [a, b], and each connection corresponds to one kernel.
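
To make the connection table concrete, here is a small illustrative example (mine, not from the post): two input maps feeding three output maps, with the kind of partial connectivity used between S2 and C3 to break symmetry.

    // Illustrative connection table: pairs of {input map a, output map b}.
    // Output map 2 sees both input maps; maps 0 and 1 each see only one.
    int table[][2] = {
        {0, 0},   // kernel 0: input map 0 -> output map 0
        {1, 1},   // kernel 1: input map 1 -> output map 1
        {0, 2},   // kernel 2: input map 0 -> output map 2
        {1, 2},   // kernel 3: input map 1 -> output map 2 (summed into map 2)
    };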

The core code of the convolutional layer's backward pass is as follows:

    ConvolutionLayer::bprop(input, output, in_dx, out_dx) {
        // Pass the gradient back through the sigmoid
        sum_dx = DSigmoid(out_dx);
        // Compute the gradient for the biases
        for (i = 0; i < bias.size(); i++) {
            bias_dx[i] = sum_dx[i];
        }
        // Get the number of convolution kernels
        int n = kernel.GetDim(0);
        for (int i = 0; i < n; i++) {
            int a = table[i][0], b = table[i][1];
            // Reverse-convolve output map b with the i-th kernel (each point of
            // the output is distributed back through the kernel template) and
            // accumulate the result into the gradient of input map a
            input_dx[a] += DConv(sum_dx[b], kernel[i]);
            // Compute the gradient of the kernel template in the same way
            kernel_dx[i] += DConv(sum_dx[b], input[a]);
        }
    }

in_dx and out_dx have the same structure as input and output, and hold the gradient at each corresponding point.

4.2 Learning in the Sub-sampling Layer

The typical structure of the sub-sampling layer is as follows:

The output of the sub-sampling layer is calculated as:

output = Sigmoid( sample * weight + bias )

The core code is as follows:

    SubsamplingLayer::fprop(input, output) {
        int n1 = input.GetDim(0);
        int n2 = input.GetDim(1);
        int n3 = input.GetDim(2);
        for (int i = 0; i < n1; i++) {
            for (int j = 0; j < n2; j++) {
                for (int k = 0; k < n3; k++) {
                    // coeff is the trainable weight; sw, sh are the dimensions
                    // of the sampling window
                    sub[i][j/sw][k/sh] += input[i][j][k] * coeff[i];
                }
            }
        }
        for (i = 0; i < n1; i++) {
            // Add the bias
            sum[i] = sub[i] + bias[i];
        }
        output = Sigmoid(sum);
    }

The core code of the backward pass of the sub-sampling layer is as follows:

    SubsamplingLayer::bprop(input, output, in_dx, out_dx) {
        // Pass the gradient back through the sigmoid
        sum_dx = DSigmoid(out_dx);
        // Compute the gradients for bias and coeff
        for (i = 0; i < n1; i++) {
            coeff_dx[i] = 0;
            bias_dx[i] = 0;
            for (j = 0; j < n2/sw; j++)
                for (k = 0; k < n3/sh; k++) {
                    coeff_dx[i] += sub[i][j][k] * sum_dx[i][j][k];
                    bias_dx[i] += sum_dx[i][j][k];
                }
        }
        for (i = 0; i < n1; i++) {
            for (j = 0; j < n2; j++)
                for (k = 0; k < n3; k++) {
                    in_dx[i][j][k] = coeff[i] * sum_dx[i][j/sw][k/sh];
                }
        }
    }

5. The advantages of CNN

Convolutional neural networks are mainly used to recognize two-dimensional patterns that are invariant to shift, scaling, and other forms of distortion. Because CNN's feature-detection layers learn from training data, explicit feature extraction is avoided: features are learned implicitly from the data. Moreover, because the neurons in the same feature map share weights, the network can learn in parallel, a major advantage of convolutional networks over networks in which neurons are fully interconnected. With its special structure of locally shared weights, the convolutional neural network has unique advantages in speech recognition and image processing; its layout is closer to a real biological neural network; weight sharing reduces the complexity of the network; and, in particular, images, which are multi-dimensional input vectors, can be fed directly into the network, avoiding the complexity of data reconstruction during feature extraction and classification.

Mainstream classifiers are almost always based on statistical features, meaning that certain features must be extracted before discrimination. However, explicit feature extraction is not easy and is not always reliable in some applications. Convolutional neural networks avoid explicit feature sampling and learn implicitly from the training data. This distinguishes them clearly from other neural-network classifiers: through structural reorganization and weight reduction, the feature-extraction function is folded into the multilayer perceptron. They can directly handle grayscale images and can be used directly for image-based classification.

Compared with general neural networks, convolutional networks have the following advantages in image processing: a) the topology of the input image matches the network structure well; b) feature extraction and pattern classification proceed simultaneously during training; c) weight sharing reduces the number of training parameters, making the network structure simpler and more adaptable.

6. The implementation of CNN

The close relationship between the layers of a CNN and spatial information makes CNNs well suited to image processing and understanding, and they perform well at automatically extracting salient image features. In some cases, Gabor filters have been used as an initial pre-processing step to simulate the response of the human visual system to visual stimuli. In most current work, researchers have applied CNNs to a variety of machine-learning problems, including face recognition, document analysis, and language detection. To find coherence between successive frames in video, CNNs are currently trained with temporal coherence, though this is not specific to CNNs.

Because the convolutional neural network uses the same algorithm as the BP network, it can be implemented on top of an existing BP network. The open-source neural-network library FANN can be exploited; this open-source implementation uses a number of code-optimization techniques and comes in double-precision, single-precision, and fixed-point versions.

The classical BP network arranges its nodes in one dimension, while the convolutional neural network has a two-dimensional structure. So we first map each layer of the convolutional network into a one-dimensional node arrangement according to a certain order and rule, create a multilayer back-propagation network with that structure, and then learn the network parameters with the usual BP training algorithm. For predicting new samples in a real environment, the same forward-pass algorithm as in BP is used. Further details can be found in an open-source project on the web: http://www.codeproject.com/Articles/16650/Neural-network-for-recognition-of-handwritten-digi. Note: this code has an obvious bug when creating the CNN; if you have seen the structure of the simplified LeNet-5 discussed here, you will find the problem at a glance.

7. Some Puzzles

This article explains the CNN construction process in some detail, but there are a few points I do not understand:

<1> What is the size of the local sliding window?
<2> How is the position of the sliding window decided?
<3> What does the position of the sliding window determine? The size of the C layer?
<4> "Based on the same understanding as for the C1 layer, it is easy to get the size of the C3 layer as 10x10." I don't understand how this is obtained.
<5> "The C3 layer becomes 16 10x10 networks!" How did it become 16?
<6> In the third part, on the simplified LeNet-5 system, I completely fail to follow the calculation of the nodes and weights.

I hope friends more proficient in CNN can offer some explanation. Thank you!

References:

Convolutional Neural Network (CNN)

GitHub: Convolutional Neural Network Code (MATLAB)

CNN Code Understanding (MATLAB)

http://blog.csdn.net/nan355655600/article/details/17690029

http://blog.csdn.net/zouxy09/article/details/8782018
