Research progress of "Neural Networks and Deep Learning": Generative Adversarial Networks (GAN), Part 5 -- Deep Convolutional Generative Adversarial Networks (DCGAN)


Preface
     This article first introduces generative models, then focuses on the research progress of generative adversarial networks (GANs). Drawing on the main GAN papers, applied GAN papers, and related papers, the author reviewed 45 papers from the past two years, focusing on the connections and differences among the main papers and tracing the research lineage of generative adversarial networks.
The papers covered in this article are:
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets // Advances in Neural Information Processing Systems. 2014: 2672-2680.
Denton E L, Chintala S, Fergus R. Deep generative image models using a Laplacian pyramid of adversarial networks // Advances in Neural Information Processing Systems. 2015: 1486-1494.
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.

5. Deep Convolutional Generative Adversarial Networks (DCGAN)
5.1 DCGAN Ideas

DCGAN [1] may not look like a great innovation, but its open-source code is now among the most used and most frequently referenced. This owes much to the more robust engineering experience it shares, building on the work of LAPGAN [2]. DCGAN (Deep Convolutional Generative Adversarial Networks) [1] points out many important architectural designs for this unstable GAN learning approach, along with concrete CNN-specific experience. The key points:
For example, they suggest that since strided convolutional networks, proposed earlier, can in theory achieve the same function and effect as CNNs with pooling, using an all-strided-convolution network as a fully differentiable generator G makes the GAN more controllable and stable.
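To see why a stride-2 convolution can stand in for a conv + pooling pair, it helps to check the output-size arithmetic. A minimal sketch (the kernel size 4, stride 2, padding 1 setting matches the one commonly used in DCGAN implementations; the function names are my own):

```python
def conv_out(n, k, s, p):
    """Spatial size after a standard convolution."""
    return (n + 2 * p - k) // s + 1

def tconv_out(n, k, s, p):
    """Spatial size after a fractionally strided (transposed) convolution."""
    return (n - 1) * s - 2 * p + k

# A 4x4 kernel with stride 2 and padding 1 halves the spatial size,
# performing the downsampling a 2x2 pooling layer would otherwise do:
assert conv_out(64, k=4, s=2, p=1) == 32

# The transposed version doubles it, which is how G learns to upsample:
assert tconv_out(4, k=4, s=2, p=1) == 8
```

Because the downsampling is now done by learned filters rather than a fixed max/average rule, the whole generator stays differentiable end to end.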
Another example: Facebook's LAPGAN found that using batch normalization (BN) on D in a GAN can lead to a collapse of learning, yet DCGAN successfully applies BN on both G and D. These engineering breakthroughs are undoubtedly an important reason why so many people choose DCGAN as the base for their own work.
On the other hand, they also contribute much to visualizing generative models. For example, following the latent-space interpolation approach of the ICLR 2016 paper "Generating Sentences from a Continuous Space", they interpolate between hidden states and display the generated images, showing the gradual evolution of the image.
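The interpolation experiment can be sketched in a few lines: walk linearly between two latent codes z1 and z2 and generate an image at each step. The vectors below are hypothetical stand-ins; in the paper each interpolated z would be fed to the trained generator.

```python
def interpolate(z1, z2, steps=9):
    """Linear interpolation between two latent vectors."""
    return [[(1 - t) * a + t * b for a, b in zip(z1, z2)]
            for t in (i / (steps - 1) for i in range(steps))]

z1 = [0.0] * 100  # two hypothetical 100-d latent codes
z2 = [1.0] * 100
path = interpolate(z1, z2)
assert path[0] == z1 and path[-1] == z2  # endpoints are recovered exactly
# Feeding each point of `path` to G yields a gradual image morph.
```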
They also applied vector arithmetic to images, obtaining results such as the following:
5.2 Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
5.2.1 Introduction

Compared with supervised learning, CNN progress in unsupervised learning has been slow. Building on CNN's success in supervised learning, this paper proposes a class of "deep convolutional generative adversarial networks (DCGANs)", which use a generator and a discriminator to learn a hierarchy of representations, from object parts to whole scenes. Finally, the learned features are reused on new tasks, demonstrating that they are usable as general image representations.
Unsupervised learning of representations that can then serve supervised learning.

Build representations with a GAN, then reuse parts of the generator and discriminator as feature extractors for supervised tasks.

GANs are an attractive alternative to maximum-likelihood methods.

For representation learning, the lack of any need for a heuristic loss function is also attractive.

    GANs have a known problem: training is unstable, often causing the generator to produce meaningless output. Research on understanding and visualizing what GANs learn, and on the intermediate representations of multi-layer GANs, is still limited.
     The main contributions of this paper: we propose and evaluate a set of constraints on the architectural topology of convolutional GANs that make them more stable to train; we name this class of architectures deep convolutional GANs (DCGAN). We use the trained discriminators for image classification, obtaining results competitive with other unsupervised methods. We visualize the convolutional filters of the generator, and show that the generator has interesting vector arithmetic properties.

5.2.2 Related Work

Representation learning from unlabeled data
Unsupervised representation learning is a fairly well-studied problem in computer vision.
Classical approach: clustering analysis (e.g., K-means), then using the clusters to improve classification performance.
In the context of images, one can do hierarchical clustering of image patches (Coates & Ng) to learn powerful image representations. One can train auto-encoders (convolutional, stacked (Vincent et al.), separating the what and where components of the code (Zhao et al.), ladder structures (Rasmus et al.)) that encode an image into a compact code and decode the code to reconstruct the image as accurately as possible. Deep belief networks (Lee et al.) have also been shown to work well in learning hierarchical representations.

Generating natural images
Parametric and non-parametric generative models.
Non-parametric models
Non-parametric models often match against a database of existing images, often matching patches of images.
Parametric models
A variational sampling approach to generating images (Kingma & Welling, 2013).
Another approach generates images using an iterative forward diffusion process (Sohl-Dickstein et al., 2015).
Generative adversarial networks (Goodfellow et al.) generated images that suffered from being noisy and incomprehensible.
A Laplacian pyramid extension to this approach (Denton et al.) showed higher-quality images, but they still suffered from objects looking wobbly because of the noise introduced by chaining multiple models.
A recurrent network approach (Gregor et al.) and a deconvolution network approach (Dosovitskiy et al.) have also recently had some success in generating natural images, but neither has leveraged its generator for supervised tasks.

5.2.3 DCGAN Network Model

Historically, attempts to extend GANs using CNNs have not been very successful. (What does "extension" mean here? Both the original GAN and LAPGAN already use convolutional networks as generator/discriminator models.)
This drove the LAPGAN [2] authors to develop an alternative approach: iteratively upscaling low-resolution generated images.
Attempts to scale GANs using the CNN architectures commonly found in the supervised literature ran into difficulties. Eventually the authors found a class of architectures that train stably across a variety of datasets and produce higher-resolution images: deep convolutional GANs (DCGAN).
The core of the approach is adopting and modifying three recently demonstrated changes to CNN architectures:

All convolutional net (Springenberg et al., 2014). Discriminator: spatial pooling is replaced by strided convolutions, allowing the network to learn its own spatial downsampling. Generator: fractionally strided convolutions allow it to learn its own spatial upsampling.

Eliminate fully connected layers on top of convolutional features, e.g., global average pooling, which has been utilized in state-of-the-art image classification models (Mordvintsev et al.). Global average pooling contributes to model stability but hurts convergence speed.
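Global average pooling simply collapses each feature map to its mean, so no fully connected layer is needed on top. A minimal sketch (pure Python, list-of-lists feature maps; not tied to any particular framework):

```python
def global_average_pool(feature_maps):
    """Collapse each HxW feature map to its mean: (C, H, W) -> (C,)."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

# Two 2x2 feature maps are reduced to two scalars:
maps = [[[1.0, 3.0], [5.0, 7.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
assert global_average_pool(maps) == [4.0, 2.0]
```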
Input: a 100-dimensional noise vector drawn from a uniform distribution.
Output: a 64x64x3 RGB image.
Activation functions:
Generator: the output layer uses tanh; all other layers use ReLU.
Discriminator: LeakyReLU for all layers.
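The three activation functions above can be sketched directly; the 0.2 slope for LeakyReLU matches the value used in the DCGAN paper:

```python
import math

def relu(x):
    """Hidden layers of G: pass positives, zero out negatives."""
    return max(0.0, x)

def leaky_relu(x, slope=0.2):
    """All layers of D: a small slope keeps gradients alive for x < 0."""
    return x if x > 0 else slope * x

def tanh(x):
    """Output layer of G: squashes pixel values into [-1, 1]."""
    return math.tanh(x)

assert relu(-3.0) == 0.0 and relu(2.0) == 2.0
assert leaky_relu(-5.0) == -1.0          # 0.2 * -5.0
assert tanh(0.0) == 0.0 and abs(tanh(3.0)) < 1.0
```

The tanh output is why training images are typically rescaled to the [-1, 1] range before being fed to D.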


Batch normalization
Batch normalization alleviates training problems caused by poor initialization and helps gradients propagate to deeper layers. It proved critical for getting deep generators to begin learning: it prevents the generator from collapsing all samples to a single point (producing the same sample), a common failure mode observed when training GANs.
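The normalization step of batch norm is just a per-feature standardization over the batch. A minimal sketch on a batch of scalars (the learnable scale and shift parameters, gamma and beta, of full batch normalization are omitted here):

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize a batch of scalar activations to zero mean, unit variance."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
assert abs(sum(out)) < 1e-6                                   # zero mean
assert abs(sum(x * x for x in out) / len(out) - 1.0) < 1e-3   # ~unit variance
```

Keeping every layer's activations in this standard range is what makes the network less sensitive to how its weights were initialized.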

   The 100-dimensional noise vector is projected to a convolutional representation with a small spatial extent. Four fractionally strided convolutions (in some papers these are mistakenly called deconvolutions) then convert this high-level representation into a 64x64-pixel, three-channel RGB image. There are no fully connected layers and no pooling layers.
The network structure in the original DCGAN paper is not described very clearly. The paper "Semantic Image Inpainting with Perceptual and Contextual Losses", which uses DCGAN for image inpainting, introduces the network structure and parameters more clearly (the per-layer convolution operations of the discriminator D in its figure should mirror those of the generator G, but the figure differs; the channel counts (numbers of convolution kernels) of D are likely drawn incorrectly). As shown in the following figure:
    

Figure A above is the generator G: the input (a 100-d noise vector z) passes through a fully connected layer projecting 100 -> 4x4x1024 = 16,384 units, and the resulting vector is reshaped into 1024 channels of 4x4 feature maps. The basic rule is that each subsequent layer of the generator is a fractionally strided convolution (deconvolution) layer that halves the number of channels and doubles the image size.
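With the size formulas for 4x4 kernels, stride 2, padding 1 (the setting used in most DCGAN implementations), the halve-channels/double-size rule can be traced end to end; the helper names here are my own:

```python
def tconv_out(n, k=4, s=2, p=1):
    """Output size of a fractionally strided convolution."""
    return (n - 1) * s - 2 * p + k

def conv_out(n, k=4, s=2, p=1):
    """Output size of a strided convolution."""
    return (n + 2 * p - k) // s + 1

# G: 4x4x1024 -> 8x8x512 -> 16x16x256 -> 32x32x128 -> 64x64x3
g_size = 4
for _ in range(4):
    g_size = tconv_out(g_size)
assert g_size == 64

# D mirrors this with strided convolutions: 64 -> 32 -> 16 -> 8 -> 4
d_size = 64
for _ in range(4):
    d_size = conv_out(d_size)
assert d_size == 4
```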
Figure B below is the discriminator D: a fully convolutional network without pooling. Its output is a scalar giving the probability that the input came from the training data rather than being a generated sample.

5.3 Experiments

Training DCGAN on the LSUN bedrooms dataset produces very lifelike images:

We demonstrate that an unsupervised DCGAN trained on a large image dataset can also learn a hierarchy of interesting features.
Using guided backpropagation as proposed by Springenberg et al., Fig. 5 shows that the features learned by the discriminator activate on typical parts of a bedroom, such as beds and windows.

Vector Arithmetic for Visual Concepts
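The paper's signature result here is arithmetic on latent codes, e.g. vec(smiling woman) - vec(neutral woman) + vec(neutral man) generates a smiling man (after averaging the z vectors of three exemplars per concept). A toy sketch with hypothetical 3-d latents:

```python
def vec_arith(a, b, c):
    """Element-wise a - b + c over latent vectors."""
    return [x - y + z for x, y, z in zip(a, b, c)]

# Hypothetical toy latents; in the paper these are averaged 100-d z vectors,
# and the result is decoded by the trained generator G.
smiling_woman = [1.0, 1.0, 0.0]
neutral_woman = [1.0, 0.0, 0.0]
neutral_man   = [0.0, 0.0, 1.0]
assert vec_arith(smiling_woman, neutral_woman, neutral_man) == [0.0, 1.0, 1.0]
```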
References

[1] Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
[2] Denton E L, Chintala S, Fergus R. Deep generative image models using a Laplacian pyramid of adversarial networks // Advances in Neural Information Processing Systems. 2015: 1486-1494.
