How to Train a GAN? Tips and tricks to make GANs work

Source: Internet
Author: User
Tags: generative adversarial networks

How to Train a GAN? Tips and tricks to make GANs work

Transferred from: https://github.com/soumith/ganhacks

While research into Generative Adversarial Networks (GANs) continues to improve the fundamental stability of these models, we use a bunch of tricks to train them and make them stable day to day.

Here is a summary of some of the tricks.

Here's a link to the authors of this document

If you find a trick that's particularly useful in practice, please open a pull request to add it to the document. If we find it to be reasonable and verified, we'll merge it in.

1. Normalize the Inputs
    • Normalize the images between -1 and 1
    • Use tanh as the last layer of the generator output (see the sketch below)
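A minimal sketch of this preprocessing in PyTorch (the transform pipeline and the generator's final layer below are illustrative assumptions, not code from the original document):

import torch.nn as nn
from torchvision import transforms

# Scale images from [0, 1] to [-1, 1] so they match the tanh output range of G.
preprocess = transforms.Compose([
    transforms.ToTensor(),                      # [0, 255] -> [0.0, 1.0]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],  # (x - 0.5) / 0.5 -> [-1.0, 1.0]
                         std=[0.5, 0.5, 0.5]),
])

# Last layer of the generator: tanh keeps outputs in the same [-1, 1] range.
generator_head = nn.Sequential(
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
    nn.Tanh(),
)
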
2: A modified loss function

In GAN papers, the loss function to optimize G is min log(1 - D), but in practice folks practically use max log D.

    • Because the first formulation has vanishing gradients early on
    • Goodfellow et al. (2014)

In practice, this works well (see the sketch below):

    • Flip labels when training the generator: real = fake, fake = real
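As a minimal sketch (assuming a discriminator that outputs raw logits and using PyTorch's BCEWithLogitsLoss; the function and variable names are illustrative), the "max log D" trick amounts to training G against real labels for its fakes:

import torch
import torch.nn.functional as F

def generator_loss(d_logits_on_fake):
    # Saturating loss from the original formulation: min log(1 - D(G(z)))
    # -> gradients vanish early in training when D easily rejects the fakes.
    # saturating = torch.log(1 - torch.sigmoid(d_logits_on_fake)).mean()

    # Non-saturating trick: max log D(G(z)), i.e. treat the fakes as "real"
    # (label = 1) when computing the generator's loss.
    real_labels = torch.ones_like(d_logits_on_fake)
    return F.binary_cross_entropy_with_logits(d_logits_on_fake, real_labels)
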
3: Use a spherical Z
    • Don't sample from a uniform distribution
    • Sample from a Gaussian distribution
    • When doing interpolations, do the interpolation via a great circle, rather than a straight line from point A to point B (see the sketch below)
    • Tom White's Sampling Generative Networks has more details
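A minimal sketch of sampling z from a Gaussian and interpolating along a great circle (spherical interpolation, "slerp"); the helper below is an illustrative implementation, not code from the referenced paper:

import numpy as np

def sample_z(batch_size, dim):
    # Sample latent vectors from a Gaussian, not a uniform distribution.
    return np.random.randn(batch_size, dim)

def slerp(t, a, b):
    # Spherical interpolation between latent points a and b for t in [0, 1],
    # following the great circle instead of the straight line (1 - t) * a + t * b.
    a_norm = a / np.linalg.norm(a)
    b_norm = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_norm, b_norm), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1.0 - t) * a + t * b  # points are (almost) parallel
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)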
4: BatchNorm
    • Construct different mini-batches for real and fake, i.e. each mini-batch needs to contain only all real images or only all generated images (see the sketch below).
    • When batchnorm is not an option, use instance normalization (for each sample, subtract the mean and divide by the standard deviation).
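A sketch of keeping real and generated samples in separate mini-batches when updating D (the function signature and names are illustrative assumptions; criterion is something like BCEWithLogitsLoss):

import torch

def discriminator_step(discriminator, generator, d_optimizer, criterion, real_batch, z):
    # Two separate passes so batchnorm statistics come from all-real or
    # all-fake mini-batches, never a mix of the two.
    d_optimizer.zero_grad()

    real_labels = torch.ones(real_batch.size(0), 1)
    loss_real = criterion(discriminator(real_batch), real_labels)
    loss_real.backward()

    fake_batch = generator(z).detach()  # no gradients into G on the D step
    fake_labels = torch.zeros(fake_batch.size(0), 1)
    loss_fake = criterion(discriminator(fake_batch), fake_labels)
    loss_fake.backward()

    d_optimizer.step()
    return loss_real.item() + loss_fake.item()

When batchnorm is not an option, PyTorch's nn.InstanceNorm2d is the per-sample alternative mentioned above.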

5: Avoid Sparse Gradients: ReLU, MaxPool
    • The stability of the GAN game suffers if you have sparse gradients
    • LeakyReLU = good (in both G and D)
    • For downsampling, use: Average Pooling, Conv2d + stride
    • For upsampling, use: PixelShuffle, ConvTranspose2d + stride (see the sketch below)
      • PixelShuffle: https://arxiv.org/abs/1609.05158
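An illustrative pair of blocks following these suggestions (the layer sizes are arbitrary assumptions):

import torch.nn as nn

# Downsampling block for D: strided conv + LeakyReLU instead of MaxPool + ReLU.
downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
)

# Upsampling block for G: PixelShuffle (sub-pixel convolution); ConvTranspose2d
# with stride also works.
upsample = nn.Sequential(
    nn.Conv2d(128, 64 * 4, kernel_size=3, padding=1),
    nn.PixelShuffle(2),   # rearranges (C*r^2, H, W) -> (C, H*r, W*r)
    nn.LeakyReLU(0.2),
)
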
6: Use Soft and Noisy Labels
    • Label smoothing, i.e. if you have two target labels, real = 1 and fake = 0, then for each incoming sample, if it is real, replace the label with a random number between 0.7 and 1.2, and if it is a fake sample, replace it with a random number between 0.0 and 0.3 (for example).
      • Salimans et al. 2016
    • Make the labels noisy for the discriminator: occasionally flip the labels when training the discriminator (both tricks are sketched below)
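A minimal sketch of both tricks; the smoothing ranges follow the example above, while the flip probability is an illustrative assumption:

import torch

def smooth_labels(batch_size, is_real):
    # Label smoothing: real labels in [0.7, 1.2), fake labels in [0.0, 0.3).
    if is_real:
        return 0.7 + 0.5 * torch.rand(batch_size, 1)
    return 0.3 * torch.rand(batch_size, 1)

def maybe_flip(labels, flip_prob=0.05):
    # Occasionally flip the labels shown to the discriminator
    # (flip_prob is an assumed value, tune it for your setup).
    if torch.rand(1).item() < flip_prob:
        return 1.0 - labels
    return labels
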
7: DCGAN / Hybrid Models
    • Use DCGAN when you can. It works!
    • If you can't use DCGANs and no model is stable, use a hybrid model: KL + GAN or VAE + GAN
8: Use stability tricks from RL
    • Experience Replay (see the sketch below)
      • Keep a replay buffer of past generations and occasionally show them
      • Keep checkpoints from the past of G and D and occasionally swap them out for a few iterations
    • All stability tricks that work for deep deterministic policy gradients
    • See Pfau & Vinyals (2016)
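A minimal replay buffer sketch (the capacity and the sampling strategy are illustrative assumptions):

import random

class ReplayBuffer:
    """Keeps past generated samples and occasionally shows them to D again."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.buffer = []

    def push(self, fake_batch):
        for sample in fake_batch:
            if len(self.buffer) < self.capacity:
                self.buffer.append(sample)
            else:
                # Overwrite a random old entry once the buffer is full.
                self.buffer[random.randrange(self.capacity)] = sample

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
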
9: Use the ADAM Optimizer
    • optim.Adam rules!
      • See Radford et al. 2015
    • Use SGD for the discriminator and ADAM for the generator (see the sketch below)
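For example (the placeholder models, learning rates, and betas below are assumptions in the common DCGAN style, not values from this document):

import torch.nn as nn
import torch.optim as optim

generator = nn.Sequential(nn.Linear(100, 784), nn.Tanh())  # placeholder G
discriminator = nn.Sequential(nn.Linear(784, 1))           # placeholder D

# ADAM for the generator; a low beta1 is the usual DCGAN-style choice.
g_optimizer = optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))

# Plain SGD for the discriminator, per the tip above.
d_optimizer = optim.SGD(discriminator.parameters(), lr=2e-4)
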
10: Track Failures Early
    • D loss goes to 0: failure mode
    • Check norms of gradients: if they are over 100, things are screwing up (a small helper is sketched below)
    • When things are working, D loss has low variance and goes down over time, vs. having huge variance and spiking
    • If the loss of the generator steadily decreases, then it's fooling D with garbage (says Martin)
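A small monitoring helper along these lines (the name and where you log it are up to you; this is an illustrative sketch):

def grad_norm(model):
    # Total L2 norm of all parameter gradients; log this every few iterations
    # right after backward() to catch exploding gradients early.
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().norm(2).item() ** 2
    return total ** 0.5
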
11: Don't balance loss via statistics (unless you have a good reason to)
    • Don't try to find a (number of G / number of D) schedule to uncollapse training
    • It's hard and we've all tried it.
    • If you do try it, have a principled approach to it, rather than intuition

For example:

while lossD > A:
  train D
while lossG > B:
  train G
12: If you have labels, use them
    • If you have labels available, train the discriminator to also classify the samples: auxiliary GANs (see the sketch below)
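A sketch of an auxiliary-classifier discriminator head, with one output for real/fake and one for the class label (the feature dimension and class count are illustrative assumptions):

import torch.nn as nn

class AuxDiscriminatorHead(nn.Module):
    """Shared features feed both a real/fake output and a class-label output."""

    def __init__(self, feature_dim=512, num_classes=10):
        super().__init__()
        self.adv = nn.Linear(feature_dim, 1)            # real vs. fake logit
        self.cls = nn.Linear(feature_dim, num_classes)  # class logits

    def forward(self, features):
        return self.adv(features), self.cls(features)
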
13: Add noise to inputs, decay over time
    • Add some artificial noise to inputs to D (Arjovsky et al., Huszar, 2016); see the sketch below
      • http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/
      • https://openreview.net/forum?id=Hk4_qw5xe
    • Add Gaussian noise to every layer of the generator (Zhao et al., EBGAN)
      • Improved GANs: the OpenAI code also has it (commented out)
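A minimal instance-noise sketch: Gaussian noise added to the inputs of D (both real and generated), with a standard deviation annealed toward zero; the linear schedule and max_sigma value are illustrative assumptions:

import torch

def add_instance_noise(images, step, total_steps, max_sigma=0.1):
    # Linearly decay the noise level from max_sigma to 0 over training.
    sigma = max_sigma * max(0.0, 1.0 - step / total_steps)
    return images + sigma * torch.randn_like(images)
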
14: [notsure] Train discriminator more (sometimes)
    • Especially when you have noise
    • Hard to find a schedule of number of D iterations vs. G iterations
15: [notsure] Batch Discrimination
    • Mixed results
16: Discrete variables in Conditional GANs
    • Use an embedding layer
    • Add as additional channels to images
    • Keep embedding dimensionality low and upsample to match image channel size (see the sketch below)
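A sketch of turning a discrete label into extra image channels via a low-dimensional embedding (the class count and embedding size are illustrative assumptions):

import torch
import torch.nn as nn

class LabelToChannels(nn.Module):
    """Embed a discrete label and broadcast it to extra image channels."""

    def __init__(self, num_classes=10, embed_dim=4):
        super().__init__()
        # Keep the embedding dimensionality low, per the tip above.
        self.embed = nn.Embedding(num_classes, embed_dim)

    def forward(self, images, labels):
        b, _, h, w = images.shape
        e = self.embed(labels)                               # (B, embed_dim)
        e = e.view(b, -1, 1, 1).expand(b, e.size(1), h, w)   # upsample spatially
        return torch.cat([images, e], dim=1)                 # concat as extra channels
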
Authors
    • Soumith Chintala
    • Emily Denton
    • Martin Arjovsky
    • Michael Mathieu
