How to Train a GAN? Tips and tricks to make GANs work
Transferred from: https://github.com/soumith/ganhacks
While research into generative adversarial networks (GANs) continues to improve the fundamental stability of these models, we use a bunch of tricks to train them and make them stable day to day.
Here is a summary of some of the tricks.
The authors of this document are listed at the end.
If you find a trick that's particularly useful in practice, please open a pull request to add it to the document. If we find it to be reasonable and verified, we'll merge it in.
1. Normalize the Inputs
- Normalize the images between -1 and 1
- Tanh as the last layer of the generator output
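A minimal sketch of both points in PyTorch, assuming single-channel images loaded with torchvision (use three values per Normalize tuple for RGB):

```python
import torch.nn as nn
from torchvision import transforms

# Scale images from [0, 1] to [-1, 1] so they match the range of tanh
transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5,), (0.5,)),  # [0, 1] -> [-1, 1]
])

# Last layer of the generator squashes output into the same [-1, 1] range
generator_head = nn.Sequential(
    nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),
    nn.Tanh(),
)
```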
2. A modified loss function
In GAN papers, the loss function to optimize G is min (log 1-D), but in practice folks practically use max log D
- because the first formulation has vanishing gradients early on
- Goodfellow et al. (2014)
In practice, this works well:
- Flip labels when training the generator: real = fake, fake = real (see the sketch below)
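A sketch of the practical formulation in PyTorch: maximizing log D(G(z)) is the same as training G with the "real" label on fake samples (the name d_logits_on_fake is a placeholder for D's raw outputs on generated images):

```python
import torch
import torch.nn.functional as F

def generator_loss(d_logits_on_fake):
    # Non-saturating loss: maximize log D(G(z)) instead of minimizing log(1 - D(G(z))).
    # Equivalently, flip the labels: treat fake samples as "real" (label = 1) when training G.
    real_labels = torch.ones_like(d_logits_on_fake)
    return F.binary_cross_entropy_with_logits(d_logits_on_fake, real_labels)
```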
3. Use a spherical Z
- Don't sample from a uniform distribution
- Sample from a Gaussian distribution
- When doing interpolations, do the interpolation via a great circle, rather than a straight line from point A to point B
- Tom White's Sampling Generative Networks has more details
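A sketch of great-circle (slerp) interpolation between two Gaussian latent vectors; the latent dimensionality and number of steps are arbitrary choices:

```python
import numpy as np

def slerp(val, low, high):
    """Spherical interpolation between latent vectors low and high, val in [0, 1]."""
    low_n = low / np.linalg.norm(low)
    high_n = high / np.linalg.norm(high)
    omega = np.arccos(np.clip(np.dot(low_n, high_n), -1.0, 1.0))
    if np.isclose(np.sin(omega), 0.0):
        return (1.0 - val) * low + val * high  # fall back to lerp for (anti)parallel vectors
    return (np.sin((1.0 - val) * omega) * low + np.sin(val * omega) * high) / np.sin(omega)

# Sample the endpoints from a Gaussian, not a uniform distribution
z0, z1 = np.random.randn(100), np.random.randn(100)
path = [slerp(t, z0, z1) for t in np.linspace(0.0, 1.0, 10)]
```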
4. BatchNorm
- Construct different mini-batches for real and fake, i.e. each mini-batch needs to contain only all real images or all generated images.
- When batchnorm is not an option, use instance normalization (for each sample, subtract the mean and divide by the standard deviation) (sketch below).
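A sketch of both points, assuming a hypothetical discriminator D that outputs one logit per image: real and fake samples go through D in separate forward passes, so batchnorm statistics are never computed over a mixed batch.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

def discriminator_loss(D, real_images, fake_images):
    # Separate mini-batches: batchnorm inside D only ever sees all-real or all-fake batches
    real_labels = torch.ones(real_images.size(0), 1)
    fake_labels = torch.zeros(fake_images.size(0), 1)
    loss_real = criterion(D(real_images), real_labels)
    loss_fake = criterion(D(fake_images.detach()), fake_labels)
    return loss_real + loss_fake

# When batchnorm is not an option: instance normalization normalizes each sample
# on its own (subtract its mean, divide by its standard deviation, per channel)
instance_norm = nn.InstanceNorm2d(num_features=64)
```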
5. Avoid sparse gradients: ReLU, MaxPool
- The stability of the GAN game suffers if you have sparse gradients
- LeakyReLU = good (in both G and D)
- For downsampling, use: average pooling, Conv2d + stride
- For upsampling, use: PixelShuffle, ConvTranspose2d + stride
- PixelShuffle: https://arxiv.org/abs/1609.05158
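A sketch of downsampling and upsampling blocks that follow these choices; the channel counts and the LeakyReLU slope are placeholders:

```python
import torch.nn as nn

# Downsampling: strided conv (or nn.AvgPool2d) + LeakyReLU instead of MaxPool + ReLU
down = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2, inplace=True),
)

# Upsampling: PixelShuffle (sub-pixel convolution) or a strided ConvTranspose2d
up = nn.Sequential(
    nn.Conv2d(128, 64 * 4, kernel_size=3, padding=1),
    nn.PixelShuffle(upscale_factor=2),  # (B, 64*4, H, W) -> (B, 64, 2H, 2W)
    nn.LeakyReLU(0.2, inplace=True),
)
```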
6. Use soft and noisy labels
- Label smoothing, i.e. if you have two target labels, real=1 and fake=0, then for each incoming sample: if it is real, replace the label with a random number between 0.7 and 1.2, and if it's a fake sample, replace it with a random number between 0.0 and 0.3 (for example).
- Make the labels noisy for the discriminator: occasionally flip the labels when training the discriminator (sketch below)
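A sketch of soft, occasionally-flipped labels for the discriminator; the 5% flip probability is an arbitrary choice, not from the text:

```python
import torch

def soft_noisy_labels(batch_size, is_real, flip_prob=0.05):
    # Soft labels: real targets in [0.7, 1.2], fake targets in [0.0, 0.3]
    real = 0.7 + 0.5 * torch.rand(batch_size)
    fake = 0.3 * torch.rand(batch_size)
    labels = real if is_real else fake
    # Noisy labels: occasionally hand the discriminator the flipped target
    flip = torch.rand(batch_size) < flip_prob
    labels[flip] = (fake if is_real else real)[flip]
    return labels
```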
7. DCGAN / Hybrid Models
- Use DCGAN when you can. It works!
- If you can't use DCGANs and no model is stable, use a hybrid model: KL + GAN or VAE + GAN
8. Use stability tricks from RL
- Experience Replay
- Keep a replay buffer of past generations and occasionally show them (see the sketch below)
- Keep checkpoints from the past of G and D and occasionally swap them out for a few iterations
- All stability tricks that work for deep deterministic policy gradients
- See Pfau & Vinyals (2016)
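A sketch of experience replay for GANs: keep a pool of past generator outputs and occasionally show an old sample to D instead of a fresh one. The pool size and replay probability are arbitrary choices.

```python
import random
import torch

class FakeImagePool:
    """Replay buffer of past generations, occasionally replayed to the discriminator."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.pool = []

    def query(self, fake_images, replay_prob=0.3):
        out = []
        for img in fake_images.detach():
            if len(self.pool) < self.max_size:
                self.pool.append(img)
                out.append(img)
            elif random.random() < replay_prob:
                idx = random.randrange(self.max_size)
                out.append(self.pool[idx])  # show an old generation to D
                self.pool[idx] = img        # and keep the new one for later
            else:
                out.append(img)
        return torch.stack(out)
```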
9. Use the ADAM Optimizer
- optim.Adam rules!
- Use SGD for the discriminator and ADAM for the generator
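A sketch of that split with tiny placeholder networks; the learning rates and betas are illustrative, not prescriptions:

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder networks just to show the optimizer split
D = nn.Sequential(nn.Linear(784, 1))
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())

d_optimizer = optim.SGD(D.parameters(), lr=0.0002)                       # SGD for the discriminator
g_optimizer = optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))  # ADAM for the generator
```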
10. Track failures early
- D loss goes to 0: failure mode
- Check norms of gradients: if they are over 100, things are screwing up (see the sketch after this list)
- When things are working, D loss has low variance and goes down over time, vs. having huge variance and spiking
- If the loss of the generator steadily decreases, then it's fooling D with garbage (says Martin)
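A sketch of a gradient-norm health check to run right after loss.backward(); the total_grad_norm helper here is a hypothetical name that just aggregates the L2 norm over all parameter gradients:

```python
import torch

def total_grad_norm(model):
    # L2 norm over all parameter gradients; call after loss.backward()
    grads = [p.grad.detach().flatten() for p in model.parameters() if p.grad is not None]
    if not grads:
        return 0.0
    return torch.cat(grads).norm().item()

# e.g. if total_grad_norm(D) > 100: things are probably screwing up
```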
11. Don't balance loss via statistics (unless you have a good reason to)
- Don't try to find a (number of G / number of D) schedule to uncollapse training
- It's hard and we've all tried it.
- If you do try it, have a principled approach to it, rather than intuition
For example:
```
while lossD > A:
  train D
while lossG > B:
  train G
```
12. If you have labels, use them
- If you have labels available, train the discriminator to also classify the samples: auxiliary GANs (sketch below)
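A sketch of an auxiliary-classifier discriminator head (AC-GAN style): shared features feed both a real/fake output and a class-label output. The feature size and class count are placeholders:

```python
import torch.nn as nn

class AuxDiscriminatorHead(nn.Module):
    """Discriminator head that both discriminates and classifies."""

    def __init__(self, feature_dim=256, num_classes=10):
        super().__init__()
        self.adv_head = nn.Linear(feature_dim, 1)            # real vs. fake logit
        self.cls_head = nn.Linear(feature_dim, num_classes)  # auxiliary class logits

    def forward(self, features):
        return self.adv_head(features), self.cls_head(features)
```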
13. Add noise to inputs, decay over time
- Add some artificial noise to inputs to D (Arjovsky et al., Huszar, 2016)
  - http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/
  - https://openreview.net/forum?id=Hk4_qw5xe
- Adding Gaussian noise to every layer of the generator (Zhao et al., EBGAN)
  - Improved GANs: OpenAI code also has it (commented out)
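A sketch of instance noise on D's inputs with a linear decay to zero over training; the starting sigma and the schedule are arbitrary choices:

```python
import torch

def add_instance_noise(images, step, total_steps, start_sigma=0.1):
    # Gaussian noise added to D's inputs, annealed to zero as training progresses
    sigma = start_sigma * max(0.0, 1.0 - step / total_steps)
    return images + sigma * torch.randn_like(images)
```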
14. [notsure] Train discriminator more (sometimes)
- Especially when you have noise
- Hard to find a schedule of number of D iterations vs G iterations
15. [notsure] Batch discrimination
16. Discrete variables in Conditional GANs
- Use an embedding layer
- Add as additional channels to images
- Keep embedding dimensionality low and upsample to match image channel size (see the sketch below)
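A sketch of conditioning on a discrete variable: embed the label into a low-dimensional vector, tile it to the image's spatial size, and concatenate it as extra channels. The class count and embedding size are placeholders:

```python
import torch
import torch.nn as nn

class LabelToChannels(nn.Module):
    """Embed a discrete label and append it to the image as extra channels."""

    def __init__(self, num_classes=10, embed_dim=8):
        super().__init__()
        self.embed = nn.Embedding(num_classes, embed_dim)  # keep dimensionality low

    def forward(self, images, labels):
        b, _, h, w = images.shape
        e = self.embed(labels).view(b, -1, 1, 1)  # (B, embed_dim, 1, 1)
        e = e.expand(-1, -1, h, w)                # upsample/tile to the image size
        return torch.cat([images, e], dim=1)      # concatenate as additional channels
```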
Authors
- Soumith Chintala
- Emily Denton
- Martin Arjovsky
- Michael Mathieu