This is the famous Ian Goodfellow's 2014 paper. It has been very hot recently; I had never read it, so now I'm jumping into the pit.
In Chinese it should be called the adversarial network.
The code is written in pylearn2. GitHub address: https://github.com/goodfeli/adversarial/
What:
Train two models at the same time: a generative model G (which captures the data distribution) and a discriminative model D (which predicts whether its input is real or generated by G).
G's training objective is to maximize the probability that D makes a mistake, so the better G is trained, the more realistic its samples become.
This framework is like a game between two players.
The whole system is trained with backpropagation only; no Markov chains or approximate inference networks are needed.
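For reference, this two-player game is the minimax objective from the paper:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$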
Pre-reading questions:
Is the input of G a label, or random Gaussian noise?
If it is a label, how do you write the generative model G as a neural network? With deconv maybe ...
How:
1. G is a simple neural network (e.g., one fully connected hidden layer). Its input is a vector (100-D) and it produces an image as output.
2. D is also a simple neural network (e.g., one fully connected hidden layer). Its input is an image and it produces a confidence score (0-1).
Suppose B is the batch size.
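As a concrete reference, here is a minimal PyTorch sketch of such a G and D (my own illustration, not the paper's pylearn2 code; the layer sizes are arbitrary), assuming 28*28 images flattened to 784-D and a 100-D input vector:

```python
import torch
import torch.nn as nn

B = 64  # batch size

# G: 100-D vector -> 28*28 image (flattened), one fully connected hidden layer
G = nn.Sequential(
    nn.Linear(100, 256),
    nn.ReLU(),
    nn.Linear(256, 28 * 28),
    nn.Tanh(),  # outputs in [-1, 1], assuming the data is also scaled to [-1, 1]
)

# D: 28*28 image -> confidence in (0, 1) that the input is real
D = nn.Sequential(
    nn.Linear(28 * 28, 256),
    nn.ReLU(),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

z = torch.rand(B, 100) * 2 - 1  # B random 100-D vectors in [-1, 1]
fake = G(z)                     # B generated images (flattened)
score = D(fake)                 # B confidence scores in (0, 1)
```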
3. G's training is as follows (a code sketch follows these steps):
A. Create B random 100-D vectors, e.g., uniform in [-1, 1]
B. Feed the B vectors forward through G to get B new images
C. Feed the generated images forward through D to get scores
D. Classify with cross-entropy. D should regard these generated (fake) pictures as label = 0; if D instead scores them as real (label = 1), G's error should be low (because G did a good job and fooled D)
E. Run one BP pass (do not update D) to get the gradient with respect to each pixel of the generated image
F. Use this gradient to update G
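In code, one G update might look like this (continuing the G/D sketch above; the optimizer choice is my own):

```python
import torch
import torch.nn.functional as F

opt_G = torch.optim.SGD(G.parameters(), lr=0.01)

z = torch.rand(B, 100) * 2 - 1   # step A: B random 100-D vectors in [-1, 1]
fake = G(z)                      # step B: feed forward through G
score = D(fake)                  # step C: feed forward through D
# step D: G wants D to score the fakes as "real" (label 1)
loss_G = F.binary_cross_entropy(score, torch.ones(B, 1))
opt_G.zero_grad()
loss_G.backward()                # step E: BP through D down to the image pixels and into G
opt_G.step()                     # step F: update G only; D gets gradients here but is not
                                 #         updated, since opt_G holds only G's parameters
```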
4. D's training is as follows (a code sketch follows these steps):
A. Create B/2 pictures from G; their ground-truth label is 0
B. Select B/2 pictures from the training set; their GT label is 1
C. Put them into one batch
D. Feed this batch forward through D
E. Compute the cross-entropy error
F. Update D
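And one D update, again continuing the sketch (`real_batch` is a hypothetical [B/2, 784] tensor taken from the MNIST data loader):

```python
opt_D = torch.optim.SGD(D.parameters(), lr=0.01)

z = torch.rand(B // 2, 100) * 2 - 1
fake = G(z).detach()                    # step A: B/2 fakes, label 0; detach() fixes G
real = real_batch                       # step B: hypothetical [B//2, 784] real images, label 1
batch = torch.cat([fake, real], dim=0)  # step C: one combined batch
labels = torch.cat([torch.zeros(B // 2, 1), torch.ones(B // 2, 1)], dim=0)
score = D(batch)                        # step D: feed forward through D
loss_D = F.binary_cross_entropy(score, labels)  # step E: cross-entropy error
opt_D.zero_grad()
loss_D.backward()
opt_D.step()                            # step F: update D
```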
Train G for one batch, then train D for one or more batches (sometimes D learns more slowly, so it needs more iterations than G); see the loop sketch below.
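The alternating schedule as a loop (the two helpers are placeholders wrapping the G and D updates sketched above; k is the number of extra D steps):

```python
k = 1  # number of D batches per G batch; raise k if D lags behind G
for it in range(10000):
    g_update_one_batch()       # placeholder: the G update sketched above
    for _ in range(k):
        d_update_one_batch()   # placeholder: the D update sketched above
```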
I checked his model (tfd_pretrain/train.yaml):
G uses a RectifiedLinear layer and a Sigmoid layer (this RectifiedLinear has parameters; it is not just a ReLU activation)
D uses two Maxout layers and a Sigmoid (the Maxout layers also have parameters)
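Maxout has no fixed activation: each output unit is the max over k learned linear pieces, which is why the layer carries parameters. A sketch of such a layer (the unit counts and k below are my guesses, not the yaml's actual values):

```python
import torch
import torch.nn as nn

class Maxout(nn.Module):
    """Each output unit is the max over k learned linear pieces."""
    def __init__(self, in_features, out_features, k):
        super().__init__()
        self.out_features, self.k = out_features, k
        self.linear = nn.Linear(in_features, out_features * k)

    def forward(self, x):
        y = self.linear(x)                          # [B, out_features * k]
        y = y.view(-1, self.out_features, self.k)   # [B, out_features, k]
        return y.max(dim=2).values                  # max over the k pieces

# A D in the same spirit: two maxout layers, then a sigmoid output
D_maxout = nn.Sequential(
    Maxout(28 * 28, 240, k=5),
    Maxout(240, 240, k=5),
    nn.Linear(240, 1),
    nn.Sigmoid(),
)
```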
My current implementation plan:
1. Data set: MNIST, 28*28
2. G's input: a 32*32 rand map in [-1, 1]
3. G uses three conv layers (3*3, 1*1, 3*3), all stride 1, with a ReLU after the first two convs
4. D reuses the original MNIST CNN classification network, changed to 2-way classification
5. Using the graph model, so:
Update G: fix D, forward through G+D (D's input is G's output concatenated with the GT images)
Update D: fix G, forward through G+D (sketched below in PyTorch terms)
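My MATLAB code uses a MatConvNet-style graph, but the same fix/forward scheme looks like this in PyTorch terms (a sketch; reusing G, D, B, and the optimizers from above):

```python
import torch
import torch.nn.functional as F

def set_trainable(net, flag):
    """Freeze or unfreeze every parameter of a sub-network."""
    for p in net.parameters():
        p.requires_grad = flag

z = torch.rand(B, 100) * 2 - 1

# Update G: fix D, forward through G+D, backprop into G only
set_trainable(D, False)
loss_G = F.binary_cross_entropy(D(G(z)), torch.ones(B, 1))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
set_trainable(D, True)

# Update D: fix G by detaching its output, then forward through D
# (real images with label 1 are handled as in the D update earlier)
fake = G(z).detach()
loss_D = F.binary_cross_entropy(D(fake), torch.zeros(B, 1))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()
```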
This is what I expect ... no idea whether it will actually work ... it's all metaphysics (black magic).
-----8.27 Update
It turned out different from the paper ...
With the current implementation, the general result is that the D network is relatively strong while G is not.
G tends to converge to a local extremum; on MNIST its output is a 28*28 all-zero image.
I tried changing D, but that didn't help. Maybe the next step is to pretrain G.
I should also follow the structure of the original paper; hacking things together at random may not work.
Current code address: https://github.com/layumi/2016_GAN_Matlab
-----8.28 Update
See my next reading summary; that paper is about how to stably train a GAN (for example, using BatchNorm, LeakyReLU, etc.)
But even knowing these tricks, training still doesn't go well. QAQ
-----9.2 Update
First, the input should be a 100-D vector, and G should be built with deconv (transposed convolution) layers; a sketch follows.
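A deconv-based G in this spirit might look like the following (a PyTorch sketch; the channel counts and layer sizes are my own choice, not from the paper):

```python
import torch
import torch.nn as nn

# 100-D vector -> 28*28 image via transposed convolutions
G_deconv = nn.Sequential(
    nn.Linear(100, 128 * 7 * 7),
    nn.ReLU(),
    nn.Unflatten(1, (128, 7, 7)),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),  # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),    # 14x14 -> 28x28
    nn.Tanh(),
)

z = torch.rand(16, 100) * 2 - 1
imgs = G_deconv(z)  # [16, 1, 28, 28]
```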
Also, the output collapsing to a local extremum is a common phenomenon (I remember a figure of the distributions showing roughly this).
Now I'm trying the improved method from the GAN papers to modify G's objective function; see my recent reading summary.
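As I understand it, the usual modification is to replace G's objective of minimizing log(1 - D(G(z))) with maximizing log D(G(z)), which gives a stronger gradient when D confidently rejects G's samples. A sketch, reusing G, D, and z from above:

```python
import torch
import torch.nn.functional as F

score = D(G(z))  # D's confidence that G's samples are real

# Original (saturating) G loss: minimize log(1 - D(G(z)))
loss_saturating = torch.log(1.0 - score).mean()

# Non-saturating alternative: minimize -log D(G(z)),
# which equals binary cross-entropy against target label 1
loss_nonsaturating = F.binary_cross_entropy(score, torch.ones_like(score))
```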