Introduction to Generative Adversarial Networks (Adversarial Nets) [1]


    • Introduction to Adversarial Networks
    • Concept Introduction
    • The Origin of the Name and the Adversarial Process
    • The Adversarial Network Model
    • Training and Optimization of the Adversarial Network
    • The Optimal Value of the Discriminator D
    • Learning a Simulated Gaussian Distribution
    • Experimental Results of the Adversarial Network
    • Installing and Running the Adversarial Network Code
    • Related Papers on Adversarial Networks
    • Paper Citations
I. Introduction to Adversarial Networks

Ian J. Goodfellow's 2014 paper Generative Adversarial Nets first proposed this network model, which has achieved good results in the field of deep generative models in just two years. The paper proposes a new framework for estimating generative models; compared with previous algorithms it can be regarded as a breakthrough in unsupervised representation learning, and its main application at present is generating natural images.

II. Concept Introduction

Machine learning has two kinds of models: generative models and discriminative models.

    • Generative model: learns the joint distribution of the observed data, for example P(x, y) in the two-dimensional case.
    • Discriminative model: learns the conditional probability distribution P(y|x), i.e. the distribution of the unobserved variable y given the observed variable x.

      In plain terms, a generative model learns the distribution from the data so that we can generate new data from it. For example, learn from a large number of images and then create a new image.
      The most classic use of a discriminative model is supervised learning: for a classification problem, given an input x we want the output y, where y can be understood as the data's label.
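As a small illustration (not from the original post), the two kinds of model are connected by Bayes' rule: a generative model that has learned the joint distribution can still answer the discriminative question:

p(y \mid x) = \frac{p(x, y)}{p(x)} = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}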

An adversarial network consists of a discriminative model (discriminative, D) and a generative model (generative, G).

III. The Origin of the Name and the Adversarial Process

As mentioned just now, an adversarial network is really one D plus one G; so how do G and D fight each other?
Let's look at the following scenario:

    • D is a bank teller.
    • G is a crook who specializes in making counterfeit money.

      The adversarial process then works as follows: D keeps learning to judge whether the money it sees is real, while G keeps learning to manufacture counterfeit money that looks more and more like the real thing in order to deceive D. The final training result is that D has become very good at telling real money from fakes, yet G makes counterfeit money so convincing that D can no longer tell the difference.

In terms of the network, D and G are each a neural network model (an MLP). The output of D (the discriminative model) is a scalar, which represents the probability that the input "comes from real money". The output of G is a vector, and this vector represents the "counterfeit money".

IV. The Adversarial Network Model


Figure 1

In Figure 1, z is the input to G and is generally generated from a Gaussian random distribution; the output of G is G(z). The real data, usually an image, is represented by the variable x. The output of D is the probability that its input comes from x, which is a scalar.

V. Training and Optimization of the Adversarial Network

For G, which has to keep fooling D, the objective is:

max log(D(G(z)))                           Objective function 1

For D, which has to keep learning so that it is not fooled by G, the objective is:

max log(D(x)) + log(1 - D(G(z)))           Objective function 2
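Both objectives are two sides of the single minimax game defined in paper [1]; written in LaTeX, the value function is:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\big(1 - D(G(z))\big)\right]

In practice the paper trains G to maximize log(D(G(z))) (objective function 1) rather than to minimize log(1 - D(G(z))), because the latter saturates early in training when D can easily reject G's samples.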

Training uses minibatch stochastic gradient methods (m samples per minibatch). The gradients are as follows.

For objective function 1, G's parameters θ_g are updated by ascending the stochastic gradient

∇_θg (1/m) Σ_i log(D(G(z_i)))

For objective function 2, D's parameters θ_d are updated by ascending the stochastic gradient

∇_θd (1/m) Σ_i [ log(D(x_i)) + log(1 - D(G(z_i))) ]

Training process

Paper [1] gives Algorithm 1 (please see the original for the details): first train D, then train G. The paper also gives a proof that the algorithm converges. A schematic sketch of one such training loop is given below.
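The following is a minimal, self-contained sketch (not the paper's code) of Algorithm 1 for a 1-D toy problem: the real data is a Gaussian, the generator is an affine map of uniform noise, and the discriminator is a single logistic unit, with the two gradient updates from the previous section written out by hand. All hyperparameters and the simple model forms are illustrative assumptions.

import numpy as np

np.random.seed(0)

# Real data: x ~ N(mu, sigma); these values are assumptions for the toy example
mu, sigma = 4.0, 0.5
m, k, lr = 64, 1, 0.05           # minibatch size, D steps per G step, learning rate

a, b = 1.0, 0.0                  # generator G(z) = a*z + b, with z ~ Uniform(0, 1)
w, c = 0.1, 0.0                  # discriminator D(v) = sigmoid(w*v + c)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for it in range(5000):
    # k ascent steps on objective function 2: log D(x) + log(1 - D(G(z)))
    for _ in range(k):
        x = np.random.normal(mu, sigma, m)
        z = np.random.random(m)
        g = a * z + b
        Dx, Dg = sigmoid(w * x + c), sigmoid(w * g + c)
        w += lr * (np.mean((1 - Dx) * x) - np.mean(Dg * g))
        c += lr * (np.mean(1 - Dx) - np.mean(Dg))
    # one ascent step on objective function 1: log D(G(z))
    z = np.random.random(m)
    g = a * z + b
    Dg = sigmoid(w * g + c)
    a += lr * np.mean((1 - Dg) * w * z)
    b += lr * np.mean((1 - Dg) * w)

print("mean of generated samples:", np.mean(a * np.random.random(10000) + b))
print("mean of real data:        ", mu)

As training proceeds, the mean of the generated samples should drift toward the mean of the real data, which is the 1-D analogue of the Gaussian experiment shown later in this post.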

Several training tricks:

    • Use the dropout mentioned in the paper (this should be the maxout layer of the discriminator).
    • Train D several times for each training step of G, to prevent overfitting.
    • Pre-training can be done before the adversarial training.
VI. The Optimal Value of the Discriminator D

Define the probability density function (PDF) of the real data x as p_data(x), and the PDF of the generated samples G(z) as p_g(x).

Then, in each training round, if G is held fixed, the optimal output of D can be shown to be

D*(x) = p_data(x) / (p_data(x) + p_g(x))

Moreover, the result at the end of training is D = 1/2 = 0.5; that is, at that point:

p_g(x) = p_data(x), so D*(x) = 1/2

For the detailed proof, see the original paper.
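For intuition, the key step of that proof (Proposition 1 in paper [1]), written in LaTeX: for a fixed G the value function can be expressed as an integral over x, and maximizing the integrand pointwise gives the optimal discriminator.

V(G, D) = \int_x \left[ p_{\mathrm{data}}(x) \log D(x) + p_g(x) \log\big(1 - D(x)\big) \right] dx

% For fixed a = p_data(x) and b = p_g(x), the map y -> a*log(y) + b*log(1 - y)
% attains its maximum on (0, 1) at y = a / (a + b), so pointwise:

D^*_G(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% When p_g = p_data this gives D^*(x) = 1/2, matching the statement above.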

VII. Experimental Results of the Adversarial Network

The datasets used in paper [1] include MNIST (a), TFD (b), and CIFAR-10 (c and d). For the different datasets, the original paper uses different network models.


Figure 2: Experimental results

The models are as follows.

Data Set        G Model                                       D Model
MNIST           ReLU + sigmoid activation functions           Maxout + sigmoid
TFD             Not mentioned                                 Not mentioned
CIFAR-10 (c)    Fully connected + activation functions        Maxout + sigmoid
CIFAR-10 (d)    Deconvolution layers + activation functions   Convolutional maxout + sigmoid

For a detailed model description, see the YAML files in the open-source project:

https://github.com/goodfeli/adversarial

VIII. Learning a Simulated Gaussian Distribution

The paper gives a figure of this experiment, as follows:

    • D: blue dashed line
    • Distribution of the real data x: black dotted line
    • G: green solid line

This experiment uses the adversarial network to let G(z) learn the distribution of x, where x follows a Gaussian distribution and z is uniformly distributed. Panels (a) to (d) show the process of continuous learning. At the beginning, the PDF of G(z) does not match the PDF of x, because at first G cannot map the random variable z onto the target distribution. Finally, in (d), the last stage of learning, the mapping of z (the lower of the two parallel lines at the bottom) through G() matches the distribution of x exactly (this is of course the ideal case), and the output of D is a horizontal line; as mentioned above, D() = 1/2, a constant.

TensorFlow-related code

(1) Discriminator's loss

batch = tf.Variable(0)
# D1 = D(x) on real samples, D2 = D(G(z)) on generated samples
obj_d = tf.reduce_mean(tf.log(D1) + tf.log(1 - D2))
# minimizing 1 - obj_d is equivalent to maximizing obj_d (objective function 2)
opt_d = tf.train.GradientDescentOptimizer(0.01) \
            .minimize(1 - obj_d, global_step=batch, var_list=theta_d)

(2) Generator's loss

batch = tf.Variable(0)
obj_g = tf.reduce_mean(tf.log(D2))
# minimizing 1 - obj_g is equivalent to maximizing obj_g (objective function 1)
opt_g = tf.train.GradientDescentOptimizer(0.01) \
            .minimize(1 - obj_g, global_step=batch, var_list=theta_g)

(3) Training loop (Algorithm 1, Goodfellow et al. 2014)

for i in range(TRAIN_ITERS):
    x = np.random.normal(mu, sigma, M)        # sample M real points from the Gaussian
    z = np.random.random(M)                   # sample M noise points for G
    sess.run(opt_d, {x_node: x, z_node: z})   # train D first
    z = np.random.random(M)
    sess.run(opt_g, {z_node: z})              # then train G

The code above is an example of a TensorFlow implementation of an adversarial network that learns a Gaussian distribution.
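For completeness, here is a minimal sketch of the pieces that the three snippets above assume but do not show: the placeholders x_node and z_node, a small MLP generator and discriminator, the tensors D1 = D(x) and D2 = D(G(z)) built with shared discriminator weights, and the parameter lists theta_d and theta_g. The layer sizes, initializers, and constants are illustrative assumptions, not the original author's code.

import tensorflow as tf
import numpy as np

M = 32                     # minibatch size (assumption)
mu, sigma = 4.0, 0.5       # target Gaussian for the real data (assumption)
TRAIN_ITERS = 10000        # number of training iterations (assumption)

# placeholders fed by the training loop in snippet (3)
x_node = tf.placeholder(tf.float32, shape=(M,))   # real samples
z_node = tf.placeholder(tf.float32, shape=(M,))   # noise for the generator

def mlp(inp, w1, b1, w2, b2, out_act):
    # a one-hidden-layer MLP; the 1-D batch is reshaped to (M, 1) for matmul
    h = tf.tanh(tf.matmul(tf.reshape(inp, [M, 1]), w1) + b1)
    return tf.reshape(out_act(tf.matmul(h, w2) + b2), [M])

# generator parameters theta_g
g_w1 = tf.Variable(tf.random_normal([1, 8], stddev=0.5))
g_b1 = tf.Variable(tf.zeros([8]))
g_w2 = tf.Variable(tf.random_normal([8, 1], stddev=0.5))
g_b2 = tf.Variable(tf.zeros([1]))
theta_g = [g_w1, g_b1, g_w2, g_b2]

# discriminator parameters theta_d, shared by D1 and D2
d_w1 = tf.Variable(tf.random_normal([1, 8], stddev=0.5))
d_b1 = tf.Variable(tf.zeros([8]))
d_w2 = tf.Variable(tf.random_normal([8, 1], stddev=0.5))
d_b2 = tf.Variable(tf.zeros([1]))
theta_d = [d_w1, d_b1, d_w2, d_b2]

G  = mlp(z_node, g_w1, g_b1, g_w2, g_b2, tf.identity)   # G(z), the generated samples
D1 = mlp(x_node, d_w1, d_b1, d_w2, d_b2, tf.sigmoid)    # D(x), probability of "real"
D2 = mlp(G,      d_w1, d_b1, d_w2, d_b2, tf.sigmoid)    # D(G(z))

After defining the losses and optimizers as in snippets (1) and (2), create a session with sess = tf.Session(), run sess.run(tf.global_variables_initializer()), and the training loop in snippet (3) can then be run as written.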

IX. Installing and Running Goodfellow's Paper Code

Goodfellow, the author of the adversarial nets paper, has also open-sourced his code.

(1) Project link

The adversarial repository: https://github.com/goodfeli/adversarial

(2) Download and dependency installation

    • The project depends on pylearn2, so install pylearn2 first.
    • I cloned the two projects, pylearn2 and adversarial, with git (example clone commands are sketched after this list) and added three environment variables (adjust them to your own paths):
export PYLEARN2_VIEWER_COMMAND="eog --new-instance"
export PYLEARN2_DATA_PATH=/home/data
export PYTHONPATH=/home/code
    • Other Python dependencies can be installed via pip or apt-get.
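For example, the two repositories could be fetched like this (the pylearn2 URL is an assumption based on the upstream lisa-lab project, and /home/code matches the PYTHONPATH set above; adjust both to your environment):

git clone https://github.com/lisa-lab/pylearn2.git /home/code/pylearn2
git clone https://github.com/goodfeli/adversarial.git /home/code/adversarial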

(3) Training and testing

    • Call pylearn2's train.py with mnist.yaml for training:
pylearn2/scripts/train.py ./adversarial/mnist.yaml

Test it as follows:

    • Run the following under the adversarial directory:
python show_samples_mnist_paper.py mnist.pkl
X. Related Papers and Applications of Adversarial Networks

The author of this post maintains an open-source project that collects papers related to adversarial networks.
Stars and contributions are welcome.

https://github.com/zhangqianhui/AdversarialNetsPapers

Applications of adversarial networks follow; all of these applications can be found in the open-source project above.

(1) Paper [2] uses CNNs in the adversarial network for image generation, and its D can also be used for classification, with good results.

(2) Paper [3] uses an adversarial network for video frame prediction, which addresses the problem that other algorithms easily produce blurry patches.

(3) Paper [4] applies adversarial networks to image editing and stylization in a visual manipulation application.

XI. Paper Citations

[1] Ian J. Goodfellow et al. Generative Adversarial Networks. 2014.
[2] Alec Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.
[3] Michael Mathieu et al. Deep Multi-Scale Video Prediction Beyond Mean Square Error.
[4] Jun-Yan Zhu et al. Generative Visual Manipulation on the Natural Image Manifold. ECCV 2016.
