Introduction to Generative Adversarial Networks (Adversarial Nets) [1]


    • Introduction to Adversarial Networks
    • Concept Introduction
    • The Origin of the Name and the Adversarial Process
    • The Adversarial Network Model
    • Training and Optimization of the Adversarial Network
    • The Optimal Value of the Discriminator D
    • Learning a Simulated Gaussian Distribution
    • Experimental Results of the Adversarial Network
    • Installing and Running the Adversarial Network Code
    • Related Papers on Adversarial Networks
    • Paper Citations
I. Introduction to Adversarial Networks

Ian J. Goodfellow's 2014 paper Generative Adversarial Nets first proposed this network model, which has achieved good results in the field of deep generative models in just two years. The paper proposes a new framework for estimating generative models; compared with previous algorithms it can be regarded as a breakthrough in unsupervised representation learning, and its main application at present is generating natural images.

II. Concept Introduction

Machine learning has two kinds of models: generative models and discriminative models.

    • Generative model: learns the joint distribution of the observed data, for example P(x, y) in the two-dimensional case.
    • Discriminative model: learns the conditional probability distribution P(y|x), i.e. the distribution of the unobserved variable y given the observed variable x.

      In plain terms, a generative model learns the distribution from the data so that we can generate new data from it. For example, learn from a large number of images and then create a new image.
      The most classic use of a discriminative model is supervised learning: for a classification problem, given an input x we want the output y, where y can be understood as the data's label.
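As a small illustration (not from the original post), the two kinds of model are connected by Bayes' rule: a generative model that has learned the joint distribution can still answer the discriminative question:

p(y \mid x) = \frac{p(x, y)}{p(x)} = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}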

An adversarial network consists of a discriminative model (discriminative, D) and a generative model (generative, G).

III. The Origin of the Name and the Adversarial Process

As mentioned just now, an adversarial network is really one D plus one G; so how do G and D fight each other?
Let's look at the following scenario:

    • D is a bank teller.
    • G is a crook who specializes in making counterfeit money.

      The adversarial process then works as follows: D keeps learning to judge whether the money it sees is real, while G keeps learning to manufacture counterfeit money that looks more and more like the real thing in order to deceive D. The final training result is that D has become very good at telling real money from fakes, yet G makes counterfeit money so convincing that D can no longer tell the difference.

In terms of the network, D and G are each a neural network model (an MLP). The output of D (the discriminative model) is a scalar, which represents the probability that the input "comes from real money". The output of G is a vector, and this vector represents the "counterfeit money".

IV. The Adversarial Network Model


Figure 1

In Figure 1, z is the input to G and is generally generated from a Gaussian random distribution; the output of G is G(z). The real data, usually an image, is represented by the variable x. The output of D is the probability that its input comes from x, which is a scalar.

V. Training and Optimization of the Adversarial Network

For G, which has to keep fooling D, the objective is:

max log(D(G(z)))                           Objective function 1

For D, which has to keep learning so that it is not fooled by G, the objective is:

max log(D(x)) + log(1 - D(G(z)))           Objective function 2
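Both objectives are two sides of the single minimax game defined in paper [1]; written in LaTeX, the value function is:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\big(1 - D(G(z))\big)\right]

In practice the paper trains G to maximize log(D(G(z))) (objective function 1) rather than to minimize log(1 - D(G(z))), because the latter saturates early in training when D can easily reject G's samples.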

Training uses minibatch stochastic gradient methods (m samples per minibatch). The gradients are as follows.

For objective function 1, G's parameters θ_g are updated by ascending the stochastic gradient

∇_θg (1/m) Σ_i log(D(G(z_i)))

For objective function 2, D's parameters θ_d are updated by ascending the stochastic gradient

∇_θd (1/m) Σ_i [ log(D(x_i)) + log(1 - D(G(z_i))) ]

Training process

Paper [1] gives Algorithm 1 (please see the original for the details): first train D, then train G. The paper also gives a proof that the algorithm converges. A schematic sketch of one such training loop is given below.
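The following is a minimal, self-contained sketch (not the paper's code) of Algorithm 1 for a 1-D toy problem: the real data is a Gaussian, the generator is an affine map of uniform noise, and the discriminator is a single logistic unit, with the two gradient updates from the previous section written out by hand. All hyperparameters and the simple model forms are illustrative assumptions.

import numpy as np

np.random.seed(0)

# Real data: x ~ N(mu, sigma); these values are assumptions for the toy example
mu, sigma = 4.0, 0.5
m, k, lr = 64, 1, 0.05           # minibatch size, D steps per G step, learning rate

a, b = 1.0, 0.0                  # generator G(z) = a*z + b, with z ~ Uniform(0, 1)
w, c = 0.1, 0.0                  # discriminator D(v) = sigmoid(w*v + c)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for it in range(5000):
    # k ascent steps on objective function 2: log D(x) + log(1 - D(G(z)))
    for _ in range(k):
        x = np.random.normal(mu, sigma, m)
        z = np.random.random(m)
        g = a * z + b
        Dx, Dg = sigmoid(w * x + c), sigmoid(w * g + c)
        w += lr * (np.mean((1 - Dx) * x) - np.mean(Dg * g))
        c += lr * (np.mean(1 - Dx) - np.mean(Dg))
    # one ascent step on objective function 1: log D(G(z))
    z = np.random.random(m)
    g = a * z + b
    Dg = sigmoid(w * g + c)
    a += lr * np.mean((1 - Dg) * w * z)
    b += lr * np.mean((1 - Dg) * w)

print("mean of generated samples:", np.mean(a * np.random.random(10000) + b))
print("mean of real data:        ", mu)

As training proceeds, the mean of the generated samples should drift toward the mean of the real data, which is the 1-D analogue of the Gaussian experiment shown later in this post.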

Several training tricks:

    • Use the dropout mentioned in the paper (this should be the maxout layer of the discriminator).
    • Train D several times for each training step of G, to prevent overfitting.
    • Pre-training can be done before the adversarial training.
VI. The Optimal Value of the Discriminator D

Define the probability density function (PDF) of the real data x as p_data(x), and the PDF of the generated samples G(z) as p_g(x).

Then, in each training round, if G is held fixed, the optimal output of D can be shown to be

D*(x) = p_data(x) / (p_data(x) + p_g(x))

Moreover, the result at the end of training is D = 1/2 = 0.5; that is, at that point:

p_g(x) = p_data(x), so D*(x) = 1/2

For the detailed proof, see the original paper.
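For intuition, the key step of that proof (Proposition 1 in paper [1]), written in LaTeX: for a fixed G the value function can be expressed as an integral over x, and maximizing the integrand pointwise gives the optimal discriminator.

V(G, D) = \int_x \left[ p_{\mathrm{data}}(x) \log D(x) + p_g(x) \log\big(1 - D(x)\big) \right] dx

% For fixed a = p_data(x) and b = p_g(x), the map y -> a*log(y) + b*log(1 - y)
% attains its maximum on (0, 1) at y = a / (a + b), so pointwise:

D^*_G(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% When p_g = p_data this gives D^*(x) = 1/2, matching the statement above.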

VII. Experimental Results of the Adversarial Network

The datasets used in paper [1] include MNIST (a), TFD (b), and CIFAR-10 (c and d). For the different datasets, the original paper uses different network models.


Figure 2: Experimental results

The models are as follows.

Data Set        G Model                                       D Model
MNIST           ReLU + sigmoid activation functions           Maxout + sigmoid
TFD             Not mentioned                                 Not mentioned
CIFAR-10 (c)    Fully connected + activation functions        Maxout + sigmoid
CIFAR-10 (d)    Deconvolution layers + activation functions   Convolutional maxout + sigmoid

For a detailed model description, see the YAML files in the open-source project:

https://github.com/goodfeli/adversarial

VIII. Learning a Simulated Gaussian Distribution

The paper gives a figure of this experiment, as follows:

    • D: blue dashed line
    • Distribution of the real data x: black dotted line
    • G: green solid line

This experiment uses the adversarial network to let G(z) learn the distribution of x, where x follows a Gaussian distribution and z is uniformly distributed. Panels (a) to (d) show the process of continuous learning. At the beginning, the PDF of G(z) does not match the PDF of x, because at first G cannot map the random variable z onto the target distribution. Finally, in (d), the last stage of learning, the mapping of z (the lower of the two parallel lines at the bottom) through G() matches the distribution of x exactly (this is of course the ideal case), and the output of D is a horizontal line; as mentioned above, D() = 1/2, a constant.

TensorFlow-related code

(1) Discriminator's loss

batch = tf.Variable(0)
# D1 = D(x) on real samples, D2 = D(G(z)) on generated samples
obj_d = tf.reduce_mean(tf.log(D1) + tf.log(1 - D2))
# minimizing 1 - obj_d is equivalent to maximizing obj_d (objective function 2)
opt_d = tf.train.GradientDescentOptimizer(0.01) \
            .minimize(1 - obj_d, global_step=batch, var_list=theta_d)

(2) Generator's loss

batch = tf.Variable(0)
obj_g = tf.reduce_mean(tf.log(D2))
# minimizing 1 - obj_g is equivalent to maximizing obj_g (objective function 1)
opt_g = tf.train.GradientDescentOptimizer(0.01) \
            .minimize(1 - obj_g, global_step=batch, var_list=theta_g)

(3) Training loop (Algorithm 1, Goodfellow et al. 2014)

for i in range(TRAIN_ITERS):
    x = np.random.normal(mu, sigma, M)        # sample M real points from the Gaussian
    z = np.random.random(M)                   # sample M noise points for G
    sess.run(opt_d, {x_node: x, z_node: z})   # train D first
    z = np.random.random(M)
    sess.run(opt_g, {z_node: z})              # then train G

The code above is an example of a TensorFlow implementation of an adversarial network that learns a Gaussian distribution.
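For completeness, here is a minimal sketch of the pieces that the three snippets above assume but do not show: the placeholders x_node and z_node, a small MLP generator and discriminator, the tensors D1 = D(x) and D2 = D(G(z)) built with shared discriminator weights, and the parameter lists theta_d and theta_g. The layer sizes, initializers, and constants are illustrative assumptions, not the original author's code.

import tensorflow as tf
import numpy as np

M = 32                     # minibatch size (assumption)
mu, sigma = 4.0, 0.5       # target Gaussian for the real data (assumption)
TRAIN_ITERS = 10000        # number of training iterations (assumption)

# placeholders fed by the training loop in snippet (3)
x_node = tf.placeholder(tf.float32, shape=(M,))   # real samples
z_node = tf.placeholder(tf.float32, shape=(M,))   # noise for the generator

def mlp(inp, w1, b1, w2, b2, out_act):
    # a one-hidden-layer MLP; the 1-D batch is reshaped to (M, 1) for matmul
    h = tf.tanh(tf.matmul(tf.reshape(inp, [M, 1]), w1) + b1)
    return tf.reshape(out_act(tf.matmul(h, w2) + b2), [M])

# generator parameters theta_g
g_w1 = tf.Variable(tf.random_normal([1, 8], stddev=0.5))
g_b1 = tf.Variable(tf.zeros([8]))
g_w2 = tf.Variable(tf.random_normal([8, 1], stddev=0.5))
g_b2 = tf.Variable(tf.zeros([1]))
theta_g = [g_w1, g_b1, g_w2, g_b2]

# discriminator parameters theta_d, shared by D1 and D2
d_w1 = tf.Variable(tf.random_normal([1, 8], stddev=0.5))
d_b1 = tf.Variable(tf.zeros([8]))
d_w2 = tf.Variable(tf.random_normal([8, 1], stddev=0.5))
d_b2 = tf.Variable(tf.zeros([1]))
theta_d = [d_w1, d_b1, d_w2, d_b2]

G  = mlp(z_node, g_w1, g_b1, g_w2, g_b2, tf.identity)   # G(z), the generated samples
D1 = mlp(x_node, d_w1, d_b1, d_w2, d_b2, tf.sigmoid)    # D(x), probability of "real"
D2 = mlp(G,      d_w1, d_b1, d_w2, d_b2, tf.sigmoid)    # D(G(z))

After defining the losses and optimizers as in snippets (1) and (2), create a session with sess = tf.Session(), run sess.run(tf.global_variables_initializer()), and the training loop in snippet (3) can then be run as written.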

IX. Installing and Running Goodfellow's Paper Code

Goodfellow, the author of the adversarial nets paper, has also open-sourced his code.

(1) Project link

The adversarial repository: https://github.com/goodfeli/adversarial

(2) Download and dependency installation

    • The project depends on pylearn2, so install pylearn2 first.
    • I cloned the two projects, pylearn2 and adversarial, with git (example clone commands are sketched after this list) and added three environment variables (adjust them to your own paths):
export PYLEARN2_VIEWER_COMMAND="eog --new-instance"
export PYLEARN2_DATA_PATH=/home/data
export PYTHONPATH=/home/code
    • Other Python dependencies can be installed via pip or apt-get.
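For example, the two repositories could be fetched like this (the pylearn2 URL is an assumption based on the upstream lisa-lab project, and /home/code matches the PYTHONPATH set above; adjust both to your environment):

git clone https://github.com/lisa-lab/pylearn2.git /home/code/pylearn2
git clone https://github.com/goodfeli/adversarial.git /home/code/adversarial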

(3) Training and testing

    • Call pylearn2's train.py with mnist.yaml for training:
pylearn2/scripts/train.py ./adversarial/mnist.yaml

Test it as follows:

    • Run the following under the adversarial directory:
python show_samples_mnist_paper.py mnist.pkl
X. Related Papers and Applications of Adversarial Networks

The author of this post maintains an open-source project that collects papers related to adversarial networks.
Stars and contributions are welcome.

https://github.com/zhangqianhui/AdversarialNetsPapers

Applications of adversarial networks follow; all of these applications can be found in the open-source project above.

(1) Paper [2] uses CNNs in the adversarial network for image generation, and its D can also be used for classification, with good results.

(2) Paper [3] uses an adversarial network for video frame prediction, which addresses the problem that other algorithms easily produce blurry patches.

(3) Paper [4] applies adversarial networks to image editing and stylization in a visual manipulation application.

XI. Paper Citations

[1] Ian J. Goodfellow et al. Generative Adversarial Networks. 2014.
[2] Alec Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.
[3] Michael Mathieu et al. Deep Multi-Scale Video Prediction Beyond Mean Square Error.
[4] Jun-Yan Zhu et al. Generative Visual Manipulation on the Natural Image Manifold. ECCV 2016.
