Objective
This paper first introduces generative models, and then focuses on the research and development of the generative adversarial network (GAN) within the family of generative models. Drawing on the main GAN paper, GAN application papers, and GAN-related papers, the author surveyed 45 papers from the past two years, focusing on the connections and differences between the main papers in order to reveal the research landscape of generative adversarial networks.
The papers involved are:
[1] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems. 2014: 2672-2680.
[2] Mirza M, Osindero S. Conditional generative adversarial nets[J]. Computer Science, 2014: 2672-2680.
[3] Denton E L, Chintala S, Fergus R. Deep generative image models using a Laplacian pyramid of adversarial networks[C]//Advances in Neural Information Processing Systems. 2015: 1486-1494.
[4] Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. arXiv preprint arXiv:1511.06434, 2015.
[5] Im D J, Kim C D, Jiang H, et al. Generating images with recurrent adversarial networks[J]. arXiv preprint arXiv:1602.05110, 2016.
[6] Larsen A B L, Sønderby S K, Winther O. Autoencoding beyond pixels using a learned similarity metric[J]. arXiv preprint arXiv:1512.09300, 2015.
[7] Wang X, Gupta A. Generative image modeling using style and structure adversarial networks[J]. arXiv preprint arXiv:1603.05631, 2016.
[8] Chen X, Duan Y, Houthooft R, et al. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets[J]. arXiv preprint arXiv:1606.03657, 2016.
[9] Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world[J]. arXiv preprint arXiv:1607.02533, 2016.
[10] Odena A. Semi-supervised learning with generative adversarial networks[J]. arXiv preprint arXiv:1606.01583, 2016.
[11] Springenberg J T. Unsupervised and semi-supervised learning with categorical generative adversarial networks[J]. arXiv preprint arXiv:1511.06390, 2015.
2. The Idea and Training Method of Generative Adversarial Networks
2.1 GAN
GAN [1] was inspired by the two-player zero-sum game in game theory and was first proposed in a groundbreaking paper by Goodfellow et al. at NIPS 2014. In a two-player zero-sum game, the sum of the two players' payoffs is zero or a constant: one side's gain is the other side's loss. The two players in the GAN model are played by a generative model and a discriminative model, respectively. The generative model G captures the distribution of the sample data, while the discriminative model D is a binary classifier that estimates the probability that a sample comes from the training data rather than from the generated data. G and D are generally nonlinear mapping functions, such as multilayer perceptrons or convolutional neural networks. As shown in Figure 2-1, the left part is the discriminative model: when training data x is fed in, it is expected to output a high probability (close to 1). The lower half of the right part is the generative model: its input is random noise z that follows a simple distribution (for example, a Gaussian), and its output is a generated image of the same size as the training images. When a generated sample is fed to the discriminative model D, D is expected to output a low probability (judging it to be a generated sample), while the generative model G tries to deceive D into outputting a high probability (misjudging it as a real sample), which creates the competition and confrontation.
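As a concrete illustration of the two players, here is a minimal sketch in PyTorch; the multilayer-perceptron layer sizes, the 100-dimensional noise, and the flattened 28x28 image shape are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of the two players (assumed shapes: 100-dim noise z,
# flattened 28x28 grayscale images; all layer sizes are illustrative).
class Generator(nn.Module):
    """G: maps random noise z to an image the same size as training data."""
    def __init__(self, z_dim=100, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),                     # pixel values scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """D: binary classifier estimating P(input comes from training data)."""
    def __init__(self, img_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),                  # output close to 1 means "real"
        )

    def forward(self, x):
        return self.net(x)
```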
The GAN model has no explicit loss function; its optimization process is formulated as a minimax two-player game.
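The value function V(D, G) of this game, as given in the original paper [1], is:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]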
In this value function over the discriminator network D and the generator network G, the network D is trained to maximize the probability of assigning the correct label to training samples (maximizing log D(x)), while the network G is trained to minimize log(1 - D(G(z))), i.e., to maximize D's loss. During training, one network is held fixed while the other's parameters are updated, alternating iteratively, so that each side maximizes the other's error; in the end, G can estimate the distribution of the sample data. The generative model G implicitly defines a probability distribution p_G, and we want p_G to converge to the true data distribution p_data. The paper proves that this minimax game has its optimal solution when p_G = p_data, i.e., when a Nash equilibrium is reached; at that point, the generative model G recovers the distribution of the training data, and the accuracy of the discriminative model D equals 50%.
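The alternating scheme can be sketched as a short training loop, reusing the illustrative Generator/Discriminator above; the optimizer choice, learning rate, and the non-saturating generator loss in the last step are assumptions common in practice, not prescribed by the text.

```python
import torch

# Minimal sketch of the alternating minimax training (names and
# hyperparameters are illustrative; G, D are the modules sketched above).
def train_gan(G, D, real_batches, z_dim=100, k=1, lr=2e-4):
    bce = torch.nn.BCELoss()
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    for x_real in real_batches:          # flattened real images in [-1, 1]
        n = x_real.size(0)
        ones = torch.ones(n, 1)
        zeros = torch.zeros(n, 1)
        # Step 1: fix G, update D for k iterations,
        # maximizing log D(x) + log(1 - D(G(z))).
        for _ in range(k):
            z = torch.randn(n, z_dim)
            x_fake = G(z).detach()       # detach: no gradient flows into G
            loss_d = bce(D(x_real), ones) + bce(D(x_fake), zeros)
            opt_d.zero_grad()
            loss_d.backward()
            opt_d.step()
        # Step 2: fix D, update G. Instead of minimizing log(1 - D(G(z)))
        # directly, this uses the common non-saturating heuristic of
        # maximizing log D(G(z)), which gives stronger early gradients.
        z = torch.randn(n, z_dim)
        loss_g = bce(D(G(z)), ones)
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()
```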
Figure 2-2: Algorithm flow of the generative adversarial network.
2.2 Advantages and Disadvantages of GAN
Compared to other generative models, the generative adversarial network has the following four advantages [OpenAI Ian Goodfellow's Quora Q&A]:
Judging by the actual results, they appear to produce better samples than other models (the images are sharper and clearer).
The GAN framework can train any type of generator network (in theory; in practice, it is quite difficult to train a generator network with discrete outputs using REINFORCE). Most other frameworks require the generator network to have some specific functional form, such as a Gaussian output layer. More importantly, all the other frameworks require the generator network to put non-zero mass everywhere, whereas GANs can learn to generate points only on a thin manifold that lies close to the data.
There is no need to design the model to follow any particular factorization; any generator network and any discriminator can be used.
There is no need to sample repeatedly with a Markov chain, no inference is required during learning, and the problem of approximating intractable probabilities is avoided.
Compared with PixelRNN, the runtime to generate one sample is shorter: GAN produces a sample in a single pass, while PixelRNN must generate a sample one pixel at a time.
Compared with VAE, it has no variational lower bound. If the discriminator network fits perfectly, then the generator network recovers the training distribution perfectly. In other words, generative adversarial networks are asymptotically consistent, while VAE has a certain bias.
Compared with deep Boltzmann machines, there is neither a variational lower bound nor an intractable partition function. Its samples can be generated in one pass, rather than by repeatedly applying a Markov chain operator.
Compared with GSNs, its samples can be generated in one pass rather than by repeatedly applying a Markov chain operator.
Compared with NICE and Real NVP, there is no restriction on the size of the latent code.
The main problems of GAN are the following.
Non-convergence:
At present, the basic problem is that all the theory assumes GAN should perform well at a Nash equilibrium, but gradient descent is guaranteed to reach a Nash equilibrium only in the convex case. When both players of the game are represented by neural networks, they may keep adjusting their own strategies without ever actually reaching an equilibrium [OpenAI Ian Goodfellow's Quora Q&A].
Difficult to train: the collapse problem (mode collapse):
The GAN model is defined as a minimax problem with no explicit loss function, which makes it difficult to tell whether training is making progress. The learning process of GAN may suffer from the collapse problem: the generator begins to degenerate, always producing the same sample points, and learning cannot continue. When the generative model collapses, the discriminative model also assigns similar outputs to similar sample points, and training cannot proceed [Improved Techniques for Training GANs].
No prior modeling is needed, so the model is too free and uncontrollable:
Compared with other generative models, GAN's adversarial training no longer requires assuming a data distribution in advance; that is, instead of formulating p(x), it samples directly from a distribution, so in theory it can fully approximate the real data. This is GAN's biggest advantage. However, the drawback of dispensing with prior modeling is that the model is too free: for larger images with more pixels, the plain GAN approach becomes hard to control. In GAN [1], the update schedule for the learning parameters was set to k updates of D for every single update of G, also out of similar considerations.
References
[1] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems. 2014: 2672-2680.
[2] In Depth | OpenAI Ian Goodfellow's Quora Q&A: a machine learning life in full swing. Author: Ian Goodfellow. Compiled by Machine Heart (机器之心).
[3] Barone A V M. Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders[J]. arXiv preprint arXiv:1608.02996, 2016.