Generative Adversarial Networks (GANs) Applied to Image Classification

Source: Internet
Author: User
Tags: generator, generative adversarial networks

In recent years, deep learning has been widely applied to data processing tasks involving images, speech, and text. Generative adversarial networks (GANs) and reinforcement learning (RL) have become the two "pearls" of the deep learning field. Reinforcement learning is mainly used for decision-making problems; its best-known applications are in games, such as the AlphaGo of the DeepMind team. Because my research direction is supervised image classification, this article mainly explains generative adversarial networks and their application to classification problems.

The Adversarial Network Framework

A generative adversarial network (GAN) is a learning framework first proposed by Ian J. Goodfellow in 2014. Goodfellow himself may not be familiar to everyone, but his advisor is Yoshua Bengio, one of the "three giants of deep learning" (the other two being Hinton and LeCun). It is worth mentioning that the Theano deep learning framework, an early pioneer of symbolic computation, was also developed by their team. On the position of GANs in the field of machine learning, here we cite an evaluation from LeCun:

"There is many interesting recent development in deep learning ... The most important one and in my opinion are adversarial training (also called GAN for generative adversarial Networks). This, and the variations, which is now being proposed was the most interesting idea in the last years in ML, in my opinio N. "

A traditional generative model needs to define a parametric expression of a probability distribution and then train the model by maximizing the likelihood function; the restricted Boltzmann machine (RBM) is one example. The gradient expressions of such models usually contain expectation terms that are difficult to evaluate exactly and generally require approximation; in an RBM, for example, a Markov chain must be run to convergence to draw samples from the given distribution. To overcome these difficulties of solution accuracy and computational complexity, Goodfellow creatively proposed the generative adversarial network. A GAN does not need to represent the likelihood function of the data directly, yet it can generate samples with the same distribution as the original data.
Unlike conventional deep learning models (such as CNNs, DBNs, and RNNs), the GAN model employs two independent neural networks, called the "generator" and the "discriminator". The generator maps an input noise signal to a high-dimensional sample that looks like a real sample; the discriminator is used to distinguish the samples produced by the generator from the actual training samples (a binary classification problem). The framework of the model structure is described below.

GANs are based on a minimax mechanism rather than the usual single optimization problem: the loss function is defined so that it is maximized over the discriminator and minimized over the generator. The authors also prove that the GAN model eventually converges, with the discriminator model and the generator model each reaching an optimal solution. Let x represent the sample data, p_{z}(z) the input noise distribution of the generator, G(z; \theta_{g}) the mapping from noise to the sample space, and D(x) the probability that x is a real sample rather than a generated one. Then the GAN model can be defined as the following optimization problem:
\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)}[\log(1 - D(G(z)))]

From the above formula, two things happen during training. On the one hand, the discriminator D must be updated to maximize the value function V, that is, to maximize D(x) and minimize D(G(z)); mathematically, this maximizes the classifier's accuracy at separating training samples from generated samples. On the other hand, the generator G must be updated to minimize the value function V, that is, to maximize D(G(z)); mathematically, this pushes the generator to produce samples very similar to the training samples. This tension is the origin of "adversarial" in the GAN name. Goodfellow proposed alternating the optimization of D and G (k optimization steps for D, then one step for G); the training procedure alternates these two updates.
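To make this concrete, here is a minimal PyTorch sketch of the alternating scheme. The layer sizes, learning rates, and k = 1 default are illustrative assumptions, not settings from the paper:

```python
import torch
import torch.nn as nn

# Minimal fully connected generator and discriminator (illustrative sizes).
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(x_real, k=1):
    batch = x_real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # k steps on the discriminator: maximize log D(x) + log(1 - D(G(z))),
    # implemented as minimizing the equivalent binary cross-entropy.
    for _ in range(k):
        z = torch.randn(batch, 100)
        x_fake = G(z).detach()          # do not backprop into G here
        loss_d = bce(D(x_real), ones) + bce(D(x_fake), zeros)
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # One step on the generator. Instead of minimizing log(1 - D(G(z)))
    # directly, use the common non-saturating surrogate: maximize log D(G(z)).
    z = torch.randn(batch, 100)
    loss_g = bce(D(G(z)), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

# Example usage with random stand-in data shaped like flattened 28x28 images.
print(train_step(torch.rand(64, 784) * 2 - 1))
```

The generator step above uses the non-saturating variant (maximizing log D(G(z))), which the original paper recommends over directly minimizing log(1 - D(G(z))) to avoid vanishing gradients early in training.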

Application of GANs to Classification Problems

Early GAN models were mainly applied to unsupervised learning tasks, that is, generating data with the same distribution as the training samples, whether 1-D signals or two-dimensional images. Applying a GAN to classification problems requires modifying the network. Below is a brief explanation of the approaches presented in two papers, "Improved Techniques for Training GANs" and "Semantic Segmentation using Adversarial Networks"; the former is a semi-supervised classification algorithm, while the latter is a supervised classification algorithm.

Semi-Supervised Classification Method

When a GAN is applied to a semi-supervised classification task, only a slight change to the original GAN structure is needed: the output layer of the discriminator model is replaced with a softmax classifier. Suppose the training data has C classes; then when training the GAN, the samples simulated by the generator are assigned to class C+1, and the softmax classifier gains one extra output neuron representing the probability that the discriminator's input is "fake data", where "fake data" refers to the samples generated by the generator. Because the model can take advantage of labeled training samples and can also learn from unlabeled and generated data, it is called "semi-supervised" classification. The loss function combines a supervised term with an unsupervised term, where L_{unsupervised} is the standard GAN optimization problem; the specific training method for the model can be found in the original paper.
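As a rough sketch, the C+1-class discriminator head and its combined loss might look as follows in PyTorch. This is a simplified rendering of the loss from "Improved Techniques for Training GANs"; the layer sizes and the plain softmax head are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

C = 10  # number of real classes; class index C is the extra "fake" class

# Illustrative discriminator: C+1 logits, the last one for generated samples.
disc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, C + 1))

def semi_supervised_loss(x_labeled, y, x_unlabeled, x_generated):
    # Supervised term: labeled data must land in its true class 0..C-1.
    loss_sup = F.cross_entropy(disc(x_labeled), y)

    # Unsupervised "real" term: unlabeled data should NOT be class C,
    # i.e. minimize the probability assigned to the fake class.
    p_fake_real = F.softmax(disc(x_unlabeled), dim=1)[:, C]
    loss_real = -torch.log(1 - p_fake_real + 1e-8).mean()

    # Unsupervised "fake" term: generated data SHOULD be class C.
    fake_target = torch.full((x_generated.size(0),), C, dtype=torch.long)
    loss_fake = F.cross_entropy(disc(x_generated), fake_target)

    return loss_sup + loss_real + loss_fake

# Example usage with random stand-in tensors.
loss = semi_supervised_loss(torch.randn(8, 784), torch.randint(0, C, (8,)),
                            torch.randn(8, 784), torch.randn(8, 784))
print(loss.item())
```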

Supervised Classification Method

It is conceivable that when a GAN is applied to pixel-wise supervised classification (the training dataset in the paper is similar to a face recognition dataset, except that the label y of a single image is a label map of the same size as the input face image), the generator model of the original GAN plays no role. The authors' network framework contains two models: one performs pixel-wise classification of a single image, and the other, called the adversarial network, is used to distinguish the ground-truth label map from the predicted probability map. The purpose of the proposed network is to make the resulting probability prediction map more consistent with the real label map. The specific formulation is as follows.

Let the training set be \{(x_{n}, y_{n}), n = 1, \ldots, N\}, let s(x) denote the predicted probability map, let a(x, y) denote the adversarial network's predicted probability that y is the true label map of x, and let \theta_{s}, \theta_{a} denote the parameters of the segmentation model and the adversarial model respectively. The loss function can then be defined as

\ell(\theta_{s}, \theta_{a}) = \sum_{n=1}^{N} \ell_{mce}(s(x_{n}), y_{n}) - \lambda \left[\ell_{bce}(a(x_{n}, y_{n}), 1) + \ell_{bce}(a(x_{n}, s(x_{n})), 0)\right]
Here \ell_{mce}(\hat{y}, y) represents the multi-class cross-entropy loss between the predicted probability map \hat{y} and the real label map y, while \ell_{bce}(\hat{z}, z) = -[z \ln \hat{z} + (1-z) \ln(1-\hat{z})] represents the binary cross-entropy loss. Similar to the training method of a GAN, the model is trained by alternately training the adversarial model and the segmentation model. Training the adversarial model is equivalent to minimizing the following expression, whose meaning is to make the adversarial model better at distinguishing predicted probability maps from real label maps:
\sum_{n=1}^{N} \ell_{bce}(a(x_{n}, y_{n}), 1) + \ell_{bce}(a(x_{n}, s(x_{n})), 0)
Training the segmentation model is equivalent to minimizing the following expression, whose meaning is that the generated probability map should not only be similar to the corresponding label map but also be difficult for the adversarial model to tell apart from it:

\sum_{n=1}^{N} \ell_{mce}(s(x_{n}), y_{n}) - \lambda\, \ell_{bce}(a(x_{n}, s(x_{n})), 0)
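Here is a minimal PyTorch sketch of this alternating update. The toy convolutional stand-ins for s and a and the value of λ are illustrative assumptions, not the architectures from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K, H, W = 5, 16, 16          # number of classes and toy image size
lam = 0.1                    # weighting factor lambda (illustrative value)

# Toy stand-ins: seg maps an image to per-pixel class logits; adv maps an
# (image, label-map) pair, channel-concatenated, to a real/fake probability.
seg = nn.Conv2d(3, K, kernel_size=3, padding=1)
adv = nn.Sequential(nn.Conv2d(3 + K, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(8, 1), nn.Sigmoid())

opt_s = torch.optim.Adam(seg.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adv.parameters(), lr=1e-3)

def a_of(x, y_map):
    return adv(torch.cat([x, y_map], dim=1))

def train_step(x, y):                       # y: (B, H, W) integer labels
    y_onehot = F.one_hot(y, K).permute(0, 3, 1, 2).float()
    ones = torch.ones(x.size(0), 1)
    zeros = torch.zeros(x.size(0), 1)

    # Adversarial step: distinguish real label maps (target 1)
    # from predicted probability maps (target 0).
    s_x = F.softmax(seg(x), dim=1).detach()  # no gradient into seg here
    loss_a = (F.binary_cross_entropy(a_of(x, y_onehot), ones)
              + F.binary_cross_entropy(a_of(x, s_x), zeros))
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

    # Segmentation step: multi-class cross-entropy minus lambda times the
    # bce term, so that fooling the adversarial model is rewarded.
    s_x = F.softmax(seg(x), dim=1)
    loss_s = (F.cross_entropy(seg(x), y)
              - lam * F.binary_cross_entropy(a_of(x, s_x), zeros))
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return loss_a.item(), loss_s.item()

# Example usage with random stand-in data.
print(train_step(torch.randn(4, 3, H, W), torch.randint(0, K, (4, H, W))))
```

Luc et al. note that, in practice, the -\lambda \ell_{bce}(a(x, s(x)), 0) term in the segmentation update can be replaced by +\lambda \ell_{bce}(a(x, s(x)), 1), analogous to the non-saturating generator trick in the original GAN paper.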

References:
Ian J. Goodfellow et al. Generative Adversarial Nets. 2014.
Tim Salimans, Ian Goodfellow, et al. Improved Techniques for Training GANs. 2016.
Pauline Luc et al. Semantic Segmentation using Adversarial Networks. 2016.
