[Deep-learning-with-python] GAN image generation

Source: Internet
Author: User
Tags: keras

GAN

The Generative Adversarial Network (GAN), introduced by Goodfellow et al. in 2014, is an alternative to the VAE for learning the latent space of images. By forcing the generated images to be statistically almost indistinguishable from real ones, a GAN can produce fairly realistic synthetic images.

An intuitive way to understand a GAN is to imagine a forger trying to create a fake Picasso. At first, the forger is very bad at the task. He mixes some of his fakes with authentic Picassos and shows them all to an art dealer. The art dealer assesses the authenticity of each painting and gives the forger feedback about what makes a Picasso look like a Picasso. The forger goes back to his studio to prepare some new fakes. As time goes on, the forger becomes increasingly skilled at imitating Picasso's style, and the art dealer becomes increasingly expert at spotting fakes. In the end, the forger has some excellent fake Picassos on his hands.

That is what a GAN is: a forger network and an expert network, each trained to best the other. A GAN therefore consists of two parts:

    • Generator network: takes a random vector (a random point in the latent space) as input and decodes it into a synthetic image;
    • Discriminator network: takes an image (real or synthetic) as input and predicts whether the image came from the training set or was created by the generator network.

The generator network is trained to fool the discriminator network, so as training progresses it gradually produces more lifelike images: artificial images that cannot be distinguished from real ones, to the extent that the discriminator network is unable to tell the two apart. Meanwhile, the discriminator constantly adapts to the gradually improving generator, setting a high bar of realism for the generated images. Once training is over, the generator is capable of turning any point in its input space into a believable image. Unlike a VAE, this latent space has fewer explicit guarantees of meaningful structure; in particular, it isn't continuous.

It is important to note that a GAN is a system whose optimization minimum isn't fixed. Normally, gradient descent consists of rolling down hills in a static loss landscape. But with a GAN, every step taken down the hill changes the landscape. It is a dynamic system in which the optimization process seeks not a minimum, but an equilibrium between two forces. For this reason, GANs are notoriously difficult to train: getting a GAN to work requires a lot of careful tuning of the model architecture and training parameters.

GAN implementation

Use Keras to implement a simple GAN: a DCGAN, in which both the generator and the discriminator are deep convnets. In the generator, a Conv2DTranspose layer is used to upsample the image.

Train on CIFAR10, a dataset of 50,000 32x32 RGB images. To make training easier, only images of the "frog" class are used.

The GAN implementation follows this process:

    1. The generator network maps vectors of shape (latent_dim,) to images of shape (32, 32, 3);
    2. The discriminator network maps images of shape (32, 32, 3) to a binary classification score estimating the probability that the image is real;
    3. The gan network chains the generator and the discriminator together: gan(x) = discriminator(generator(x)). The gan network thus maps latent-space vectors to the discriminator's assessment of the realism of the images the generator decodes from those vectors;
    4. Train the discriminator on real and fake images together with "real"/"fake" labels;
    5. To train the generator, use the gradients of the gan model's loss with respect to the generator's weights. This means that, at every step, the generator's weights are moved in the direction that makes the discriminator more likely to classify the images decoded by the generator as "real". In other words, you train the generator to fool the discriminator. (A compact sketch of one such training step follows this list.)
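A compact preview of one training step, as a minimal sketch: it assumes models named generator, discriminator, and gan have already been built and compiled as in the listings that follow, and that x_train holds the real training images.

import numpy as np

latent_dim = 32
batch_size = 20

# Step 4: train the discriminator on a mix of generated and real images.
noise = np.random.normal(size=(batch_size, latent_dim))
fake_images = generator.predict(noise)
real_images = x_train[:batch_size]
combined_images = np.concatenate([fake_images, real_images])
labels = np.concatenate([np.ones((batch_size, 1)),    # 1 = generated, in this convention
                         np.zeros((batch_size, 1))])  # 0 = real
d_loss = discriminator.train_on_batch(combined_images, labels)

# Step 5: train the generator via the gan model, using misleading "real" (0) targets;
# only the generator's weights are updated, because the discriminator is frozen inside gan.
noise = np.random.normal(size=(batch_size, latent_dim))
a_loss = gan.train_on_batch(noise, np.zeros((batch_size, 1)))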
A Bag of tricks

Training GANs and tuning GAN implementations is notoriously difficult. There are a number of known tricks you should keep in mind. Like most things in deep learning, these tricks are heuristics rather than theory-backed guidelines: they are supported by an intuitive understanding of the phenomenon at hand, and they are known to work well empirically, although not necessarily in every context.
Here are some of the tricks used in the implementation of the GAN generator and discriminator below. It is not an exhaustive list of GAN-related tricks; you will find more in the GAN literature:

    • The generator uses tanh as the activation of its last layer, rather than sigmoid;
    • Sample points in the latent space from a normal (Gaussian) distribution rather than a uniform distribution;
    • Randomness helps robustness. Because GAN training results in a dynamic equilibrium, GANs can get stuck in all sorts of ways. Introducing randomness during training helps prevent this. Randomness is introduced in two ways: by using dropout in the discriminator, and by adding random noise to the discriminator's labels;
    • Sparse gradients can hinder GAN training. In deep learning, sparsity is often a desirable property, but not in GANs. Two things can induce gradient sparsity: max pooling operations and ReLU activations. Use strided convolutions for downsampling instead of max pooling, and use the LeakyReLU layer instead of a ReLU activation; it is similar to ReLU but relaxes the sparsity constraint by allowing small negative activation values;
    • In generated images, it is common to see checkerboard artifacts caused by unequal coverage of the pixel space in the generator (see figure 8.17). To fix this, whenever a strided Conv2DTranspose or Conv2D is used in the generator or discriminator, use a kernel size that is divisible by the stride size. (A small sketch illustrating several of these tricks follows this list.)
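A small sketch illustrating several of these tricks in isolation; the layer and batch sizes here are placeholders, not the ones used in the actual model below.

import numpy as np
import keras
from keras import layers

# Trick: sample latent points from a normal (Gaussian) distribution, not a uniform one.
latent_points = np.random.normal(size=(20, 32))

# Trick: add a little random noise to the discriminator's labels.
labels = np.concatenate([np.ones((20, 1)), np.zeros((20, 1))])
labels += 0.05 * np.random.random(labels.shape)

# Trick: downsample with a strided convolution instead of max pooling, and use
# LeakyReLU instead of ReLU to avoid sparse gradients. The kernel size (4) is
# divisible by the stride (2), which limits checkerboard artifacts.
inputs = keras.Input(shape=(32, 32, 128))
x = layers.Conv2D(128, 4, strides=2, padding='same')(inputs)
x = layers.LeakyReLU()(x)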

Generator

First, develop a generator model that turns a vector (from the latent space, sampled at random during training) into a candidate image. One of the many failure modes commonly seen with GANs is the generator getting stuck producing images that look like noise. A possible solution is to use dropout on both the discriminator and the generator.
GAN Generator Network

import keras
from keras import layers
import numpy as np

latent_dim = 32
height = 32
width = 32
channels = 3

generator_input = keras.Input(shape=(latent_dim,))

# Transform the input into a 16x16, 128-channel feature map
x = layers.Dense(128 * 16 * 16)(generator_input)
x = layers.LeakyReLU()(x)
x = layers.Reshape((16, 16, 128))(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

# Upsample to 32x32
x = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

# Produce a 32x32, 3-channel feature map (the shape of a CIFAR10 image)
x = layers.Conv2D(channels, 7, activation='tanh', padding='same')(x)

# Instantiate the generator model, mapping (latent_dim,) -> (32, 32, 3)
generator = keras.models.Model(generator_input, x)
generator.summary()
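As a quick sanity check (a minimal sketch, not part of the original listing), you can confirm that the generator maps latent vectors to 32x32 RGB images whose values lie in the tanh range:

# Decode a small batch of random latent vectors into images.
test_vectors = np.random.normal(size=(4, latent_dim))
test_images = generator.predict(test_vectors)
print(test_images.shape)                      # expected: (4, 32, 32, 3)
print(test_images.min(), test_images.max())   # within [-1, 1] because of the tanh output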
Discriminator

Next, develop a discriminator model that takes a candidate image (real or synthetic) as input and classifies it into one of two classes: "generated image" or "real image that comes from the training set".
GAN discriminator network

discriminator_input = layers.Input(shape=(height, width, channels))
x = layers.Conv2D(128, 3)(discriminator_input)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Flatten()(x)

# One dropout layer: an important trick
x = layers.Dropout(0.4)(x)

# Binary classification layer
x = layers.Dense(1, activation='sigmoid')(x)

discriminator = keras.models.Model(discriminator_input, x)
discriminator.summary()

# Gradient clipping (by value) and learning-rate decay stabilize training
discriminator_optimizer = keras.optimizers.RMSprop(lr=0.0008,
                                                   clipvalue=1.0,
                                                   decay=1e-8)
discriminator.compile(optimizer=discriminator_optimizer,
                      loss='binary_crossentropy')
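To see the discriminator in action before full training (a minimal sketch with made-up labels, not part of the original listing), you can score the images decoded above and take a single gradient step:

# Score the generated images: one sigmoid probability per image.
scores = discriminator.predict(test_images)
print(scores.shape)   # (4, 1)

# A single training step on a tiny, arbitrarily labeled batch (illustration only).
dummy_labels = np.ones((test_images.shape[0], 1))
loss = discriminator.train_on_batch(test_images, dummy_labels)
print(loss)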
The adversarial network

Finally, set up the GAN, which chains the generator and the discriminator. When trained, this model moves the generator in a direction that improves its ability to fool the discriminator. The model turns latent-space points into a classification decision ("fake" or "real"), and it is meant to be trained with labels that always say "these are real images". Training gan therefore updates the weights of the generator in a way that makes the discriminator more likely to predict "real" when looking at fake images. It is very important to note that the discriminator is frozen during this training (non-trainable): its weights are not updated when training gan. If the discriminator weights could be updated during this process, you would be training the discriminator to always predict "real", which is not what we want!
Adversarial network

# Set discriminator weights to non-trainable (this only applies to the gan model)
discriminator.trainable = False

gan_input = keras.Input(shape=(latent_dim,))
gan_output = discriminator(generator(gan_input))
gan = keras.models.Model(gan_input, gan_output)

gan_optimizer = keras.optimizers.RMSprop(lr=0.0004,
                                         clipvalue=1.0,
                                         decay=1e-8)
gan.compile(optimizer=gan_optimizer, loss='binary_crossentropy')
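One subtlety worth verifying (a minimal sketch, not from the original listing): in Keras, the trainable attribute is only read at compile time, so the discriminator still learns when trained directly (it was compiled before being frozen), while its weights stay fixed inside gan, which was compiled afterwards.

import numpy as np

# Check that a gan training step leaves the discriminator's weights untouched.
w_before = discriminator.get_weights()[0].copy()
gan.train_on_batch(np.random.normal(size=(2, latent_dim)),
                   np.zeros((2, 1)))
w_after = discriminator.get_weights()[0]
print(np.allclose(w_before, w_after))   # True: frozen inside gan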
Training the DCGAN

Now it's time to start training. To recapitulate, this is what the training loop looks like schematically. For each epoch, do the following:

    1. Draw random points in the latent space (random noise);
    2. Generate images with the generator using this random noise;
    3. Mix the generated images with real ones;
    4. Train the discriminator on these mixed images, with corresponding targets: either "real" (for the real images) or "fake" (for the generated images);
    5. Draw new random points in the latent space;
    6. Train gan on these random vectors, with targets that all say "these are real images." This updates the weights of the generator (only, because the discriminator is frozen inside gan) to move them toward getting the discriminator to predict "these are real images" for generated images: this trains the generator to fool the discriminator.

GAN training

import os
from keras.preprocessing import image

# Load CIFAR10 data
(x_train, y_train), (_, _) = keras.datasets.cifar10.load_data()

# Select frog images (class 6)
x_train = x_train[y_train.flatten() == 6]

# Normalize the data
x_train = x_train.reshape(
    (x_train.shape[0],) + (height, width, channels)).astype('float32') / 255.

iterations = 10000
batch_size = 20
save_dir = 'your_dir'   # directory in which to save generated images

start = 0
for step in range(iterations):
    # Sample random points in the latent space (normal distribution)
    random_latent_vectors = np.random.normal(size=(batch_size, latent_dim))

    # Decode them into fake images
    generated_images = generator.predict(random_latent_vectors)

    # Mix them with real images
    stop = start + batch_size
    real_images = x_train[start: stop]
    combined_images = np.concatenate([generated_images, real_images])

    # Assemble labels discriminating real from fake images
    labels = np.concatenate([np.ones((batch_size, 1)),
                             np.zeros((batch_size, 1))])
    # Add random noise to the labels -- an important trick
    labels += 0.05 * np.random.random(labels.shape)

    # Train the discriminator
    d_loss = discriminator.train_on_batch(combined_images, labels)

    # Sample new random points in the latent space
    random_latent_vectors = np.random.normal(size=(batch_size, latent_dim))

    # Assemble labels that say "these are all real images" (it's a lie!)
    misleading_targets = np.zeros((batch_size, 1))

    # Train the generator via the gan model, with the discriminator frozen
    a_loss = gan.train_on_batch(random_latent_vectors, misleading_targets)

    start += batch_size
    if start > len(x_train) - batch_size:
        start = 0

    # Save weights and sample images every 100 steps
    if step % 100 == 0:
        gan.save_weights('gan.h5')

        print('discriminator loss:', d_loss)
        print('adversarial loss:', a_loss)

        img = image.array_to_img(generated_images[0] * 255., scale=False)
        img.save(os.path.join(save_dir, 'generated_frog' + str(step) + '.png'))

        img = image.array_to_img(real_images[0] * 255., scale=False)
        img.save(os.path.join(save_dir, 'real_frog' + str(step) + '.png'))
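After (or during) training, the generator can be sampled directly. Here is a minimal sketch, not part of the original listing, of how you might decode and save a batch of images from the trained generator; the file names are placeholders.

import os
import numpy as np
from keras.preprocessing import image

# Decode a batch of random latent vectors with the trained generator.
random_latent_vectors = np.random.normal(size=(10, latent_dim))
generated_images = generator.predict(random_latent_vectors)

for i, img_array in enumerate(generated_images):
    img = image.array_to_img(img_array * 255., scale=False)
    img.save(os.path.join(save_dir, 'sample_frog_' + str(i) + '.png'))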

During training, you may see the adversarial loss begin to increase considerably while the discriminator loss tends toward zero: the discriminator may end up dominating the generator. If that is the case, try reducing the discriminator's learning rate and increasing the discriminator's dropout rate.

Summary
    • A GAN consists of a discriminator network coupled with a generator network. The discriminator is trained to differentiate between the output of the generator and real images from the training dataset, while the generator is trained to fool the discriminator. Remarkably, the generator never sees images from the training set directly; the information it has about the data comes from the discriminator.
    • GANs are difficult to train, because training a GAN is a dynamic process rather than a simple gradient descent process with a fixed loss landscape. Getting a GAN to train correctly requires a number of heuristic tricks, as well as extensive parameter tuning.
    • GANs can produce highly realistic images. However, unlike VAEs, the latent space they learn does not have a neat continuous structure, and thus may not be suited to certain practical applications, such as image editing via latent-space concept vectors.
