DCGAN and its TensorFlow source code (GAN)


In the previous section, we mentioned that G and D are defined by multilayer perceptrons. The best model for image processing in deep learning is the CNN, so how can CNNs be combined with GANs? DCGAN is one of the best attempts in this regard. Source: https://github.com/Newmu/dcgan_code. The authors of the DCGAN paper implemented it in Theano, and they also link to implementations by other people; this article mainly discusses the TensorFlow version.
TensorFlow version of the source code: https://github.com/carpedm20/DCGAN-tensorflow

DCGAN converts the G and D described above into two convolutional neural networks (CNNs). It does not do so directly, however: DCGAN changes the structure of the convolutional networks to improve the quality of the samples and the speed of convergence. The changes are:
- Cancel all pooling layers. The G network uses transposed convolutions (transposed convolutional layers) for upsampling, and the D network replaces pooling with strided convolutions.
- Use batch normalization in both D and G.
- Remove the FC (fully connected) layers, making the networks fully convolutional.
- In G, use ReLU as the activation function, except for the last layer, which uses Tanh; in D, use LeakyReLU as the activation function in all layers.
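To make the first change concrete, below is a small shape check written with tf.keras (TF 2.x, whereas the repository itself targets TF 1.x): a stride-2 convolution learns D's downsampling, and a stride-2 transposed convolution learns G's upsampling.

# Shape check (TF 2.x, illustrative only): a strided convolution halves the
# spatial size (D's learned downsampling); a transposed convolution doubles
# it (G's learned upsampling).
import tensorflow as tf

x = tf.random.normal([1, 64, 64, 3])                       # a fake 64x64 RGB image
down = tf.keras.layers.Conv2D(32, 5, strides=2, padding="same")(x)
print(down.shape)                                          # (1, 32, 32, 32)

z = tf.random.normal([1, 8, 8, 128])                       # a small feature map
up = tf.keras.layers.Conv2DTranspose(64, 5, strides=2, padding="same")(z)
print(up.shape)                                            # (1, 16, 16, 64)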

These changes can be seen in the code. DCGAN's paper mentions three important changes to the CNN structure:
1. All convolutional net (Springenberg et al., 2014), i.e. fully convolutional networks.
Discriminator D: spatial pooling is replaced by strided convolutions, which lets the network learn its own spatial downsampling.
Generator G: fractionally strided convolutions let it learn its own spatial upsampling.
2. Eliminating the fully connected layers on top of the convolutional features.
The global average pooling proposed by (Mordvintsev et al.) helps model stability but hurts convergence speed.
The first layer of the GAN takes the uniformly distributed noise vector z as input; since it is only a matrix multiplication, it can be called a fully connected layer, but its result is reshaped into a 4-dimensional tensor and used as the start of the convolution stack.
For D, the final convolutional layer is flattened (the matrix turned into a vector) and then fed through a sigmoid function to produce the output.
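Gathering the D-side rules so far, here is a minimal tf.keras (TF 2.x) sketch of a 64x64 DCGAN-style discriminator; the base filter count df_dim = 64 and the 5x5 kernels are conventional DCGAN choices assumed here, not the repository's exact code.

# Minimal DCGAN-style discriminator sketch (TF 2.x, illustrative):
# strided convolutions instead of pooling, LeakyReLU throughout,
# batch norm, then flatten + sigmoid for the real/fake probability.
import tensorflow as tf
from tensorflow.keras import layers

def make_discriminator(df_dim=64):                 # df_dim: assumed base filter count
    return tf.keras.Sequential([
        tf.keras.Input((64, 64, 3)),
        layers.Conv2D(df_dim, 5, strides=2, padding="same"),      # 64x64 -> 32x32
        layers.LeakyReLU(0.2),
        layers.Conv2D(df_dim * 2, 5, strides=2, padding="same"),  # -> 16x16
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2D(df_dim * 4, 5, strides=2, padding="same"),  # -> 8x8
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2D(df_dim * 8, 5, strides=2, padding="same"),  # -> 4x4
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Flatten(),                          # matrix -> vector
        layers.Dense(1, activation="sigmoid"),     # probability of "real"
    ])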
Generator model: the output layer uses the Tanh function; the other layers use ReLU activations.
Discriminator model: all layers use LeakyReLU.
3. Batch normalization (batch standardization).
Batch normalization stabilizes learning by normalizing the input of each unit to zero mean and unit variance. It resolves training problems caused by bad initialization and allows gradients to propagate into deeper networks.
Batch normalization proved important for initializing the generative model and avoids model collapse: all generated samples converging to a single point (the same sample), which is a frequent failure when training GANs.
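The transform itself is only a couple of lines; a NumPy demonstration (illustrative, not repository code) of normalizing each feature of a batch to zero mean and unit variance:

# Per-feature batch normalization arithmetic on a toy batch.
import numpy as np

x = np.random.randn(64, 4) * 3.0 + 5.0            # batch of 64, mean ~5, std ~3
x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + 1e-5)
print(x_hat.mean(axis=0).round(3))                # ~[0, 0, 0, 0]
print(x_hat.std(axis=0).round(3))                 # ~[1, 1, 1, 1]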
Generator: the 100-dimensional, uniformly distributed noise vector z is projected to a convolutional representation with a small spatial extent and many feature maps. A series of four fractionally strided convolutions converts this representation into a 64x64 pixel image. No fully connected or pooling layers are needed.
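A minimal tf.keras (TF 2.x) sketch of this generator, assuming the conventional DCGAN sizes (gf_dim = 64 base filters, 5x5 kernels); it illustrates the description above rather than reproducing the repository's code.

# Minimal DCGAN-style generator sketch (TF 2.x, illustrative): project z,
# reshape to a 4x4 stack of feature maps, then four stride-2 transposed
# convolutions up to 64x64; ReLU in hidden layers, Tanh at the output.
import tensorflow as tf
from tensorflow.keras import layers

def make_generator(z_dim=100, gf_dim=64):          # gf_dim: assumed base filter count
    return tf.keras.Sequential([
        tf.keras.Input((z_dim,)),
        layers.Dense(4 * 4 * gf_dim * 8),          # the matrix-multiplication "FC" layer
        layers.Reshape((4, 4, gf_dim * 8)),        # start of the convolution stack
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(gf_dim * 4, 5, strides=2, padding="same"),  # -> 8x8
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(gf_dim * 2, 5, strides=2, padding="same"),  # -> 16x16
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(gf_dim, 5, strides=2, padding="same"),      # -> 32x32
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(3, 5, strides=2, padding="same",
                               activation="tanh"),                         # -> 64x64 image
    ])

g = make_generator()
img = g(tf.random.uniform([1, 100], -1.0, 1.0))    # z drawn from a uniform distribution
print(img.shape)                                   # (1, 64, 64, 3)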

Configuration

Python
TensorFlow
SciPy
Pillow
(optional) moviepy (https://github.com/Zulko/moviepy): for visualization
(optional) Aligned&Cropped Images.zip (http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html): the CelebA face dataset

main.py

The entry program; it defines the values of the required parameters in advance.
Execution procedure:
Training a model:
$ python main.py --dataset mnist --is_train True
$ python main.py --dataset celebA --is_train True --is_crop True
To test an existing model:
$ python main.py --dataset mnist
$ python main.py --dataset celebA --is_crop True
You can also use your own dataset:
$ mkdir data/DATASET_NAME
Add pictures to data/DATASET_NAME ...
$ python main.py --dataset DATASET_NAME --is_train True
$ python main.py --dataset DATASET_NAME
[Training results: images generated after training compared with genuine images]

Source code analysis

FLAGS configures the network parameters, which can be modified on the command line, for example:
$ python main.py --image_size <size> --output_size <size> --dataset anime --is_crop True --is_train True --epoch 300
The default parameters of this code mainly take the MNIST dataset as a template; if you want to train on other datasets, you can modify the appropriate parameters. The MNIST dataset can be downloaded with download.py.
main.py first initializes the DCGAN defined in model.py, then checks whether training is required (is_train).

FLAGS parameters

epoch: number of training epochs; default 25
learning_rate: learning rate for Adam; default 0.0002
beta1: momentum term of Adam; default 0.5
train_size: number of training images; default np.inf
batch_size: number of images per batch; default 64. The generated images are tiled into a single picture, so batch_size is best chosen as a perfect square, such as 64 or 36.
input_height: height of the input images (they will be center cropped); default 108
input_width: width of the input images (center cropped); if not explicitly specified, it defaults to input_height
output_height: height of the generated images; default 64
output_width: width of the generated images; if not explicitly specified, it defaults to output_height
dataset: name of the dataset to use, located in the data folder; you can choose celebA, mnist, or lsun, or download pictures yourself and put the folder inside the data folder
input_fname_pattern: file pattern of the input pictures; default *.jpg
checkpoint_dir: directory name for checkpoints; default checkpoint
sample_dir: directory name for generated pictures; default samples
train: True for training, False for testing; default False
crop: True for training, False for testing; default False
visualize: True to visualize, False otherwise; default False
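For reference, a condensed sketch of how such flags are declared with TF 1.x's tf.app.flags; only a few of the parameters listed above are shown, and main.py defines more of them.

# Condensed flag-declaration sketch (TF 1.x style; under TF 2.x use
# tf.compat.v1). Defaults mirror the list above.
import tensorflow as tf

flags = tf.app.flags
flags.DEFINE_integer("epoch", 25, "Number of training epochs")
flags.DEFINE_float("learning_rate", 0.0002, "Learning rate for Adam")
flags.DEFINE_float("beta1", 0.5, "Momentum term of Adam")
flags.DEFINE_integer("batch_size", 64, "Number of images per batch")
flags.DEFINE_string("dataset", "mnist", "Dataset folder name under data/")
flags.DEFINE_boolean("is_train", False, "True for training, False for testing")
FLAGS = flags.FLAGS

def main(_):
    print("epoch =", FLAGS.epoch, "dataset =", FLAGS.dataset)

if __name__ == "__main__":
    tf.app.run()                                   # parses flags, then calls main()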

model.py

model.py defines the DCGAN class, which contains 9 functions.

__init__()

Parameter initialization. input_height, input_width and the other parameters have already been described above.
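Schematically, the constructor records these hyperparameters and then assembles the model; a sketch in which the attribute names are illustrative rather than a verbatim copy of model.py:

# Schematic DCGAN.__init__(): store the hyperparameters handed over from
# the FLAGS, then build the graph. Names are illustrative, not verbatim.
class DCGAN(object):
    def __init__(self, sess, input_height=108, input_width=108,
                 batch_size=64, output_height=64, output_width=64,
                 z_dim=100, dataset_name="default",
                 checkpoint_dir=None, sample_dir=None):
        self.sess = sess                           # the TensorFlow session
        self.batch_size = batch_size
        self.input_height = input_height
        self.input_width = input_width
        self.output_height = output_height
        self.output_width = output_width
        self.z_dim = z_dim                         # dimension of the noise vector z
        self.dataset_name = dataset_name
        self.checkpoint_dir = checkpoint_dir
        self.sample_dir = sample_dir
        self.build_model()                         # assembled below in model.py

    def build_model(self):
        pass                                       # the real model.py builds G, D, and the losses here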
