ICCV 2017 paper "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks" reading notes
Title: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
Authors: Jun-Yan Zhu, Taesung Park, et al., Berkeley AI Research (BAIR) Lab
Project page: https://junyanz.github.io/CycleGAN/
Brief Introduction

This paper mainly addresses one problem: paired (one-to-one) training data for image-to-image translation is hard to obtain. In the supervised setting, each generated image is matched against an exact ground truth (a pair), and such ground truth is difficult to obtain in some situations (for example, style transfer). The authors therefore aim to achieve image translation without paired examples. Translation from unpaired examples can be described as follows: an image drawn from distribution $X$ is passed through a deep network so that its output follows distribution $Y$. The common approach is a GAN: train a network $G$ so that the generated images $y^* = G(x)$ match the distribution of the real $y$ and are hard to distinguish from it. However, this kind of training is under-constrained: many inputs can be mapped to the same output, i.e., there is no one-to-one correspondence between inputs and outputs.
The authors argue that such a translation should satisfy cycle consistency: if a mapping $G$ takes $x \rightarrow y$ and $F: y \rightarrow x$, then $G$ and $F$ should be inverses of each other. The authors train the two networks $G$ and $F$ jointly and impose a cycle-consistency loss so that $F(G(x)) \approx x$ and $G(F(y)) \approx y$; combining this loss with the adversarial losses achieves complete unpaired image-to-image translation. The model presented in this paper is called CycleGAN.
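To make the cycle idea concrete, here is a minimal PyTorch sketch of the cycle-consistency loss. It is only a sketch: the single-convolution "generators", image sizes, and variable names are placeholders I chose (the paper uses ResNet-based generators), though the paper does penalize cycle errors with an L1 norm.

```python
import torch
import torch.nn as nn

# Placeholder generators; the paper's actual networks are ResNet-based.
G = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for G: X -> Y
F = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for F: Y -> X

l1 = nn.L1Loss()  # the cycle error is penalized with an L1 norm

x = torch.randn(1, 3, 256, 256)  # an image from domain X
y = torch.randn(1, 3, 256, 256)  # an image from domain Y

# Forward cycle x -> G(x) -> F(G(x)) should return to x;
# backward cycle y -> F(y) -> G(F(y)) should return to y.
cycle_loss = l1(F(G(x)), x) + l1(G(F(y)), y)
print(cycle_loss.item())
```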
Model Settings

As mentioned above, CycleGAN trains two mappings $G: X \rightarrow Y$ and $F: Y \rightarrow X$. Correspondingly there are two adversarial discriminators $D_X$ and $D_Y$, where $D_X$ is trained to distinguish the real images $\{x\}$ from the translated images $\{F(y)\}$, and $D_Y$ likewise distinguishes $\{y\}$ from $\{G(x)\}$. The training objective has two parts: the adversarial loss makes the generated images match the target distribution, while the cycle-consistency loss keeps the learned mappings $G$ and $F$ from contradicting each other.
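A sketch of how the four networks are wired up, with placeholder architectures (the paper uses ResNet generators and 70x70 PatchGAN discriminators; the layers below are purely illustrative):

```python
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, kernel_size=3, padding=1)    # G: X -> Y (placeholder)
F = nn.Conv2d(3, 3, kernel_size=3, padding=1)    # F: Y -> X (placeholder)
D_X = nn.Conv2d(3, 1, kernel_size=4, stride=2)   # scores images in domain X
D_Y = nn.Conv2d(3, 1, kernel_size=4, stride=2)   # scores images in domain Y

x = torch.randn(1, 3, 64, 64)
y = torch.randn(1, 3, 64, 64)

# D_X must tell real images {x} apart from translated images {F(y)};
# D_Y does the same for {y} versus {G(x)}.
real_x_score, fake_x_score = D_X(x), D_X(F(y))
real_y_score, fake_y_score = D_Y(y), D_Y(G(x))
```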
1. Adversarial Loss

The standard adversarial loss follows reference [14] (the original GAN formulation):

$$\mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))]$$
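As a sketch, this loss can be written with binary cross-entropy on discriminator outputs; the function names and the assumption that the discriminator returns raw logits are mine, not the paper's:

```python
import torch
import torch.nn.functional as Fn

def d_loss(real_logits: torch.Tensor, fake_logits: torch.Tensor) -> torch.Tensor:
    # D_Y maximizes log D_Y(y) + log(1 - D_Y(G(x))), i.e. minimizes the
    # BCE against labels 1 (real) and 0 (fake).
    return (Fn.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + Fn.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

def g_loss(fake_logits: torch.Tensor) -> torch.Tensor:
    # G tries to fool D_Y; the common non-saturating form maximizes
    # log D_Y(G(x)), i.e. minimizes BCE against label 1.
    return Fn.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
```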
However, it has been pointed out that replacing the negative log-likelihood with a least-squares objective stabilizes training: $G$ is trained to minimize $\mathbb{E}_{x \sim p_{data}(x)}[(D_Y(G(x)) - 1)^2]$, and $D_Y$ to minimize $\mathbb{E}_{y \sim p_{data}(y)}[(D_Y(y) - 1)^2] + \mathbb{E}_{x \sim p_{data}(x)}[D_Y(G(x))^2]$.
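A minimal sketch of the least-squares variant, with squared errors against targets 1 (real) and 0 (fake) in place of the log terms (function names are illustrative):

```python
import torch

def lsgan_d_loss(real_scores: torch.Tensor, fake_scores: torch.Tensor) -> torch.Tensor:
    # D_Y minimizes E[(D_Y(y) - 1)^2] + E[D_Y(G(x))^2]
    return ((real_scores - 1) ** 2).mean() + (fake_scores ** 2).mean()

def lsgan_g_loss(fake_scores: torch.Tensor) -> torch.Tensor:
    # G minimizes E[(D_Y(G(x)) - 1)^2]
    return ((fake_scores - 1) ** 2).mean()
```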
The final learning objectives are $G^* = \arg\min_G \max_{D_Y} \mathcal{L}_{GAN}(G, D_Y, X, Y)$ and $F^* = \arg\min_F \max_{D_X} \mathcal{L}_{GAN}(F, D_X, Y, X)$.
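Putting the pieces together, here is a sketch of one generator update combining the two adversarial terms with the cycle-consistency term. The networks are placeholders and the training-loop details are my assumptions; the weighting $\lambda = 10$ matches the setting reported in the paper.

```python
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, kernel_size=3, padding=1)      # G: X -> Y (placeholder)
F_net = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # F: Y -> X (placeholder)
D_X = nn.Conv2d(3, 1, kernel_size=4, stride=2)     # discriminator stand-ins
D_Y = nn.Conv2d(3, 1, kernel_size=4, stride=2)

l1 = nn.L1Loss()
lambda_cyc = 10.0  # cycle-loss weight reported in the paper
opt = torch.optim.Adam(list(G.parameters()) + list(F_net.parameters()), lr=2e-4)

x = torch.randn(1, 3, 64, 64)
y = torch.randn(1, 3, 64, 64)

fake_y, fake_x = G(x), F_net(y)
# Least-squares adversarial terms for both translation directions...
adv = ((D_Y(fake_y) - 1) ** 2).mean() + ((D_X(fake_x) - 1) ** 2).mean()
# ...plus the two cycle-consistency terms.
cyc = l1(F_net(fake_y), x) + l1(G(fake_x), y)

opt.zero_grad()
(adv + lambda_cyc * cyc).backward()
opt.step()
```

In a full training loop the discriminators $D_X$ and $D_Y$ would be updated in alternation with their own least-squares losses; only the generator step is sketched here.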