Deep Learning in Action (1): Quickly Understand and Implement Style Transfer

Objective

A while back, Gatys et al. published a paper on using style transfer to repaint images, rendering ordinary photos in the style of famous paintings. The effect is as follows:



An ordinary picture takes on Van Gogh's style. Impressive.
Paper link: A Neural Algorithm of Artistic Style

In fact, we can also use style transfer to create our own styles; this article, for example, applies it to Chinese painting. The difficulty is that in much Chinese painting the separation between background and content is not as clear as in Western painting, and some works use only ink, purely in black and white. That makes the style harder to learn, but by adjusting the learning rate and the loss function we can still get good results. For example, the Chinese painting styles produced in this article:




Style transfer 1: Horse




Style transfer 2: Landscape

Isn't that interesting? Next, let's look at how style transfer is implemented.

Implementing style transfer depends mainly on:
Torch7
loadcaffe
the VGG-19 model

Optional:
CUDA, cuDNN, OpenCL

Setting up the environment is still fairly troublesome: you need the protobuf, loadcaffe, and torch trio, plus a few tricks. For example, your Lua version must be new enough, or packages will fail to install via luarocks. Consult the relevant installation instructions, GitHub issues, and FAQs, and resolve problems patiently.

I have organized the related code on my GitHub; interested readers can clone it directly and give it a try.

GitHub address: https://github.com/TONYCHANBB/Chinese_painting-style

Principle

Returning to the principle: the authors define two loss functions, a style loss and a content loss. Going back to the diagram from the original paper:



Combining the style loss of the style image a and the content loss of the content image p, we minimize the total loss function to obtain the result x:

$$L_{total}(\vec{p}, \vec{a}, \vec{x}) = \alpha L_{content}(\vec{p}, \vec{x}) + \beta L_{style}(\vec{a}, \vec{x})$$

Here $\alpha$ and $\beta$ are the weights of the two losses; adjusting them produces different effects.

How do we obtain these two loss functions and perform content and style reconstruction? Going back to the network structure: the authors use VGG's 16 convolutional layers and 5 pooling layers, discard the fully connected layers, and use average pooling in place of max pooling. (A diagram of the VGG network structure appears at the end of this article.)
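As a concrete illustration, here is a minimal sketch of that modification in PyTorch. The original implementation uses Torch7/Lua with the Caffe VGG-19 model loaded via loadcaffe; the use of torchvision here is my substitution.

```python
import torch.nn as nn
from torchvision import models

# Load a pre-trained VGG-19 and keep only the convolutional part;
# the fully connected layers are discarded, as in the paper.
vgg = models.vgg19(pretrained=True).features.eval()

# Swap every max-pooling layer for average pooling, which the
# authors report gives slightly more appealing results.
for i, layer in enumerate(vgg):
    if isinstance(layer, nn.MaxPool2d):
        vgg[i] = nn.AvgPool2d(kernel_size=2, stride=2)

# Freeze the weights: style transfer optimizes the input image,
# never the network parameters.
for p in vgg.parameters():
    p.requires_grad_(False)
```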


For content reconstruction, five convolutional layers of the original network are used: 'conv1_1' (a), 'conv2_1' (b), 'conv3_1' (c), 'conv4_1' (d), and 'conv5_1' (e), labeled a to e in the figure below. VGG is primarily a content-recognition network, and in practice the authors found that the first three layers (a, b, c) already give good content reconstruction; layers d and e preserve some higher-level features but lose some detail.

For style reconstruction, different subsets of the convolutional layers are used:
'conv1_1' (a),
'conv1_1' and 'conv2_1' (b),
'conv1_1', 'conv2_1' and 'conv3_1' (c),
'conv1_1', 'conv2_1', 'conv3_1' and 'conv4_1' (d),
'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1' and 'conv5_1' (e)

Building the representation this way discards the content of the image and preserves its style.
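To make this concrete, here is a small PyTorch sketch that collects the activations at 'conv1_1' through 'conv5_1'. The numeric indices are the usual positions of those layers inside torchvision's VGG-19 `features` stack; treat them as an assumption to check against the model you actually load.

```python
import torch

# Assumed indices of conv1_1 ... conv5_1 in torchvision's VGG-19
# `features` module (verify against your own model).
STYLE_LAYERS = {0: 'conv1_1', 5: 'conv2_1', 10: 'conv3_1',
                19: 'conv4_1', 28: 'conv5_1'}

def extract_features(vgg, image, wanted=STYLE_LAYERS):
    """Run `image` through `vgg`, collecting activations at the wanted layers."""
    feats = {}
    x = image
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in wanted:
            feats[wanted[i]] = x
    return feats
```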

Content loss function:

The content loss $L_{content}$ uses a squared-error loss, summed over all positions:

$$L_{content}(\vec{p}, \vec{x}, l) = \frac{1}{2}\sum_{i,j}\left(F^{l}_{ij} - P^{l}_{ij}\right)^{2}$$

$F^{l}_{ij}$ is the activation of the $i$-th filter at position $j$ in layer $l$, used to represent the content; $P^{l}_{ij}$ is the corresponding activation of the original content image $\vec{p}$; and $\vec{x}$ is the target image to be generated.
We can understand it this way: first extract the content representation $P$ from the image $p$; then construct an $x$ whose features at that layer come as close as possible to $P$, so that the content loss is minimized. Our goal is to find an $x$ whose content is infinitely close to that of $p$.
How do we find it? The authors start from a white-noise image $x$ and then use classic gradient descent to find it.
The derivative of the loss function is:

$$\frac{\partial L_{content}}{\partial F^{l}_{ij}} =
\begin{cases}
\left(F^{l} - P^{l}\right)_{ij} & \text{if } F^{l}_{ij} > 0 \\
0 & \text{if } F^{l}_{ij} < 0
\end{cases}$$

Note that because VGG uses ReLU as its activation, the derivative is piecewise: where $F^{l}_{ij}$ is less than 0, the derivative is 0. $L_{content}$ is the sum of the losses over the chosen layers.
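In code, the content loss is just half the squared error between the two feature maps; an autograd framework handles the piecewise ReLU derivative automatically, so it never has to be written out by hand. A minimal PyTorch sketch (the function name is mine):

```python
import torch

def content_loss(F_x, P_p):
    """L_content = 1/2 * sum_ij (F_ij - P_ij)^2.

    F_x: features of the generated image x at layer l.
    P_p: features of the content image p at the same layer; detached
         because p is fixed and only x is optimized.
    """
    return 0.5 * torch.sum((F_x - P_p.detach()) ** 2)
```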

Style loss function:

The style loss is constructed similarly to the content loss, but it uses combinations of several layers. To capture the correlations between features, the authors build a Gram matrix $G^{l}$ for each layer:

$$G^{l}_{ij} = \sum_{k} F^{l}_{ik} F^{l}_{jk}$$

The loss for layer $l$ is:

$$E_{l} = \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left(G^{l}_{ij} - A^{l}_{ij}\right)^{2}$$

where $A^{l}$ is the Gram matrix of the original style image at layer $l$.

The style loss function is then:

$$L_{style}(\vec{a}, \vec{x}) = \sum_{l=0}^{L} w_{l} E_{l}$$

where $w_{l}$ is the weight of each layer.

The derivative is:

$$\frac{\partial E_{l}}{\partial F^{l}_{ij}} =
\begin{cases}
\frac{1}{N_{l}^{2} M_{l}^{2}} \left(\left(F^{l}\right)^{T} \left(G^{l} - A^{l}\right)\right)_{ji} & \text{if } F^{l}_{ij} > 0 \\
0 & \text{if } F^{l}_{ij} < 0
\end{cases}$$
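A sketch of the Gram matrix and the per-layer style term in PyTorch (names are mine; feature maps are assumed to have shape (C, H, W)). Flattening each of the $N_l$ feature maps into a row of length $M_l$ turns the Gram matrix into a single matrix product, and autograd again supplies the piecewise derivative above:

```python
import torch

def gram_matrix(F):
    """G^l_ij = sum_k F^l_ik F^l_jk for features F of shape (C, H, W)."""
    C, H, W = F.shape
    f = F.view(C, H * W)       # one flattened feature map per row
    return f @ f.t()           # (C, C) feature-correlation matrix

def style_layer_loss(F_x, F_a):
    """E_l = 1/(4 N_l^2 M_l^2) * sum_ij (G^l_ij - A^l_ij)^2."""
    C, H, W = F_x.shape
    N_l, M_l = C, H * W
    G = gram_matrix(F_x)             # Gram matrix of the generated image
    A = gram_matrix(F_a).detach()    # Gram matrix of the style image (fixed)
    return torch.sum((G - A) ** 2) / (4 * N_l ** 2 * M_l ** 2)
```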

The total loss function is

$$L_{total}(\vec{p}, \vec{a}, \vec{x}) = \alpha L_{content}(\vec{p}, \vec{x}) + \beta L_{style}(\vec{a}, \vec{x})$$

which is the formula given at the beginning of the article; we simply minimize this total loss.

Notably, the parameters being optimized are no longer the network's weights W and biases b, but the input image $x$ itself, initialized as white noise.
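Putting it together: gradient descent runs on the pixels of $x$ while the VGG weights stay frozen. Below is a condensed PyTorch sketch of the loop, using L-BFGS as many reimplementations do. It relies on `vgg`, `extract_features`, `content_loss`, and `style_layer_loss` from the sketches above; the layer choices, weights, and placeholder inputs are illustrative assumptions, not the author's settings.

```python
import torch

# Placeholder inputs for illustration; in practice load and normalize
# real images the way the VGG model expects.
content_img = torch.randn(1, 3, 512, 512)
style_img = torch.randn(1, 3, 512, 512)

# Features of the fixed content and style images, computed once.
content_feats = extract_features(vgg, content_img)
style_feats = extract_features(vgg, style_img)

# Start from white noise and optimize the image itself.
x = torch.randn_like(content_img).requires_grad_(True)
optimizer = torch.optim.LBFGS([x])
alpha, beta = 1.0, 1000.0  # content/style weights, tune to taste

def closure():
    optimizer.zero_grad()
    feats = extract_features(vgg, x)
    loss = alpha * content_loss(feats['conv4_1'], content_feats['conv4_1'])
    for name in style_feats:
        loss = loss + beta * style_layer_loss(feats[name].squeeze(0),
                                              style_feats[name].squeeze(0))
    loss.backward()
    return loss

for _ in range(50):  # each L-BFGS step runs several closure evaluations
    optimizer.step(closure)
```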

Once you understand the principle, you can get to work. First clone or download the code from GitHub and run it once to make sure everything works. Then, with the principle and the code understood, modify the parameters and create your own styles.

Tips:
(1) Note that you also need to download the VGG model (place it under the current project); when running, remember to change the model path to your own.

(2) You can adjust the parameters, change the optimization algorithm, or even the network structure, and see whether the results improve. You can also apply style transfer to video.

(3) neural-style cannot save a trained model, so every style conversion requires a full optimization run, which takes a long time; running on a GPU (for example, GPU-enabled TensorFlow) is recommended.

(4) Fei-Fei Li's group at Stanford published Perceptual Losses for Real-Time Style Transfer and Super-Resolution, which replaces the per-pixel loss with a perceptual loss computed with a pre-trained VGG model, simplifying the original loss computation, and adds a transform network that directly generates the stylized content image. Interested readers can study it and do some fun things with it.

VGG network structure:



Finally, I hope this helps. If you like it, please give the repo a star, and feel free to push optimizations and interesting implementations to GitHub. Let's make something fun together.

Reference links:

Deep Learning in Action (1): Quickly Understand and Implement Style Transfer

A Neural Algorithm of Artistic Style

neural-style

Torch7

loadcaffe

VGG-19 model
