Style Transfer with PyTorch

Source: Internet
Author: User
Tags: pytorch

Earlier sections covered some simple applications of PyTorch in deep learning; this section explains how to use PyTorch for style transfer.

Basic Knowledge

numpy.array()
Converts a sequence, a matrix, or any object with an __array__ method into an ndarray.
array.astype()
Converts an array to the specified data type.
Tensor.squeeze()
If dim is not specified, removes all dimensions of size 1 from the tensor; if dim is specified, removes only that dimension when its size is 1.
Tensor.unsqueeze()
Inserts a dimension of size 1 at the specified position.
Tensor.type()
Returns the data type of the tensor if called without arguments; otherwise converts the tensor to the specified data type.
Tensor.mean()
If a dimension is specified, computes the mean along that dimension and returns a tensor; otherwise computes the global mean and returns a float.
Tensor.mm()
Computes the matrix product of two tensors and returns a tensor.
Tensor.clamp()
Clamps all elements of the tensor to the range [min, max].
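As a quick illustration of these operations, here is a standalone sketch with made-up values, independent of the style-transfer code below:

import numpy as np
import torch

a = np.array([[1.0, 2.0], [3.0, 4.0]])   # numpy.array(): build a matrix from a sequence
a = a.astype(np.float32)                  # array.astype(): convert the data type
t = torch.from_numpy(a)                   # a 2 x 2 FloatTensor
t4 = t.unsqueeze(0)                       # insert a dim of size 1 -> shape (1, 2, 2)
t2 = t4.squeeze()                         # remove dims of size 1 -> back to (2, 2)
print(t2.type())                          # no argument: returns the data type
d = t2.type(torch.DoubleTensor)           # with an argument: converts to that type
print(d.mean())                           # global mean
print(d.mean(0))                          # mean along dim 0, returns a tensor
print(torch.mm(d, d.t()))                 # matrix product of two tensors
print(d.clamp(1.5, 3.5))                  # clamp all elements to [1.5, 3.5]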

Style Transfer in Practice

What we want to achieve is actually quite clear: merge two images, which means we must define what "merged" means. First, the fused image should be similar to the content image in content; second, it should be similar to the style image in style. So we know what needs to be done: quantify the difference between the fused image and the content image, then make that difference as small as possible; and quantify the difference in style between the fused image and the style image, then make that difference small as well. This gives us a quantitative goal.

How do we define the difference in content? The simplest idea is to compare the two images pixel by pixel, that is, take their difference. Since raw differences can be positive or negative, we square them so that every term is positive. An absolute value would also work, but absolute values break differentiability; if that point is unclear, it does not matter here, just remember that the universal choice is the square.
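In code this is a mean squared error between feature maps, the same line that appears in the training loop below. A minimal sketch, with random tensors standing in for real features:

import torch

f1 = torch.rand(1, 64, 32, 32)              # features of the fused (target) image
f2 = torch.rand(1, 64, 32, 32)              # features of the content image
content_loss = torch.mean((f1 - f2) ** 2)   # squared differences, averaged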

How do we define the difference in style? This is the hard part, and it is the innovation of the original paper: the Gram matrix is introduced to measure style differences. I will try to explain it in plain language rather than in the language of mathematics.

How is the Gram matrix defined? First, its size is determined by the depth (the number of channels C) of the feature map: it is C x C. Next, what does each element Gram(i, j) equal? Take channels i and j of the feature map; each is an h x w matrix. Multiply the two matrices element by element and sum the products: that sum is Gram(i, j). Every element of the Gram matrix can be obtained the same way. In this way, each element of the Gram matrix represents the correlation between a pair of channels of the feature map, and this is what we define as the style.
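In code, a sketch with a random feature map standing in for a real layer output (c is the channel count, the "depth" above):

import torch

c, h, w = 64, 32, 32               # channels, height, width of a feature map
features = torch.rand(c, h, w)     # stand-in for one layer's output
f = features.view(c, h * w)        # flatten each channel into a row
gram = torch.mm(f, f.t())          # c x c; gram[i, j] = sum over positions of channel_i * channel_j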

Author: Sherlockliao Link: https://www.jianshu.com/p/8f8fc2aa80b3 Source: Jianshu
Copyright belongs to the author. Commercial reprint please contact the author for authorization, non-commercial reprint please specify the source.

import torch
import torch.nn as nn
from torch.autograd import Variable
import torchvision
from torchvision import transforms, models
from PIL import Image
import argparse
import numpy as np
import os

Global variables: whether a GPU is available, and the corresponding tensor type.

use_gpu = torch.cuda.is_available()
dtype = torch.cuda.FloatTensor if use_gpu else torch.FloatTensor

Define the image-loading function, which opens a PIL image and converts it to a tensor.

def load_image(image_path, transform=None, max_size=None, shape=None):
    image = Image.open(image_path)

    if max_size is not None:
        # Get the image size, a (width, height) sequence
        image_size = image.size
        # Convert to a float array and scale so the longer side equals max_size
        size = np.array(image_size).astype(float)
        size = max_size / max(size) * size
        image = image.resize(size.astype(int), Image.ANTIALIAS)

    if shape is not None:
        image = image.resize(shape, Image.LANCZOS)

    # A transform containing ToTensor must be provided; unsqueeze(0) makes it a 4D tensor
    if transform is not None:
        image = transform(image).unsqueeze(0)

    # Copy to the GPU if one is available
    return image.type(dtype)
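A quick usage sketch, assuming the definitions above; the image path and max_size here are placeholders:

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))])
content = load_image('/home/content.jpg', transform, max_size=400)
print(content.size())   # a 4D tensor, e.g. torch.Size([1, 3, 300, 400])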

Define the VGG19 model; the forward pass extracts the feature maps after convolution layers 0, 5, 10, 19, and 28.

class VGGNet(nn.Module):
    def __init__(self):
        super(VGGNet, self).__init__()
        # Indices (as strings) of the layers whose outputs are kept
        self.select = ['0', '5', '10', '19', '28']
        self.vgg19 = models.vgg19(pretrained=True).features

    def forward(self, x):
        features = []
        # name is a str, x is a Variable
        for name, layer in self.vgg19._modules.items():
            x = layer(x)
            if name in self.select:
                features.append(x)
        return features
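For example, continuing from the load_image sketch above, the network returns a list of 5 feature maps:

vgg = VGGNet()
if use_gpu:
    vgg = vgg.cuda()
features = vgg(Variable(content))
for f in features:
    print(f.size())   # channel counts 64, 128, 256, 512, 512, at decreasing spatial sizes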

Define the main function, which extracts the content and style features from the 5 selected convolution layers and computes content_loss and style_loss.

def main(config):
    # Define the image transform; ToTensor() must be included
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406),
                             (0.229, 0.224, 0.225))])

    # Load the content and style images; resize the style image to the content image's size
    content = load_image(config.content, transform, max_size=config.max_size)
    style = load_image(config.style, transform, shape=[content.size(2), content.size(3)])

    # Clone the content image as the target; it is the final output, so it needs gradients
    target = Variable(content.clone(), requires_grad=True)
    optimizer = torch.optim.Adam([target], lr=config.lr, betas=[0.5, 0.999])

    vgg = VGGNet()
    if use_gpu:
        vgg = vgg.cuda()

    for step in range(config.total_step):
        # Compute the 5 feature maps for each of the three images
        target_features = vgg(target)
        content_features = vgg(Variable(content))
        style_features = vgg(Variable(style))

        content_loss = 0.0
        style_loss = 0.0
        for f1, f2, f3 in zip(target_features, content_features, style_features):
            # Compute content_loss
            content_loss += torch.mean((f1 - f2)**2)

            n, c, h, w = f1.size()
            # Reshape the features into 2D matrices and multiply to get the Gram matrices
            f1 = f1.view(c, h * w)
            f3 = f3.view(c, h * w)
            f1 = torch.mm(f1, f1.t())
            f3 = torch.mm(f3, f3.t())
            # Compute style_loss
            style_loss += torch.mean((f1 - f3)**2) / (c * h * w)

        # Compute the total loss
        loss = content_loss + style_loss * config.style_weight

        # Backpropagation and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (step+1) % config.log_step == 0:
            print('Step [%d/%d], Content Loss: %.4f, Style Loss: %.4f'
                  % (step+1, config.total_step, content_loss.data[0], style_loss.data[0]))

        if (step+1) % config.sample_step == 0:
            # Save the generated image
            denorm = transforms.Normalize((-2.12, -2.04, -1.80),
                                          (4.37, 4.46, 4.44))
            img = target.clone().cpu().squeeze()
            img = denorm(img.data).clamp_(0, 1)
            torchvision.utils.save_image(img, 'output-%d.png' % (step+1))
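Note that the denorm transform simply inverts the earlier normalization: its per-channel mean is -mean/std and its std is 1/std (for example -0.485/0.229 ≈ -2.12 and 1/0.229 ≈ 4.37), which maps the tensor back to the [0, 1] range before saving.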

Get the parameters from the command line.

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--content', type=str, default='/home/content.jpg')
    parser.add_argument('--style', type=str, default='/home/style.jpg')
    parser.add_argument('--max_size', type=int, default=400)  # value truncated in the source; 400 assumed
    parser.add_argument('--total_step', type=int, default=5000)
    parser.add_argument('--log_step', type=int, default=10)
    parser.add_argument('--sample_step', type=int, default=1000)
    parser.add_argument('--style_weight', type=float, default=100)
    parser.add_argument('--lr', type=float, default=0.003)
    config = parser.parse_args()
    print(config)
    main(config)
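Assuming the script is saved as style_transfer.py (the filename is not given in the post), a typical run from the command line looks like:

python style_transfer.py --content /home/content.jpg --style /home/style.jpg --style_weight 100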

The experimental results are as follows:

[result images from the original post are not reproduced here]
