Detailed PyTorch batch training and optimizer comparison

This article gives a detailed introduction to PyTorch batch training and a comparison of PyTorch's optimizers, explaining what batch training is in PyTorch and how the PyTorch optimizers behave. Readers who need this material may find it useful for reference.

I. PyTorch batch training

1. Overview

PyTorch provides a tool for wrapping data into batches for training: the DataLoader. To use it, first convert the data into torch tensors, then wrap them into a dataset format that torch recognizes (TensorDataset), and finally put that dataset into a DataLoader.

import torch
import torch.utils.data as Data

torch.manual_seed(1)    # set the random seed for reproducibility

BATCH_SIZE = 5

x = torch.linspace(1, 10, 10)     # samples: 1, 2, ..., 10
y = torch.linspace(0.5, 5, 10)    # labels:  0.5, 1.0, ..., 5.0

# convert the data to the torch Dataset format
torch_dataset = Data.TensorDataset(data_tensor=x, target_tensor=y)

# put torch_dataset into a DataLoader
loader = Data.DataLoader(
    dataset=torch_dataset,
    batch_size=BATCH_SIZE,  # batch size; if the dataset size is not divisible by batch_size,
                            # the last batch simply contains the remaining samples
    shuffle=True,           # whether to shuffle the samples randomly
    num_workers=2,          # number of worker threads used to read the data
)

for epoch in range(3):
    for step, (batch_x, batch_y) in enumerate(loader):
        print('epoch: ', epoch, '| step: ', step,
              '| batch_x: ', batch_x.numpy(), '| batch_y: ', batch_y.numpy())

'''
shuffle=True
epoch:  0 | step:  0 | batch_x:  [ 6.  7.  2.  3.  1.] | batch_y:  [ 3.   3.5  1.   1.5  0.5]
epoch:  0 | step:  1 | batch_x:  [ 9. 10.  4.  8.  5.] | batch_y:  [ 4.5  5.   2.   4.   2.5]
epoch:  1 | step:  0 | batch_x:  [ 3.  4.  2.  9. 10.] | batch_y:  [ 1.5  2.   1.   4.5  5. ]
epoch:  1 | step:  1 | batch_x:  [ 1.  7.  8.  5.  6.] | batch_y:  [ 0.5  3.5  4.   2.5  3. ]
epoch:  2 | step:  0 | batch_x:  [ 3.  9.  2.  6.  7.] | batch_y:  [ 1.5  4.5  1.   3.   3.5]
epoch:  2 | step:  1 | batch_x:  [10.  4.  8.  1.  5.] | batch_y:  [ 5.   2.   4.   0.5  2.5]

shuffle=False
epoch:  0 | step:  0 | batch_x:  [ 1.  2.  3.  4.  5.] | batch_y:  [ 0.5  1.   1.5  2.   2.5]
epoch:  0 | step:  1 | batch_x:  [ 6.  7.  8.  9. 10.] | batch_y:  [ 3.   3.5  4.   4.5  5. ]
epoch:  1 | step:  0 | batch_x:  [ 1.  2.  3.  4.  5.] | batch_y:  [ 0.5  1.   1.5  2.   2.5]
epoch:  1 | step:  1 | batch_x:  [ 6.  7.  8.  9. 10.] | batch_y:  [ 3.   3.5  4.   4.5  5. ]
epoch:  2 | step:  0 | batch_x:  [ 1.  2.  3.  4.  5.] | batch_y:  [ 0.5  1.   1.5  2.   2.5]
epoch:  2 | step:  1 | batch_x:  [ 6.  7.  8.  9. 10.] | batch_y:  [ 3.   3.5  4.   4.5  5. ]
'''

2. TensorDataset

class torch.utils.data.TensorDataset(data_tensor, target_tensor)

The TensorDataset class packages samples and their labels into a torch dataset; data_tensor and target_tensor are both tensors.
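Note that the keyword arguments data_tensor and target_tensor belong to older PyTorch releases; in more recent versions TensorDataset simply takes any number of tensors positionally and pairs them up by index. A minimal sketch of the newer form, reusing the x and y from above:

import torch
import torch.utils.data as Data

x = torch.linspace(1, 10, 10)
y = torch.linspace(0.5, 5, 10)

# newer PyTorch: pass the tensors positionally; indexing returns one (sample, label) pair
torch_dataset = Data.TensorDataset(x, y)
print(len(torch_dataset))   # 10
print(torch_dataset[0])     # the first sample and its label: (tensor(1.), tensor(0.5000))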

3. DataLoader


The class constructor is as follows:

class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False)

Here dataset is an object in the torch Dataset format; batch_size is the number of samples in each training batch (default 1); shuffle indicates whether the samples should be shuffled randomly (default False); num_workers is the number of worker threads used to read the samples (default 0, i.e. loading happens in the main process); and drop_last controls whether the last, incomplete batch is discarded when the dataset size is not divisible by batch_size (default False).
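To make batch_size and drop_last concrete, here is a small sketch reusing the ten-sample dataset from section I (written against the newer positional TensorDataset form mentioned above):

import torch
import torch.utils.data as Data

x = torch.linspace(1, 10, 10)
y = torch.linspace(0.5, 5, 10)
dataset = Data.TensorDataset(x, y)

# 10 samples with batch_size=4: the loader yields batches of 4, 4 and 2 samples
loader = Data.DataLoader(dataset, batch_size=4, shuffle=False)
print([len(batch_x) for batch_x, batch_y in loader])   # [4, 4, 2]

# with drop_last=True the incomplete final batch is discarded
loader = Data.DataLoader(dataset, batch_size=4, shuffle=False, drop_last=True)
print([len(batch_x) for batch_x, batch_y in loader])   # [4, 4]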

II. PyTorch's optimizers

In this experiment we first construct a dataset, convert it to the DataLoader format, and set it aside. We then define a fixed network structure and build one such network for each optimizer, so that the only difference between the networks is the optimizer used to train them. By recording the loss values during training, we can finally plot the optimization behaviour of each optimizer.

Code implementation:

import torch
import torch.utils.data as Data
import torch.nn.functional as F
from torch.autograd import Variable
import matplotlib.pyplot as plt

torch.manual_seed(1)    # set the random seed for reproducibility

# hyper parameters
LR = 0.01          # learning rate
BATCH_SIZE = 32    # batch size
EPOCH = 12         # number of training epochs

x = torch.unsqueeze(torch.linspace(-1, 1, 1000), dim=1)   # evenly spaced points in [-1, 1] (sample count assumed)
y = x.pow(2) + 0.1 * torch.normal(torch.zeros(*x.size())) # noisy quadratic targets
# plt.scatter(x.numpy(), y.numpy())
# plt.show()

# convert the data to the torch Dataset format
torch_dataset = Data.TensorDataset(data_tensor=x, target_tensor=y)
# put torch_dataset into a DataLoader
loader = Data.DataLoader(dataset=torch_dataset, batch_size=BATCH_SIZE,
                         shuffle=True, num_workers=2)

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(1, 20)    # hidden layer (20 units assumed)
        self.predict = torch.nn.Linear(20, 1)   # output layer

    def forward(self, x):
        x = F.relu(self.hidden(x))
        x = self.predict(x)
        return x

# create one network per optimizer
net_SGD = Net()
net_Momentum = Net()
net_RMSprop = Net()
net_Adam = Net()
nets = [net_SGD, net_Momentum, net_RMSprop, net_Adam]

# initialize the optimizers
opt_SGD = torch.optim.SGD(net_SGD.parameters(), lr=LR)
opt_Momentum = torch.optim.SGD(net_Momentum.parameters(), lr=LR, momentum=0.8)
opt_RMSprop = torch.optim.RMSprop(net_RMSprop.parameters(), lr=LR, alpha=0.9)
opt_Adam = torch.optim.Adam(net_Adam.parameters(), lr=LR, betas=(0.9, 0.99))
optimizers = [opt_SGD, opt_Momentum, opt_RMSprop, opt_Adam]

# define the loss function
loss_function = torch.nn.MSELoss()
losses_history = [[], [], [], []]   # record the loss of each network during training

for epoch in range(EPOCH):
    print('epoch: ', epoch + 1, ' training...')
    for step, (batch_x, batch_y) in enumerate(loader):
        b_x = Variable(batch_x)
        b_y = Variable(batch_y)
        for net, opt, l_his in zip(nets, optimizers, losses_history):
            output = net(b_x)                  # forward pass
            loss = loss_function(output, b_y)  # compute the loss
            opt.zero_grad()                    # clear gradients from the previous step
            loss.backward()                    # backpropagate
            opt.step()                         # apply the gradients
            l_his.append(loss.data[0])         # record the loss value

labels = ['SGD', 'Momentum', 'RMSprop', 'Adam']
for i, l_his in enumerate(losses_history):
    plt.plot(l_his, label=labels[i])
plt.legend(loc='best')
plt.xlabel('Steps')
plt.ylabel('Loss')
plt.ylim((0, 0.2))
plt.show()
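The script above targets an older PyTorch release: on version 0.4 and later, Variable has been merged into Tensor and loss.data[0] no longer works. The sketch below shows how the inner training loop would look on a newer release; it relies on the same nets, optimizers, losses_history, loader and loss_function defined in the script (and on TensorDataset being built positionally, as noted earlier):

for epoch in range(EPOCH):
    print('epoch: ', epoch + 1, ' training...')
    for step, (b_x, b_y) in enumerate(loader):   # batches are already Tensors; no Variable wrapper needed
        for net, opt, l_his in zip(nets, optimizers, losses_history):
            output = net(b_x)                    # forward pass
            loss = loss_function(output, b_y)    # compute the loss
            opt.zero_grad()                      # clear the old gradients
            loss.backward()                      # backpropagate
            opt.step()                           # apply the gradients
            l_his.append(loss.item())            # loss.item() replaces loss.data[0]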

Experimental results:

The experimental results show that SGD performs worst and converges most slowly; Momentum, as an improved version of SGD, does noticeably better; RMSprop and Adam give the best optimization results. In practice, comparing the optimizers' loss curves in this way helps us decide which one to use for a given optimization problem.

III. Other additions

1. Python's zip function

The zip function accepts any number of sequences (including zero or one) as arguments and returns their elements grouped into tuples. In Python 2 it returns a list of tuples; in Python 3 it returns an iterator, so wrap it in list() to materialize the result. When the sequences differ in length, the output stops at the shortest one.

x = [1, 2, 3]
y = [4, 5, 6]
z = [7, 8, 9]
xyz = zip(x, y, z)
print(list(xyz))    # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

x = [1, 2, 3]
x = zip(x)
print(list(x))      # [(1,), (2,), (3,)]

x = [1, 2, 3]
y = [4, 5, 6, 7]
xy = zip(x, y)
print(list(xy))     # [(1, 4), (2, 5), (3, 6)]  -- truncated to the shortest sequence
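This parallel iteration is exactly what the training loop in section II relies on: zip(nets, optimizers, losses_history) hands back one matched (network, optimizer, loss list) triple per step, so each network is updated by its own optimizer and logs into its own history. A tiny stand-alone sketch of that pattern, with plain lists standing in for the networks and optimizers:

labels = ['SGD', 'Momentum', 'RMSprop', 'Adam']
losses_history = [[], [], [], []]

# iterate both lists in lockstep: each label is paired with its own loss list
for label, l_his in zip(labels, losses_history):
    l_his.append(0.0)          # placeholder value, only to show that each list receives its own update
    print(label, l_his)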
