Neural Network Architecture: A PyTorch Feed-Forward Neural Network

Source: Internet
Author: User
Tags: shuffle, pytorch, dataloader

First, you need to be familiar with how to implement a feed-forward neural network in PyTorch. To keep things easy to follow, we use a feed-forward network with a single hidden layer as the example.

The source code and comments for this network are as follows. It is fairly straightforward, so we will not dwell on it:

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)   # input layer -> hidden layer
        self.relu = nn.ReLU()                           # hidden activation: ReLU sets every element of the input tensor that is smaller than zero to zero
        self.fc2 = nn.Linear(hidden_size, num_classes)  # hidden layer -> output layer

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

Next, let's look at how to instantiate and use the network. To improve computational efficiency, we move the model to the GPU when one is available. Note that input_size must match the size of a flattened training image (28 × 28 = 784 for MNIST).

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = NeuralNet(input_size, hidden_size, num_classes).to(device)

To train the network, you must define a loss function that describes how well the model solves the problem: the smaller the loss, the smaller the deviation between the model's output and the actual value. Here we use nn.CrossEntropyLoss(). For the optimizer we use Adam, an algorithm for optimizing stochastic objective functions based on first-order gradients; its detailed concepts and derivation are left for further analysis.

criterion = nn.CrossEntropyLoss()  # for single-target classification; combines nn.LogSoftmax() and nn.NLLLoss() to compute the loss
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  # optimizer, configured with the model's parameters and the learning rate
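If it helps to see that claim concretely, here is a minimal sketch (the logits and targets tensors are hypothetical values invented for illustration) showing that nn.CrossEntropyLoss() produces the same result as applying nn.LogSoftmax followed by nn.NLLLoss:

import torch
import torch.nn as nn

logits = torch.randn(4, 10)            # hypothetical raw scores: 4 samples, 10 classes
targets = torch.tensor([1, 0, 4, 9])   # hypothetical class labels

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)
print(torch.allclose(ce, nll))         # True: the two computations agree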

The next step is to train the model. This part is a bit harder to follow, so let's look at the code first and then walk through each function:

total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Move tensors to the configured device
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()  # reset the gradients to zero, i.e. set the derivative of the loss with respect to the weights back to 0
        loss.backward()
        optimizer.step()

To train the model, first reshape each image matrix into a flat 784-element vector (28 × 28). Then bind the tensors to the configured device.
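As a quick illustration (the batch tensor here is hypothetical), reshape(-1, 28*28) flattens every image while letting PyTorch infer the batch dimension from the -1:

import torch

batch = torch.randn(100, 1, 28, 28)   # hypothetical MNIST batch: 100 images, 1 channel, 28x28 pixels
flat = batch.reshape(-1, 28*28)       # -1 lets PyTorch infer the batch size
print(flat.shape)                     # torch.Size([100, 784])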

Then comes the forward propagation of the network:

outputs = model(images)

Then the outputs and the previously loaded labels are passed to the loss function to obtain the loss:

loss = criterion(outputs, labels)

After the loss is computed, it is backpropagated. Note that this operation is performed only during training; during testing only the forward pass is run.

loss.backward()

The gradients are computed during backpropagation, and the parameters then need to be updated based on those gradients; optimizer.step() performs this update. After optimizer.step(), you can inspect the gradient and weight of each layer via optimizer.param_groups[0]['params'].

optimizer.step()
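If you want to see that for yourself, here is a minimal self-contained sketch (the tiny nn.Linear model and the dummy loss are stand-ins invented for illustration) that runs one optimization step and then inspects the parameters through optimizer.param_groups:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                                    # tiny stand-in model for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

loss = model(torch.randn(3, 4)).sum()                      # dummy forward pass producing a scalar loss
optimizer.zero_grad()
loss.backward()
optimizer.step()

params = optimizer.param_groups[0]['params']               # the parameter tensors tracked by the optimizer
print(params[0])                                           # the weight tensor after the update
print(params[0].grad)                                      # its gradient, populated by loss.backward()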

To test the model, we disable gradient computation, which greatly reduces memory usage and improves computational efficiency. In the test phase, a single key statement produces the model's predictions: _, predicted = torch.max(outputs.data, 1).

with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        print(labels.size(0))
        correct += (predicted == labels).sum().item()

One question remains: how do we turn the trained network's outputs into predictions? The outputs returned by the network (the output of its final fully connected layer) were, in older PyTorch, of torch.autograd.Variable type, while the first input of torch.max() must be a tensor; that is why outputs.data is passed rather than outputs. The second argument, 1, is dim: take the maximum over each row. The index of each row's maximum is the predicted class, i.e. the class with the highest score.
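As a small illustration with a hypothetical score tensor, torch.max returns both the maximum values and their indices:

import torch

outputs = torch.tensor([[0.1, 2.5, 0.3],
                        [1.7, 0.2, 0.9]])   # hypothetical scores: 2 samples, 3 classes
values, predicted = torch.max(outputs, 1)   # max over each row (dim=1)
print(values)                               # tensor([2.5000, 1.7000]) - the maximum scores
print(predicted)                            # tensor([1, 0]) - the predicted class indices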
  Overall source code:

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms


# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyper-parameters
input_size = 784
hidden_size = 500
num_classes = 10
#input_size = 84
#hidden_size = 50
#num_classes = 2
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='../../data',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='../../data',
                                          train=False,
                                          transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

# Fully connected neural network with one hidden layer
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

model = NeuralNet(input_size, hidden_size, num_classes).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Move tensors to the configured device
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

# Test the model
# In test phase, we don't need to compute gradients (for memory efficiency)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        #print(predicted)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')



