Series index
(1) Data processing
(2) Building and customizing the network
(3) Testing your own pictures with a trained model
(4) Processing video data
(5) Modifying the PyTorch source code to add a ConvLSTM layer
(6) Understanding gradient backpropagation
(Summary) Fascinating bugs encountered in PyTorch

PyTorch learning and use (4)
I have recently been working with a video processing codebase that was implemented in TensorFlow and am now porting it to PyTorch. The main steps are: read the original video and extract K consecutive frames; process each frame (flipping, normalization); and group the data into mini-batches.
As described in the previous post, PyTorch (1): data processing, we need to:
1) Define how the data is read.
2) Rewrite the data processing methods in transforms to match the format of the data.
3) Do the mini-batching ourselves while reading the data, because torch.utils.data.DataLoader batches individual pictures and does not handle sequences of consecutive frames well (it might be possible to store the video frames in the image channels instead).
The main idea of the implementation is: first read the file names of all the videos, shuffle them and split them into mini-batch groups, then read each group of videos by name, and finally process the batch of videos that was read.

Reading the video
Define a file that reads and batches the videos, calls the custom transforms methods to process the frames, and returns the data blocks used by PyTorch.
Reading a video with imageio and OpenCV
The main code is as follows:
    import os

    import cv2
    import imageio
    import numpy as np


    def load_kth_data(f_name, data_path, image_size, L):
        """
        :param f_name: video name
        :param data_path: data path
        :param image_size: image size
        :param L: number of frames to extract from the video
        :return: sequence of K + T frames
        """
        tokens = f_name.split()
        vid_path = os.path.join(data_path, tokens[0] + "_uncomp.avi")
        vid = imageio.get_reader(vid_path, "ffmpeg")  # load the video
        low = int(tokens[1])  # first usable frame of the video
        # last start index that still leaves L frames
        high = np.min([int(tokens[2]), vid.get_length()]) - L + 1

        # the usable part of the video is exactly L frames long
        if low == high:
            stidx = 0
        else:
            # the video is shorter than L: print its path so the error
            # raised on the next line can be traced back to the file
            if low >= high:
                print(vid_path)
            # the video is longer than L: pick a random start in [low, high)
            stidx = np.random.randint(low=low, high=high)

        # extract L frames
        seq = np.zeros((image_size, image_size, L, 1), dtype="float32")
        for t in range(L):
            img = cv2.cvtColor(
                cv2.resize(vid.get_data(stidx + t), (image_size, image_size)),
                cv2.COLOR_RGB2GRAY)
            seq[:, :, t] = img[:, :, None]

        return seq
(I suspect even I will not be able to follow these comments later ^_^!)
The function takes the file name, the data path, the size of each frame, and the number of frames L, and returns an array of L frames.
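To make the start-index logic easier to see, here is a minimal sketch that picks a random window of L consecutive frames from an in-memory array instead of a video file. The helper name sample_clip and the synthetic frames are assumptions for illustration and are not part of the original code; unlike load_kth_data, this sketch raises an error explicitly when the video is too short.

```python
import numpy as np


def sample_clip(frames, L, rng=None):
    """Return L consecutive frames from a random start (hypothetical helper).

    frames: array of shape (num_frames, H, W).
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames = frames.shape[0]
    if num_frames < L:
        raise ValueError("video has %d frames, need at least %d" % (num_frames, L))
    # Valid start indices are 0 .. num_frames - L (inclusive).
    start = int(rng.integers(0, num_frames - L + 1))
    clip = frames[start:start + L]
    # Reorder to (H, W, L, 1), the layout returned by load_kth_data.
    return np.transpose(clip, (1, 2, 0))[..., None].astype("float32")


# Synthetic "video": 30 frames of 8 x 8 pixels, frame i filled with the value i.
video = np.stack([np.full((8, 8), i, dtype="uint8") for i in range(30)])
clip = sample_clip(video, L=20, rng=np.random.default_rng(0))
print(clip.shape)
```

Because every frame is filled with its own index, reading clip[0, 0, :, 0] shows which frames were selected and confirms that they are consecutive.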
Batching the videos using shuffle and the video file names
The indices are shuffled according to the number of videos, and the videos are then read from the files that correspond to each index. The main code is as follows:
    import numpy as np


    def get_minibatches_idx(n, minibatch_size, shuffle=False):
        """
        :param n: len of data
        :param minibatch_size: minibatch size of data
        :param shuffle: shuffle the data
        :return: list of (batch number, index array) pairs
        """
        idx_list = np.arange(n, dtype="int32")

        # shuffle
        if shuffle:
            np.random.shuffle(idx_list)

        # segment
        minibatches = []
        minibatch_start = 0
        for i in range(n // minibatch_size):
            minibatches.append(
                idx_list[minibatch_start:minibatch_start + minibatch_size])
            minibatch_start += minibatch_size

        # process the last (smaller) batch
        if minibatch_start != n:
            minibatches.append(idx_list[minibatch_start:])

        # return a list so the result can be indexed, e.g. mini_batches[i][1]
        return list(zip(range(len(minibatches)), minibatches))
The function takes the number of videos, the mini-batch size, and whether to shuffle, and returns the number of each batch together with the indices of the videos it contains.
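The grouping step can be illustrated on its own with a short sketch. The file names below are invented placeholders (real entries would come from the dataset's own file list), and the slicing is just one compact way to express the same idea, not the original code.

```python
import numpy as np

# Hypothetical file list; real entries would come from the dataset's index file.
file_names = ["video_%02d" % i for i in range(10)]
batch_size = 4

rng = np.random.default_rng(0)
order = rng.permutation(len(file_names))  # shuffled indices 0..9

# Cut the shuffled indices into groups of batch_size; the last group may be smaller.
batches = [order[s:s + batch_size] for s in range(0, len(order), batch_size)]

# Look up the file names that belong to each mini-batch.
names = np.array(file_names)
for batch_no, idx in enumerate(batches):
    print(batch_no, list(names[idx]))
```

With 10 videos and a batch size of 4 this gives three groups of 4, 4 and 2 videos, and every video appears in exactly one group.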
Reading and processing the video data in the iterator
PyTorch obtains each batch to be processed from an iterator and feeds it to the network for training. The data processing is therefore done in the method that returns each item: the videos are read according to the mini-batch index, and the methods in transforms are then called to process the data (normalization, tensor conversion, and so on).
The main code is as follows:
    def __getitem__(self, index):
        # read the videos of one mini-batch in parallel
        # clip length: K past frames plus T future frames
        Ls = np.repeat(np.array([self.T + self.K]), self.batch_size, axis=0)
        paths = np.repeat(self.root, self.batch_size, axis=0)
        files = np.array(self.trainfiles)[self.mini_batches[index][1]]
        shapes = np.repeat(np.array([self.image_size]), self.batch_size, axis=0)

        with joblib.Parallel(n_jobs=self.batch_size) as parallel:
            output = parallel(
                joblib.delayed(load_kth_data)(f, p, img_size, L)
                for f, p, img_size, L in zip(files, paths, shapes, Ls))

        # save the batch data
        # (the loop assumes the mini-batch holds exactly batch_size videos)
        seq_batch = np.zeros((self.batch_size, self.image_size, self.image_size,
                              self.K + self.T, 1), dtype="float32")
        for i in range(self.batch_size):
            seq_batch[i] = output[i]

        # apply the transforms, as the other datasets do
        if self.transform is not None:
            seq_batch = self.transform(seq_batch)

        return seq_batch
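The pattern above can be reduced to a small sketch: a dataset whose __getitem__ returns a whole mini-batch instead of a single sample. Everything in it is hypothetical (the class name BatchedClips, and the in-memory clips array that stands in for videos on disk); it only illustrates the indexing. If such a dataset is wrapped in torch.utils.data.DataLoader, passing batch_size=None turns off the loader's own batching.

```python
import numpy as np


class BatchedClips(object):
    """Hypothetical dataset that yields one mini-batch per index."""

    def __init__(self, clips, batch_size, transform=None, shuffle=True, seed=0):
        self.clips = clips  # array of shape (num_videos, H, W, L, 1)
        self.transform = transform
        order = np.arange(len(clips))
        if shuffle:
            np.random.default_rng(seed).shuffle(order)
        self.mini_batches = [order[s:s + batch_size]
                             for s in range(0, len(order), batch_size)]

    def __len__(self):
        return len(self.mini_batches)

    def __getitem__(self, index):
        # Indexing past the last batch raises IndexError,
        # which is what ends a plain `for` loop over the dataset.
        idx = self.mini_batches[index]
        batch = self.clips[idx].astype("float32")
        if self.transform is not None:
            batch = self.transform(batch)
        return batch


# 10 dummy clips of 20 frames, 16 x 16 pixels each.
clips = np.zeros((10, 16, 16, 20, 1), dtype="uint8")
dataset = BatchedClips(clips, batch_size=4)
shapes = [batch.shape for batch in dataset]
print(shapes)
```

Unlike the original __getitem__, this sketch lets the last batch be smaller than batch_size instead of allocating a fixed-size array.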
In __getitem__, joblib.Parallel loads the videos of a mini-batch with parallel workers, which makes reading faster.

Implementing the data processing methods in transforms
Three methods are implemented:
ToTensor: tensor conversion
Normalize: normalization
RandomHorizontalFlip: horizontal flip
The implementation is relatively simple; the code is as follows:
    import random

    import torch


    class ToTensor(object):
        """
        Converts a numpy.ndarray (N x H x W x C x 1) in the range
        [0, 255] to a torch.FloatTensor of shape (N x H x W x C x 1).
        """
        def __call__(self, pic):
            # handle numpy array
            img = torch.from_numpy(pic)
            # backward compatibility
            return img


    class Normalize(object):
        """
        Normalizes each channel of the torch.*Tensor, i.e.
        channel = channel / 127.5 - 1
        """
        def __call__(self, tensor):
            # TODO: make efficient
            for t in tensor:
                t.div_(127.5).sub_(1)
            return tensor


    class RandomHorizontalFlip(object):
        """
        Randomly horizontally flips the given numpy.ndarray
        (N x H x W x C x 1) with a probability of 0.5
        """
        def __call__(self, img):
            for n in range(img.shape[0]):
                if random.random() < 0.5:
                    img[n] = img[n, :, ::-1]
            return img
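As a quick sanity check of the arithmetic, the sketch below applies the same normalization and flip with plain NumPy to a tiny batch. The array values and variable names are made up for illustration; it is a check of the formulas, not a replacement for the classes above.

```python
import numpy as np

# Tiny batch of shape (N, H, W, C, 1) = (1, 1, 3, 1, 1)
# holding the pixel values 0, 127.5 and 255 along the width axis.
batch = np.array([0.0, 127.5, 255.0], dtype="float32").reshape(1, 1, 3, 1, 1)

# Normalize: channel / 127.5 - 1 maps [0, 255] onto [-1, 1].
normalized = batch / 127.5 - 1

# Horizontal flip: reverse the W axis (axis 2 of the batch).
flipped = normalized[:, :, ::-1]

print(normalized.ravel())
print(flipped.ravel())
```

The three pixels map to -1, 0 and 1, and the flip reverses their order along the width axis while leaving the batch shape unchanged.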
Output results
Finally, visualizing the processed video frames gives the following results: