PyTorch (4): Processing video data


Series index
(1) Data processing
(2) Building and customizing the network
(3) Testing your own images with a trained model
(4) Processing video data
(5) Modifying the PyTorch source code to add a ConvLSTM layer
(6) Understanding gradient backpropagation
(Summary) Strange bugs encountered in PyTorch

PyTorch learning and use (4)

I recently worked with a video-processing program that was implemented in TensorFlow and converted it to PyTorch. The main steps are:
1) Read the original video and extract K consecutive frames.
2) Process each frame image (flipping, normalization).
3) Group the data into mini-batches.

As described in the previous post, PyTorch (1): Data processing, this requires:
1) Defining how the data is read.
2) Rewriting the data-processing methods in transforms to match the format of the data.
3) Doing the mini-batch grouping while reading the data, because torch.utils.data.DataLoader() batches single images and does not handle sequences of consecutive frames well (it might be possible to store the video frames in the image channels).

The main idea of the implementation is: first read the file names of all the videos, shuffle them, and split them into mini-batches; then read each group of videos by name; and finally process each batch of videos that was read.

Reading the video

Define a file that reads the videos and groups them into batches, calls the custom transforms methods to process the frames, and returns the data blocks used by PyTorch.

Reading the video with imageio and OpenCV
The main code is as follows:

import os

import cv2
import imageio
import numpy as np


def load_kth_data(f_name, data_path, image_size, L):
    """
    :param f_name: video entry in the form "<name> <start frame> <end frame>"
    :param data_path: data path
    :param image_size: image size
    :param L: number of frames to extract from the video
    :return: sequence of K + T frames, shape (image_size, image_size, L, 1)
    """
    tokens = f_name.split()
    vid_path = os.path.join(data_path, tokens[0] + "_uncomp.avi")
    vid = imageio.get_reader(vid_path, "ffmpeg")  # load the video
    low = int(tokens[1])  # start of the video
    # make sure the clip is at least L frames long
    high = np.min([int(tokens[2]), vid.get_length()]) - L + 1
    if low == high:
        # the clip is exactly L frames long
        stidx = 0
    else:
        if low >= high:
            # the clip is shorter than L frames: print its path
            # before the next line raises an error
            print(vid_path)
        # the clip is longer than L frames: pick a random start in [low, high)
        stidx = np.random.randint(low=low, high=high)
    # extract L consecutive frames
    seq = np.zeros((image_size, image_size, L, 1), dtype="float32")
    for t in range(L):
        img = cv2.cvtColor(cv2.resize(vid.get_data(stidx + t),
                                      (image_size, image_size)),
                           cv2.COLOR_RGB2GRAY)
        seq[:, :, t] = img[:, :, None]
    return seq

(I suspect even I will not understand these comments later ^_^!)
Given the file name, the data path, the size of each frame, and the number of frames L, the function returns an array of L frames.
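
The window selection inside load_kth_data can be tried without a video file. The following is a minimal sketch, assuming the frames have already been decoded into a NumPy array; `sample_clip` and its arguments are hypothetical names used for illustration and are not part of the original code.

```python
import numpy as np


def sample_clip(frames, L, rng):
    """Pick L consecutive grayscale frames and stack them as (H, W, L, 1).

    frames: array of shape (num_frames, H, W), assumed already decoded.
    """
    num_frames, height, width = frames.shape
    if num_frames < L:
        raise ValueError("video has fewer than L frames")
    # Valid start indices are 0 .. num_frames - L (inclusive).
    start = int(rng.integers(0, num_frames - L + 1))
    seq = np.zeros((height, width, L, 1), dtype="float32")
    for t in range(L):
        # Each frame becomes one slice along the third axis.
        seq[:, :, t] = frames[start + t][:, :, None]
    return start, seq


rng = np.random.default_rng(0)
# Frame i is filled with the value i so the chosen window is easy to verify.
frames = np.stack([np.full((4, 4), i, dtype="float32") for i in range(12)])
start, clip = sample_clip(frames, L=5, rng=rng)
print(clip.shape)  # (4, 4, 5, 1)
```

Reading any pixel position along the third axis of `clip` gives the indices of the frames that were selected, which makes it easy to check that they are consecutive.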

Batching the videos using shuffle and the video file names
The indices are shuffled according to the number of videos, and the videos are then read from the files that correspond to each index. The main code is as follows:

import numpy as np


def get_minibatches_idx(n, minibatch_size, shuffle=False):
    """
    :param n: len of data
    :param minibatch_size: minibatch size of data
    :param shuffle: shuffle the data
    :return: list of (batch number, index array) pairs
    """
    idx_list = np.arange(n, dtype="int32")

    # shuffle
    if shuffle:
        np.random.shuffle(idx_list)

    # segment
    minibatches = []
    minibatch_start = 0
    for i in range(n // minibatch_size):
        minibatches.append(idx_list[minibatch_start:
                                    minibatch_start + minibatch_size])
        minibatch_start += minibatch_size

    # processing the last batch
    if minibatch_start != n:
        minibatches.append(idx_list[minibatch_start:])

    # a list (not a bare zip iterator) so that batches can be indexed
    return list(zip(range(len(minibatches)), minibatches))

Given the number of videos, the mini-batch size, and whether to shuffle, the function returns the number of each batch together with the indices of the videos in it.
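
To make the shape of the result concrete, here is a small standard-library sketch of the same splitting idea. `split_indices` and its `seed` parameter are hypothetical and only illustrate the technique; they are not the function above.

```python
import random


def split_indices(n, batch_size, shuffle=False, seed=None):
    """Return a list of (batch number, index list) pairs covering 0 .. n-1."""
    indices = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Slicing past the end is safe, so the last batch is simply shorter.
    batches = [indices[i:i + batch_size] for i in range(0, n, batch_size)]
    return list(enumerate(batches))


batches = split_indices(10, 4)
for number, idx in batches:
    print(number, idx)
# 0 [0, 1, 2, 3]
# 1 [4, 5, 6, 7]
# 2 [8, 9]
```

With 10 videos and a mini-batch size of 4, the last batch holds only the 2 remaining indices, so code that consumes the batches should not assume every batch is full.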

Reading and processing the video data in the iterator

PyTorch obtains each batch to process through an iterator and feeds it to the network for training. The data processing is therefore done in the method that returns each item:
1) Read the video data according to the mini-batch indices.
2) Call the methods in transforms to process the data (normalization, tensor conversion, and so on).

The main code is as follows:

# Method of the dataset class; assumes `import joblib` and `import numpy as np`.
def __getitem__(self, index):
    # read the video data of one mini-batch in parallel
    Ls = np.repeat(np.array([self.T + self.K]), self.batch_size,
                   axis=0)  # video length of past and future
    paths = np.repeat(self.root, self.batch_size, axis=0)
    files = np.array(self.trainfiles)[self.mini_batches[index][1]]
    shapes = np.repeat(np.array([self.image_size]), self.batch_size, axis=0)

    with joblib.Parallel(n_jobs=self.batch_size) as parallel:
        output = parallel(joblib.delayed(load_kth_data)(f, p, img_size, L)
                          for f, p, img_size, L in zip(files,
                                                       paths,
                                                       shapes,
                                                       Ls))

    # save batch data
    seq_batch = np.zeros((self.batch_size, self.image_size, self.image_size,
                          self.K + self.T, 1), dtype="float32")
    # enumerate(output) also works for a shorter final batch;
    # the unused entries of seq_batch then stay zero
    for i, seq in enumerate(output):
        seq_batch[i] = seq

    # doing this so that it is consistent with all other datasets
    # to return a PIL Image
    if self.transform is not None:
        seq_batch = self.transform(seq_batch)
    return seq_batch
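
A dataset whose `__getitem__` already returns a whole mini-batch can be passed to `torch.utils.data.DataLoader` with `batch_size=None`, which switches off automatic batching. The sketch below only illustrates that wiring under stated assumptions: `BatchedClips` is a hypothetical class, and random arrays stand in for the video clips that would be read from disk.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class BatchedClips(Dataset):
    """Hypothetical dataset: each item is already a whole mini-batch."""

    def __init__(self, num_videos, batch_size, image_size, num_frames):
        self.image_size = image_size
        self.num_frames = num_frames
        order = np.random.permutation(num_videos)
        # One index array per mini-batch; the last one may be shorter.
        self.mini_batches = [order[i:i + batch_size]
                             for i in range(0, num_videos, batch_size)]

    def __len__(self):
        return len(self.mini_batches)

    def __getitem__(self, index):
        idx = self.mini_batches[index]
        # Stand-in for reading len(idx) clips from disk.
        batch = np.random.rand(len(idx), self.image_size, self.image_size,
                               self.num_frames, 1).astype("float32")
        return torch.from_numpy(batch)


dataset = BatchedClips(num_videos=10, batch_size=4, image_size=8, num_frames=5)
# batch_size=None turns off automatic batching, so items pass through as-is.
loader = DataLoader(dataset, batch_size=None, shuffle=True)
shapes = [tuple(batch.shape) for batch in loader]
print(sorted(shapes))
```

Here `shuffle=True` only changes the order in which the pre-built mini-batches are visited; the grouping of videos into batches is fixed when the dataset is constructed.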

In __getitem__, joblib.Parallel reads the videos of each batch in parallel, which is faster.

Implementing the data-processing methods in transforms

The main methods implemented are:
1) ToTensor: tensor conversion.
2) Normalize: normalization.
3) RandomHorizontalFlip: horizontal flip.

The implementation is relatively simple; the code is as follows:

import random

import numpy as np
import torch


class ToTensor(object):
    """
    Converts a numpy.ndarray (N x H x W x C x 1) in the range
    [0, 255] to a torch.FloatTensor of shape (N x H x W x C x 1).
    """

    def __call__(self, pic):
        # handle numpy array
        img = torch.from_numpy(pic)
        # backward compatibility
        return img


class Normalize(object):
    """
    Normalizes each channel of the torch.*Tensor, i.e.
    channel = channel / 127.5 - 1
    """

    def __call__(self, tensor):
        # TODO: make efficient
        for t in tensor:
            t.div_(127.5).sub_(1)
        return tensor


class RandomHorizontalFlip(object):
    """
    Randomly horizontally flips the given numpy.ndarray
    (N x H x W x C x 1) with a probability of 0.5
    """

    def __call__(self, img):
        for n in range(img.shape[0]):
            if random.random() < 0.5:
                img[n] = img[n, :, ::-1]
        return img

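
As a quick sanity check of the arithmetic, the normalization and the flip can also be applied to a whole batch at once. The sketch below uses NumPy only and is one assumed way to vectorize the per-sample loops; `normalize_batch` and `flip_batch` are hypothetical helpers, not part of the original transforms.

```python
import numpy as np


def normalize_batch(batch):
    """Map pixel values from [0, 255] to [-1, 1] for the whole batch."""
    return batch / 127.5 - 1.0


def flip_batch(batch, rng):
    """Flip each sample along the width axis with probability 0.5."""
    flip = rng.random(batch.shape[0]) < 0.5
    out = batch.copy()
    # Axis 2 of an (N, H, W, C, 1) batch is the width.
    out[flip] = batch[flip][:, :, ::-1]
    return out, flip


rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(6, 4, 4, 3, 1)).astype("float32")
flipped, flip = flip_batch(batch, rng)
normalized = normalize_batch(flipped)
print(normalized.min() >= -1.0, normalized.max() <= 1.0)
```

The flip is done on the NumPy array before any tensor conversion, matching the order implied by the classes above, where RandomHorizontalFlip takes an ndarray and Normalize takes a tensor.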
Output results

Finally, visualizing the processed video frames gives the following results:
