The difference between Conv1D and Conv2D in Keras

Source: Internet
Author: User
Tags iterable keras


If you find any errors, corrections are welcome.



My answer is that when the Conv2D input has a single channel, the two are no different, or more precisely, they can be converted into each other. First, both layers end up calling the same back-end code (for TensorFlow, it can be found in tensorflow_backend.py):


x = tf.nn.convolution(
        input=x,
        filter=kernel,
        dilation_rate=(dilation_rate,),
        strides=(strides,),
        padding=padding,
        data_format=tf_data_format)


The only difference lies in the arguments passed as input and filter. The input needs no explanation, so the question is what filter=kernel actually is.



Let's look at the source code of Conv1D and Conv2D. Both are defined in layers/convolutional.py, and both inherit from the base class _Conv(Layer). Inside the _Conv class we find the following code:


self.kernel_size = conv_utils.normalize_tuple(kernel_size, rank, 'kernel_size')
...
# middle code omitted
input_dim = input_shape[channel_axis]
kernel_shape = self.kernel_size + (input_dim, self.filters)


We assume that the input of Conv1D has size (600,300), that the input of Conv2D has a single channel, for example (600,300,1), and that kernel_size is 3 for both.



Now look at the conv_utils.normalize_tuple function:


def normalize_tuple(value, n, name):
    """Transforms a single int or iterable of ints into an int tuple.

    # Arguments
        value: The value to validate and convert. Could an int, or any iterable
          of ints.
        n: The size of the tuple to be returned.
        name: The name of the argument being validated, e.g. "strides" or
          "kernel_size". This is only used to format error messages.

    # Returns
        A tuple of n integers.

    # Raises
        ValueError: If something else than an int/long or iterable thereof was
        passed.
    """
    if isinstance(value, int):
        return (value,) * n
    else:
        try:
            value_tuple = tuple(value)
        except TypeError:
            raise ValueError('The `' + name + '` argument must be a tuple of ' +
                             str(n) + ' integers. Received: ' + str(value))
        if len(value_tuple) != n:
            raise ValueError('The `' + name + '` argument must be a tuple of ' +
                             str(n) + ' integers. Received: ' + str(value))
        for single_value in value_tuple:
            try:
                int(single_value)
            except ValueError:
                raise ValueError('The `' + name + '` argument must be a tuple of ' +
                                 str(n) + ' integers. Received: ' + str(value) + ' '
                                 'including element ' + str(single_value) + ' of type' +
                                 ' ' + str(type(single_value)))
    return value_tuple


So the kernel_size produced by the code above is the actual spatial size of the kernel, and it is computed from rank. Conv1D has rank 1 and Conv2D has rank 2, so for Conv1D kernel_size becomes (3,), and for Conv2D it becomes (3,3).
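As a quick check of the rank behaviour just described, here is a minimal sketch. The function normalize_tuple_sketch is a made-up, simplified stand-in for conv_utils.normalize_tuple (it skips the per-element validation), used only so the example runs without Keras installed.

```python
def normalize_tuple_sketch(value, n, name):
    """Simplified stand-in for conv_utils.normalize_tuple (illustration only)."""
    if isinstance(value, int):
        # A single int is repeated once per spatial dimension (the rank).
        return (value,) * n
    value_tuple = tuple(value)
    if len(value_tuple) != n:
        raise ValueError('The `' + name + '` argument must be a tuple of ' +
                         str(n) + ' integers. Received: ' + str(value))
    return value_tuple


conv1d_size = normalize_tuple_sketch(3, 1, 'kernel_size')         # rank 1
conv2d_size = normalize_tuple_sketch(3, 2, 'kernel_size')         # rank 2
custom_size = normalize_tuple_sketch((3, 300), 2, 'kernel_size')  # tuple is kept as given

print(conv1d_size)  # (3,)
print(conv2d_size)  # (3, 3)
print(custom_size)  # (3, 300)
```

The third call shows that a tuple of the right length passes through unchanged, which matters when Conv2D is given a non-square kernel.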





input_dim = input_shape[channel_axis]
kernel_shape = self.kernel_size + (input_dim, self.filters)


Here input_dim is the last dimension of the input (300 for Conv1D, 1 for Conv2D), and we assume that both layers use 64 filters. Thus, the shape of the Conv1D kernel is actually:



(3,300,64)



And the shape of the Conv2D kernel is actually:



(3,3,1,64)
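The two shapes above can be reproduced with a few lines of plain Python. This is only a sketch of the line kernel_shape = self.kernel_size + (input_dim, self.filters); the helper name kernel_shape_sketch is an assumption made for this illustration and is not part of Keras.

```python
def kernel_shape_sketch(kernel_size, rank, input_dim, filters):
    """Mirror of: kernel_shape = self.kernel_size + (input_dim, self.filters)."""
    if isinstance(kernel_size, int):
        # Same expansion that normalize_tuple performs for a single int.
        kernel_size = (kernel_size,) * rank
    return tuple(kernel_size) + (input_dim, filters)


# Conv1D on a (600, 300) input: the channel axis has 300 entries.
conv1d_kernel = kernel_shape_sketch(3, rank=1, input_dim=300, filters=64)
# Conv2D on a single-channel input: the channel axis has 1 entry.
conv2d_kernel = kernel_shape_sketch(3, rank=2, input_dim=1, filters=64)

print(conv1d_kernel)  # (3, 300, 64)
print(conv2d_kernel)  # (3, 3, 1, 64)
```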



If instead we set the kernel_size of Conv2D to a tuple of our own, (3,300), then conv_utils.normalize_tuple returns that tuple unchanged, so kernel_size is (3,300) and the actual kernel shape of Conv2D is:



(3,300,1,64), which is exactly the Conv1D kernel (3,300,64) after a reshape, so the two are equivalent.



In other words, Conv1D(kernel_size=3) is actually Conv2D(kernel_size=(3,300)). Of course, the input must also be reshaped to (600,300,1); the Conv2D convolution then slides over several rows at a time.
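The equivalence can be checked numerically. The sketch below is not how Keras computes anything: it is a naive pure-Python version of an unpadded ("valid"), stride-1 convolution without kernel flipping, written with nested lists so it has no dependencies. The sizes 6, 4 and 2 are small stand-ins for 600, 300 and 64, and the function names are made up for this illustration.

```python
import random

random.seed(0)
STEPS, DIM, FILTERS, K = 6, 4, 2, 3  # small stand-ins for 600, 300, 64, 3

# Input of shape (STEPS, DIM) and a Conv1D kernel of shape (K, DIM, FILTERS).
x = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(STEPS)]
w1 = [[[random.uniform(-1, 1) for _ in range(FILTERS)]
       for _ in range(DIM)] for _ in range(K)]


def conv1d(x, w):
    """Unpadded 1D convolution; x is (steps, channels), w is (k, channels, filters)."""
    k, dim, filters = len(w), len(w[0]), len(w[0][0])
    return [[sum(x[t + i][c] * w[i][c][f]
                 for i in range(k) for c in range(dim))
             for f in range(filters)]
            for t in range(len(x) - k + 1)]


def conv2d(x, w):
    """Unpadded 2D convolution; x is (rows, cols, channels), w is (kh, kw, channels, filters)."""
    kh, kw, ch, filters = len(w), len(w[0]), len(w[0][0]), len(w[0][0][0])
    rows, cols = len(x), len(x[0])
    return [[[sum(x[r + i][c + j][m] * w[i][j][m][f]
                  for i in range(kh) for j in range(kw) for m in range(ch))
              for f in range(filters)]
             for c in range(cols - kw + 1)]
            for r in range(rows - kh + 1)]


# Reshape the input to (STEPS, DIM, 1) and the kernel to (K, DIM, 1, FILTERS).
x2 = [[[v] for v in row] for row in x]
w2 = [[[w1[i][c]] for c in range(DIM)] for i in range(K)]

y1 = conv1d(x, w1)                # shape (STEPS - K + 1, FILTERS)
y2 = conv2d(x2, w2)               # shape (STEPS - K + 1, 1, FILTERS)
y2_flat = [row[0] for row in y2]  # drop the width axis of size 1

max_diff = max(abs(a - b)
               for r1, r2 in zip(y1, y2_flat) for a, b in zip(r1, r2))
print(len(y1), len(y1[0]))  # 4 2
print(max_diff < 1e-12)     # True
```

Because the kernel is as wide as the input, the 2D kernel has only one horizontal position, so it can only slide along the sequence axis, just like the 1D kernel.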



This also explains why Conv1D in Keras works for natural language processing. Suppose a sequence has 600 words and each word is represented by a 300-dimensional word vector; the sequence fed into the network then has shape (600,300). Applying Conv1D convolves directly along the sequence, and the kernel actually covers a (3,300) window. Because each row is one word vector, using Conv1D(kernel_size=3) amounts to using a neural network to extract n-gram features with n = 3. This is why convolutional neural networks can process text quickly and effectively.
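To make the n-gram reading concrete, the sketch below lists the word windows that a kernel of size 3 would see, assuming no padding and a stride of 1. The helper ngram_windows and the sample sentence are made up for this illustration; a real Conv1D layer operates on the word vectors, not on the word strings.

```python
def ngram_windows(tokens, n=3):
    """Return each run of n consecutive tokens, one per kernel position."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "convolution over word vectors is fast".split()
windows = ngram_windows(sentence, n=3)
for window in windows:
    print(window)

# A 600-word sequence with kernel_size=3 and no padding gives 600 - 3 + 1 positions.
positions = len(ngram_windows(list(range(600)), n=3))
print(positions)  # 598
```

Each window corresponds to one output row of the layer, and each of the 64 filters produces one feature value per window.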







