Deep learning paper notes: Depth Map Prediction from a Single Image using a Multi-Scale Deep Network


Date read: 2015.04.26
Paper source: NIPS 2014
Authors: David Eigen, Christian Puhrsch, Rob Fergus
Affiliation: New York University

Main content: The paper uses a CNN to estimate depth from a single image. Relatively few people work on single-image depth estimation; depth is usually obtained with a stereo (binocular) camera. Here the authors treat the CNN as a black box and use it to directly learn the mapping from an image to its depth map.

Innovative points:
    1. As the title says, the network is multi-scale, unlike a conventional CNN (in practice there are two scales, one coarse and one fine). This contribution is a little weak.
    2. The second contribution is a new form of loss function, which adds something similar to a regularization term.

Nothing else stands out. I read this paper mainly because I am currently working on regression from images and wanted to see whether it offered anything new on the loss-function side.

Network structure:


Here, let's analyze its network structure:

  1. There are two networks. The first is a coarse-grained network. Its input is a 304 x 228 image, and its output is about 1/16 the size of the input counted in pixels (roughly 1/4 along each side). The output size can be chosen freely, because it is determined by the dimension of the final fully connected layer. This network is trained against ground-truth depth maps.
  2. The second is a fine-grained network. It also takes the original image as input, and the output of the first network is concatenated into it at the second convolutional layer as an extra feature map. The second network has no fully connected layers, so it is fully convolutional.
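
The size relationship and the concatenation step described above can be sketched with plain NumPy arrays. This is only an illustration of the shapes involved, not the authors' implementation: the 55 x 74 coarse output size and the 63 fine-network channels are assumptions chosen for the example, and the arrays are filled with zeros rather than real activations.

```python
import numpy as np

# Input size quoted in the notes (height x width).
in_h, in_w = 228, 304

# Assumed coarse-network output size, roughly 1/4 of the input along each side.
out_h, out_w = 55, 74

# Fraction of input pixels that the coarse output covers (about 1/16).
pixel_ratio = (out_h * out_w) / (in_h * in_w)

# Hypothetical feature maps from the fine network's first layer.
# The channel count (63) is illustrative only.
fine_features = np.zeros((63, out_h, out_w), dtype=np.float32)

# Coarse depth prediction, treated as one extra feature map.
coarse_depth = np.zeros((1, out_h, out_w), dtype=np.float32)

# Concatenate along the channel axis; this is what the fine network's
# second convolutional layer would receive as input.
fine_layer2_input = np.concatenate([fine_features, coarse_depth], axis=0)

print(f"pixel ratio: {pixel_ratio:.4f}")
print("second-layer input shape:", fine_layer2_input.shape)
```

With these assumed sizes the concatenated tensor has 64 channels at the coarse output resolution, which is why the two networks must agree on spatial size at the point where they are joined.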
Loss function:
  1. The loss function, which is also used as an evaluation metric for the results, is the second contribution of the paper. For a predicted depth map y and ground truth y^* with n pixels, it is:

       D(y, y^*) = \frac{1}{2n} \sum_{i=1}^{n} \left( \log y_i - \log y_i^* + \alpha(y, y^*) \right)^2

     where

       \alpha(y, y^*) = \frac{1}{n} \sum_{i=1}^{n} \left( \log y_i^* - \log y_i \right)

  2. The second term, α, is an average error term; the part before it is the per-pixel error in log depth. Because α is added to every per-pixel term, the loss accounts for the average error and the per-pixel errors together, which works like a penalty term.
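
A minimal NumPy sketch of this kind of loss follows, assuming the scale-invariant form on log depths: a per-pixel log difference plus α, the mean log difference with the opposite sign, squared and averaged with a factor of 1/2. The function name `scale_invariant_error` and the random test data are made up for the example; this is not the authors' code.

```python
import numpy as np

def scale_invariant_error(pred, target):
    """Mean over pixels of 0.5 * (log(pred) - log(target) + alpha)^2,
    where alpha = mean(log(target) - log(pred)).

    Both inputs must be arrays of strictly positive depths, same shape.
    """
    log_diff = np.log(pred) - np.log(target)   # per-pixel error term
    alpha = -np.mean(log_diff)                 # average error term
    return 0.5 * np.mean((log_diff + alpha) ** 2)

# Made-up data: a random "ground truth" and a noisy prediction of it.
rng = np.random.default_rng(0)
target = rng.uniform(1.0, 10.0, size=(55, 74))
pred = target * rng.uniform(0.8, 1.2, size=target.shape)

err = scale_invariant_error(pred, target)

# Multiplying the whole prediction by a constant shifts every log difference
# by the same amount, which alpha cancels, so the error should not change.
err_rescaled = scale_invariant_error(3.0 * pred, target)

print(err, err_rescaled)
```

The rescaling check shows what the α term buys: a prediction that is wrong only by a global scale factor is not penalized, so the loss measures how well the relative depths between pixels are recovered.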
Experimental results:

