3D Mapping with an RGB-D Camera (RGB-D SLAM v2): paper notes

The paper describes RGB-D SLAM v2, developed by Felix Endres and colleagues in 2012. It is one of the first SLAM systems designed for Kinect-style sensors.

The open-source code is available on GitHub. For build configuration and usage, see http://www.cnblogs.com/voyagee/p/6898278.html

System Flow:


The system is divided into a front end and a back end. The front end is visual odometry: features are extracted from each frame's RGB image, descriptors are computed, and RANSAC + ICP is used to estimate the motion between two frames.

An EMM (environment measurement model) is proposed to decide whether an estimated motion is acceptable. The back end performs loop closure detection and pose graph optimization based on the g2o optimization library.

The optimized trajectory is then used to build the map. The map is an OctoMap, that is, an octree-based map.

Feature Extraction

The source code lets you choose among SIFT, SURF, and ORB features. SIFT runs on the GPU, while ORB and SURF run on the CPU using the OpenCV implementations.

Comparison of different features:

(Here ATE is the absolute trajectory error, i.e. the Euclidean distance between the trajectory estimated by the system and the real trajectory (ground truth); RMSE is the root mean square error.)

As the comparison shows, SIFT has the best overall performance when a GPU is available. Weighing real-time performance, hardware cost, and accuracy together, ORB is the better choice.
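As a rough illustration of the ATE RMSE metric used in the comparison, the following sketch computes it for two short trajectories. It assumes the estimated and ground-truth positions are already matched by timestamp and expressed in the same coordinate frame; the function name `ate_rmse` and the sample data are made up for this example.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of the Euclidean distances between matching positions.

    Assumption: both trajectories are already time-associated and aligned
    to the same frame; no alignment step is performed here.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Made-up sample data: only the middle pose is off, by a distance of 5.
est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 0.0, 0.0)]
print(ate_rmse(est, gt))
```

Because the errors are squared before averaging, a single badly estimated pose dominates the result, which is why RMSE penalizes occasional large drifts more than a plain mean would.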

Motion estimation

Three feature correspondences are used to quickly compute an initial estimate for RANSAC. In each iteration, the estimate is refined by least-squares minimization of the Mahalanobis distance between the correspondences.

For the difference between the Mahalanobis distance and the Euclidean distance, see http://www.cnblogs.com/likai198981/p/3167928.html. Put simply, the Mahalanobis distance accounts for anisotropy by weighting the difference with the inverse of a covariance matrix.
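To make the difference concrete, here is a minimal sketch of the Mahalanobis distance for 2D points. The function name and the covariance values are assumptions chosen for illustration; the system itself works with 3D points and sensor-derived covariances.

```python
import math


def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between 2D points a and b for a 2x2 covariance."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    (sxx, sxy), (syx, syy) = cov
    det = sxx * syy - sxy * syx
    if det <= 0:
        # Cheap sanity check only, not a full positive-definiteness test.
        raise ValueError("covariance must be positive definite")
    # d^T * cov^-1 * d, using the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - (sxy + syx) * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


# Illustrative covariance: the x direction is four times as uncertain as y.
cov = [[4.0, 0.0], [0.0, 1.0]]
print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), cov))  # offset along the noisy axis
print(mahalanobis_2d((0.0, 2.0), (0.0, 0.0), cov))  # same offset along the precise axis
```

Both offsets have the same Euclidean length, but the one along the more uncertain axis yields the smaller Mahalanobis distance. With an identity covariance the two distances coincide.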

EMM: Environment Measurement Model

A traditional way to judge whether a motion estimate is acceptable is to look at the inlier ratio: if it is below a set threshold, the motion estimate is rejected.

However, motion blur or a lack of texture in the environment easily leads to cases with few inliers.

In addition, some points that are visible in one frame may be occluded by other points in another frame. The authors propose the EMM to decide more robustly whether to reject an estimate.

The EMM starts from one assumption: after applying the estimated transformation, spatially corresponding depth measurements should come from the same surface location. In the paper's words:

After applying the transformation estimate, spatially corresponding depth measurements stem from the same underlying surface location.

The authors show that the difference between the corresponding observations y_i and y_j follows a Gaussian distribution whose covariance is the covariance matrix of the measurement noise (how to compute it is given in the paper "Accuracy and Resolution of Kinect Depth Data").

This makes it possible to use a p-value test to decide whether to reject an estimate. However, the p-value test turned out to be oversensitive: it reacts too strongly to small errors.

So a different approach is used.

The points observed by camera A are projected into camera B, and the observed points are classified as inliers, outliers, or occluded:

y_i and y_j should be the same point, so this pair counts as an inlier. y_k is outside the field of view, so it is ignored and does not count as an "observed point".

After projection, y_q is occluded by y_k in camera B's view and is not visible, so it counts as occluded.

y_p falls on the line connecting y_q and the optical center of camera A, so it counts as an outlier. (Note the difference between y_p and y_k: y_p lies inside camera A's field of view, but along that ray camera A actually observes the depth of y_q.)

So in this example there are two inliers, one outlier, and one occluded point.


In the code (in misc.cpp, after line 913, there are two functions, one of which implements the p-value test), the authors count the inliers, outliers, and occluded points and decide whether to reject the estimate.

observed points = inliers + outliers + occluded

Let I be the number of inliers, O the number of outliers, and C the number of occluded points. If I/(I+O+C) < 25%, the estimate is rejected immediately.

Otherwise, a threshold is set on I/(I+O), and the estimate is rejected if the ratio is below that threshold.
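The two-stage decision described above can be sketched as follows. This is not the code from misc.cpp; the function name `emm_accept` and the parameter `min_inlier_fraction` are hypothetical, and the second threshold is left to the caller because these notes do not give its value.

```python
def emm_accept(inliers, outliers, occluded, min_inlier_fraction):
    """Return True if the motion estimate passes both EMM checks.

    min_inlier_fraction is the threshold on I / (I + O); its value is an
    assumption supplied by the caller, not a figure from the paper.
    """
    observed = inliers + outliers + occluded
    if observed == 0:
        return False  # nothing was observed, so there is no evidence to accept
    # Stage 1: reject directly if fewer than 25% of observed points are inliers.
    if inliers / observed < 0.25:
        return False
    # Stage 2: compare inliers against outliers only; occluded points are excluded.
    return inliers / (inliers + outliers) >= min_inlier_fraction


# The example from the text: two inliers, one outlier, one occluded point.
# The value 0.5 is an arbitrary threshold chosen for this illustration.
print(emm_accept(2, 1, 1, min_inlier_fraction=0.5))  # True
```

Excluding occluded points from the second ratio means that an estimate is not penalized merely because parts of the scene are hidden from one of the two views.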


Loop Closure Detection

Most current loop closure detection is based on bag-of-words. Since the rise of deep learning, learning-based methods and semantic information are also used.

This paper instead uses a randomized loop closure search built on a minimum spanning tree.

A minimum spanning tree of limited depth is generated using the distance between descriptors (direct subtraction), and the n most recent frames are removed from it to avoid duplication.

Then k frames are drawn at random from the tree, biased toward earlier frames, and searched for loop closures.
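The random selection step can be sketched roughly as below. Everything here is an assumption for illustration: the candidates are given as a flat list ordered from oldest to newest rather than as a tree, the bias is a simple linear weighting, and the function name is made up. The actual weighting used by the system is not stated in these notes.

```python
import random


def sample_loop_candidates(frame_ids, n_recent, k, rng=None):
    """Draw up to k distinct frames, favoring earlier ones.

    frame_ids must be ordered from oldest to newest. The n_recent newest
    frames are dropped first, mirroring the removal of the most recent frames.
    """
    rng = rng or random.Random()
    pool = list(frame_ids[:-n_recent]) if n_recent > 0 else list(frame_ids)
    # Hypothetical bias: weights fall off linearly, the oldest frame is heaviest.
    weights = [len(pool) - i for i in range(len(pool))]
    chosen = []
    while pool and len(chosen) < k:
        idx = rng.choices(range(len(pool)), weights=weights)[0]
        chosen.append(pool.pop(idx))  # sample without replacement
        weights.pop(idx)
    return chosen


picked = sample_loop_candidates(list(range(10)), n_recent=3, k=4, rng=random.Random(0))
print(picked)
```

Sampling instead of comparing against every earlier frame keeps the per-frame cost bounded as the trajectory grows, at the price of possibly missing some loop closures.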

Graph optimization

Only the poses are optimized, not the three-dimensional points.

The g2o graph optimization library is used. For how to use g2o, see http://www.cnblogs.com/gaoxiang12/p/3776107.html

For each motion estimate that is not rejected by the EMM, the camera poses of the two frames become vertices of the graph, and the motion is added as a constraint, i.e. an edge.

Detected loop closures are added to the graph as vertices and edges in the same way.

The error function of a g2o edge:

Here x_i and x_j are the optimization variables (the pose estimates at the vertices), and z_ij is the constraint, i.e. the transformation between x_i and x_j. e() measures how well x_i and x_j satisfy the constraint.

The Ω in the middle is the information matrix (the inverse of the covariance matrix) of the constraint (the edge), which represents how precise the estimate on that edge is.
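The formula itself did not survive in these notes. Judging from the description of e(), Ω, x_i, x_j, and z_ij, it is presumably the standard quadratic cost of pose graph optimization, which can be written as follows. The set symbol C for the edges of the graph is notation assumed here.

```latex
F(X) = \sum_{\langle i,j \rangle \in \mathcal{C}}
       e(x_i, x_j, z_{ij})^{\top} \, \Omega_{ij} \, e(x_i, x_j, z_{ij}),
\qquad
X^{*} = \operatorname*{arg\,min}_{X} F(X)
```

Each edge contributes its error weighted by the information matrix, so a precise constraint (large Ω_ij) pulls the solution harder than an uncertain one.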


At this point it is still unclear exactly how this error is computed.

Looking at se3edge.cpp in g2o, e is in fact computed from the following product:

from->estimate().inverse() * to->estimate()

To explain this error function:

We take the pose T_j of the second frame (to) to be obtained from the pose T_i of the first frame (from) through T_ij, that is,

T_i * T_ij = T_j

That is,

T_ij = T_i^{-1} * T_j

In other words, the inverse of one pose multiplied by the other pose expresses the relative change between the two poses. The gap between the measurement and T_ij can be expressed in the same way:

delta = measurement^{-1} * T_i^{-1} * T_j

This gives the formula found in the g2o code. I am not sure what the toVectorMQT call that follows it does; presumably it converts the matrix into a vector.

Note that T_i is actually T_wi, where w denotes the world coordinate frame.

If this is unclear, refer to Section 11.1.2 of Gao Xiang's "14 Lectures on Visual SLAM".
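The delta above can be checked numerically with a small sketch. To keep it short it uses planar poses (SE(2), 3x3 homogeneous matrices) instead of the 3D poses g2o works with, and all helper names are made up. The final `to_vector` step is only a stand-in for turning the matrix into an error vector; it is not g2o's toVectorMQT.

```python
import math


def pose(x, y, theta):
    """3x3 homogeneous matrix of a planar pose (simplification of SE(3))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def inverse(t):
    """Inverse of a rigid transform [R p; 0 1], which is [R^T  -R^T p; 0 1]."""
    r00, r01, r10, r11 = t[0][0], t[0][1], t[1][0], t[1][1]
    px, py = t[0][2], t[1][2]
    return [[r00, r10, -(r00 * px + r10 * py)],
            [r01, r11, -(r01 * px + r11 * py)],
            [0.0, 0.0, 1.0]]


def edge_delta(measurement, t_i, t_j):
    """delta = measurement^-1 * T_i^-1 * T_j; identity means a perfect fit."""
    return matmul(inverse(measurement), matmul(inverse(t_i), t_j))


def to_vector(t):
    """Stand-in conversion to an error vector (x, y, angle)."""
    return (t[0][2], t[1][2], math.atan2(t[1][0], t[0][0]))


# The measured motion says "1 m forward", but the vertices are 1.1 m apart.
delta = edge_delta(pose(1.0, 0.0, 0.0), pose(1.0, 0.0, 0.0), pose(2.1, 0.0, 0.0))
print(to_vector(delta))
```

When the two vertex poses agree exactly with the measurement, delta is the identity and the error vector is zero; any disagreement shows up as a residual translation or rotation.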

This kind of loop closure detection will inevitably produce some false loop closures, and adding such constraints to the graph optimization will distort the map.

Therefore, after the graph optimization has converged for the first time, the authors remove the edges that correspond to false loop closures.

How is a loop closure judged to be wrong? A threshold (yet another threshold) is set on the error function described above, and an edge whose error is greater than the threshold is considered wrong.
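The pruning step can be sketched like this, assuming each edge carries its residual error vector after the first optimization together with its information matrix. The dictionary layout, the function names, and the threshold value are hypothetical, and the sketch filters all edges alike rather than distinguishing loop closure edges.

```python
def quadratic_error(e, omega):
    """Weighted squared error e^T * Omega * e of one edge."""
    n = len(e)
    return sum(e[r] * omega[r][c] * e[c] for r in range(n) for c in range(n))


def prune_edges(edges, threshold):
    """Keep only the edges whose weighted error does not exceed the threshold."""
    return [
        edge for edge in edges
        if quadratic_error(edge["error"], edge["information"]) <= threshold
    ]


identity = [[1.0, 0.0], [0.0, 1.0]]
# Made-up residuals: edge "a" fits well, edge "b" disagrees with the graph.
edges = [
    {"id": "a", "error": [0.1, 0.0], "information": identity},
    {"id": "b", "error": [2.0, 1.0], "information": identity},
]
kept = prune_edges(edges, threshold=1.0)  # the threshold is an arbitrary example
print([edge["id"] for edge in kept])  # ['a']
```

After pruning, the optimization would be run again on the remaining edges so that the false constraints no longer distort the trajectory.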

OctoMap:

http://www.cnblogs.com/gaoxiang12/p/5041142.html

A cube is divided evenly into eight smaller cubes, each of which can be divided again, and so on recursively.

Each small cube corresponds to a node of the octree, so the map becomes a tree with eight children per node.

Each small cube stores the logit (log-odds) of the probability that it is occupied. The more often it is observed as occupied, the higher the probability; if a node has never been observed as occupied, it does not need to be expanded.

OctoMap advantages: well suited to navigation; easy to update; relatively space-efficient compared with other storage methods.
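The per-node log-odds value described above can be illustrated with a short sketch. The hit and miss probabilities and the clamping limits are assumed values chosen for the example, and the function names are made up; this is not the OctoMap API.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def probability(log_odds):
    """Inverse of logit: convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def update(log_odds, hit, p_hit=0.7, p_miss=0.4, lo_min=-2.0, lo_max=3.5):
    """Add the evidence of one observation, then clamp the result.

    p_hit, p_miss, lo_min and lo_max are assumed example values.
    """
    log_odds += logit(p_hit) if hit else logit(p_miss)
    return max(lo_min, min(lo_max, log_odds))


value = 0.0  # log-odds 0 corresponds to probability 0.5 (unknown)
for observed_occupied in [True, True, False, True]:
    value = update(value, observed_occupied)
print(probability(value))
```

Working in log-odds turns each update into a single addition, and clamping keeps a node from becoming so certain that later contradicting observations can no longer change it.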


Discussion, additions, and corrections are welcome.
