Paper address: http://pan.baidu.com/s/1kTAcP8r (password: djm4)
This is a note on a localization and mapping method for a handheld monocular camera. Unlike traditional SLAM methods, pose tracking and mapping are separated into two independent processes.
Dual-thread mechanism: one thread performs robust tracking of the handheld camera's motion, while the other produces a 3D map of point features from previously observed video frames. This allows batch techniques whose computation is too expensive to run in real time at frame rate.
The goal of the paper: without any template or known initial target, track a calibrated handheld camera and map the environment.
I. Overview of the method
1. Tracking is separated from mapping
Traditional monocular SLAM tightly couples mapping and localization, updating the map with every single frame. This works on a robot, which has odometry to assist tracking and can deliberately move slowly.
But a handheld camera has problems: first, neither of the above two conditions holds; second, data-association errors occur; and third, in an incremental system such errors corrupt the map irreversibly.
To avoid data-association errors, active search (covariance-driven gating) and binary inlier/outlier rejection with Joint Compatibility Branch and Bound (JCBB) or Random Sample Consensus (RANSAC) are used to ensure that data is matched correctly.
With tracking and mapping separated, tracking no longer depends on the mapping process, so many robust tracking methods can be used; tracking and mapping also no longer share data-association decisions, and tracking is freed from the computational burden of per-frame map updates.
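A hedged architectural sketch of this split (the names, queue handoff, and timings are my own illustration, not the paper's code): the tracker runs at frame rate against a shared map, while the mapper consumes keyframes asynchronously and is free to spend many frame-times on batch optimization.

```python
import queue
import threading
import time

keyframes = queue.Queue()                  # tracker -> mapper handoff
shared_map = {"points": 0}                 # stand-in for the point map
lock = threading.Lock()

def tracker():
    """Runs at frame rate; only occasionally hands a keyframe to the mapper."""
    for frame_id in range(100):
        with lock:
            _ = shared_map["points"]       # track this frame against the map
        if frame_id % 20 == 0:             # keyframe decision (illustrative)
            keyframes.put(frame_id)
        time.sleep(0.01)                   # stand-in for camera input
    keyframes.put(None)                    # shut the mapper down

def mapper():
    """Runs asynchronously; may spend many frame-times on batch optimisation."""
    while keyframes.get() is not None:
        time.sleep(0.05)                   # stand-in for epipolar search + BA
        with lock:
            shared_map["points"] += 10     # newly triangulated points

t = threading.Thread(target=tracker)
m = threading.Thread(target=mapper)
t.start(); m.start(); t.join(); m.join()
print("map points:", shared_map["points"])
```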
2. Mapping is based on keyframes and uses batch techniques (bundle adjustment; in the SLAM literature this is generally called graph optimization).
After the separation, the map need not be updated on every frame; mapping concentrates on a small number of useful keyframes, and processing one keyframe only has to finish before the next keyframe arrives. This makes the map-update cycle long, leaving the update algorithm enough time to operate on a large map, so a method with high accuracy and robustness can be chosen.
---------------------------------------------------- Supplement ----------------------------------------------------
Bundle adjustment on Wikipedia
30 Frequently Asked Questions about SBA (Sparse Bundle Adjustment)
Q2: What is bundle adjustment? Given initial estimates of the 3D coordinates of a set of corresponding points observed across a series of images, and initial estimates of the viewing parameters of each image, bundle adjustment (BA) is the large optimization problem of simultaneously refining the 3D structure and the viewing parameters (i.e. camera pose, and possibly intrinsic calibration and radial distortion) to obtain a reconstruction that is optimal under certain assumptions about the noise in the observed image features: if the image error is zero-mean Gaussian, then BA is the maximum-likelihood estimator. Its name "bundle" derives from the bundles of light rays leaving each 3D feature and converging on each camera's optical center, which are adjusted optimally with respect to both structure and viewing parameters. SBA solves the sparse, large-scale optimization problem arising in BA with an implementation of the Levenberg-Marquardt nonlinear least-squares algorithm tailored to that sparsity.
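As a hedged, minimal sketch of the idea (not SBA itself): the snippet below jointly refines camera poses and 3D points by minimizing reprojection error with SciPy's dense least-squares solver, on synthetic data. Real SBA exploits the sparse block structure of the Jacobian, which this toy deliberately ignores.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(points, rvec, tvec, f=500.0, c=320.0):
    """Pinhole projection of Nx3 world points into one camera."""
    cam = Rotation.from_rotvec(rvec).apply(points) + tvec   # world -> camera
    return f * cam[:, :2] / cam[:, 2:3] + c                 # perspective divide

def residuals(params, n_cams, n_pts, observations):
    """Stack reprojection errors over all (camera, point, uv) observations."""
    poses = params[:n_cams * 6].reshape(n_cams, 6)          # [rvec | tvec]
    pts = params[n_cams * 6:].reshape(n_pts, 3)
    res = []
    for cam_i, pt_i, uv in observations:
        pred = project(pts[pt_i:pt_i + 1], poses[cam_i, :3], poses[cam_i, 3:])
        res.append(pred[0] - uv)
    return np.concatenate(res)

# Synthetic scene: 2 cameras observe 10 points; start from a noisy guess.
rng = np.random.default_rng(0)
true_pts = rng.uniform([-1, -1, 4], [1, 1, 6], size=(10, 3))
true_poses = np.array([[0, 0, 0, 0, 0, 0], [0, 0.1, 0, -0.5, 0, 0]], float)
obs = [(c, p, project(true_pts[p:p + 1], true_poses[c, :3], true_poses[c, 3:])[0])
       for c in range(2) for p in range(10)]
x0 = np.concatenate([true_poses.ravel(), true_pts.ravel()])
x0 += rng.normal(0, 0.05, x0.shape)                         # perturb the guess
sol = least_squares(residuals, x0, args=(2, 10, obs))       # joint refinement
print("final RMS reprojection error:", np.sqrt(np.mean(sol.fun ** 2)))
```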
----------------------------------------------------------------------------------------------------------------------
3. Map initialization: triangulating a stereo pair of two frames (five-point algorithm)
This borrows from offline structure-from-motion (SfM) and from the real-time visual odometry of references [9, 5]: the initial map is built from a five-point stereo pair, and new points are added and refined with local bundle adjustment over the N most recent camera poses. This paper uses the same stereo initialization and occasional local bundle updates; the difference is that it builds a long-term map in which features are frequently re-observed, so it can also afford the computationally heavier optimization of the whole map.
4. Adding new points: epipolar search
New points are initialized with an epipolar search between keyframes rather than from long 2D feature tracks.
5. A large number of points are mapped.
II. Map description
1. The map contains M point features expressed in the world coordinate frame. The j-th point feature stores its world coordinates and a unit patch normal n_j.
2. Each keyframe has a camera-centered coordinate frame; a rigid-body transformation (E_CW in the paper's notation) converts between the world frame and each camera frame.
Each keyframe stores a four-level image pyramid: level 0 is the 640×480 original image, down to level 3 at 80×60 (by successive subsampling).
Point features do not store their own pixels. Instead, each point feature records a source keyframe (the keyframe in which it was first observed), and its patch corresponds to an 8×8 pixel block at the source pyramid level.
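A minimal sketch of the keyframe pyramid described above (OpenCV's pyrDown Gaussian-blurs before halving; the paper's exact subsampling scheme may differ):

```python
import cv2
import numpy as np

frame = np.zeros((480, 640), np.uint8)        # stand-in greyscale keyframe
pyramid = [frame]
for _ in range(3):
    pyramid.append(cv2.pyrDown(pyramid[-1]))  # halve width and height
print([im.shape for im in pyramid])           # (480,640) down to (60,80)
```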
In the examples shown later, the map might contain some M = 2000 to 6000 points and N = 40 to 120 keyframes.
III. Tracking
1. A new frame is acquired, and a motion model provides a prior estimate of the camera pose.
The camera delivers 640x480 pixel YUV411 frames at 30Hz.
FAST corner points are extracted at each level of the pyramid.
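A hedged sketch of this detection step, using OpenCV's FAST detector on a stand-in image (the paper uses FAST-10; the threshold here is an illustrative guess). In the tracker this runs once per pyramid level of each new frame:

```python
import cv2
import numpy as np

frame = np.random.randint(0, 255, (480, 640), np.uint8)   # stand-in image
fast = cv2.FastFeatureDetector_create(threshold=10, nonmaxSuppression=True)
keypoints = fast.detect(frame, None)                       # corners at level 0
print(len(keypoints), "FAST corners")
```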
2. Points in the map are projected into the image according to the frame's prior pose estimate.
A pinhole camera projection model is used; the pose update vector μ contains the six degrees of freedom of rotation and translation (see the formulas after the step list below).
3. A small number of coarse-scale features are searched for in the image.
4. The camera pose is updated from these coarse matches.
5. A larger number of map points are re-projected and searched for in the image.
6. A final pose estimate is computed from all of the matches.
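As a hedged illustration of these steps (this is the plain pinhole form; the paper's calibrated camera model additionally includes a FOV radial-distortion term, and the update convention may differ):

\[
p_C = (x, y, z, 1)^{\top} = E_{CW}\, p_W, \qquad
\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} u_0 \\ v_0 \end{pmatrix}
+ \begin{pmatrix} f_u\, x/z \\ f_v\, y/z \end{pmatrix}
\]

and the pose update computed from a set of matches is a six-vector μ applied through the exponential map on SE(3):

\[
E_{CW}' = \exp(\mu)\, E_{CW}, \qquad \mu \in \mathbb{R}^{6}.
\]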
To find a map point p in the current frame, a fixed-range image search is performed around the point's predicted image position. Before searching, the feature patch must be warped to account for the viewpoint change, using an affine transformation A between the current frame and the first observation (source) keyframe:
\[
A = \begin{pmatrix}
\partial u_c / \partial u_s & \partial u_c / \partial v_s \\
\partial v_c / \partial u_s & \partial v_c / \partial v_s
\end{pmatrix}
\]
where {u_s, v_s} correspond to horizontal and vertical pixel displacements in the patch's source pyramid level, and {u_c, v_c} correspond to pixel displacements in the current camera frame's zeroth (full-size) pyramid level.
The determinant of A determines which pyramid level of the current frame will be searched: the level l is chosen so that det(A)/4^l is closest to unity.
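A small sketch of that level choice and patch warp (A and the patch are made-up inputs, and OpenCV's warp convention may be the inverse of the paper's):

```python
import cv2
import numpy as np

def best_pyramid_level(A, n_levels=4):
    """Pick the level l in [0, n_levels) with det(A)/4^l nearest unity."""
    d = abs(np.linalg.det(A))
    return min(range(n_levels), key=lambda l: abs(d / 4 ** l - 1.0))

A = np.array([[1.8, 0.1], [-0.05, 2.1]])       # hypothetical source->current warp
level = best_pyramid_level(A)                  # search this level of the frame
A_level = A / 2 ** level                       # warp expressed at that level
src_patch = np.random.randint(0, 255, (8, 8), np.uint8)  # stand-in 8x8 patch
M = np.hstack([A_level, np.zeros((2, 1))])     # 2x3 affine matrix for OpenCV
warped = cv2.warpAffine(src_patch, M, (8, 8))  # template for the image search
print("search at pyramid level", level)
```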
The patch search and pose update are performed twice per frame: a coarse stage followed by a fine stage.
The final frame pose is calculated from both coarse and fine sets of image measurements together.
7. Tracking quality and failure recovery
The tracking system evaluates tracking quality at every frame, using the fraction of feature searches that succeed as the score. If this fraction falls below a threshold, tracking is considered lost and a tracking-recovery (relocalization) procedure is initiated.
IV. Mapping
1. Map initialization: the five-point stereo algorithm together with RANSAC estimates an essential matrix and triangulates the base map.
This initial map has an arbitrary scale and is aligned with one camera at the origin.
Including user interaction, map initialization takes around three seconds.
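A hedged sketch of this two-view initialization using OpenCV (pts1/pts2 and the intrinsics K are stand-ins for the user-selected stereo pair; cv2.findEssentialMat runs a five-point solver inside RANSAC, and the inliers are then triangulated into the base map):

```python
import cv2
import numpy as np

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
pts1 = rng.uniform([0, 0], [640, 480], (100, 2)).astype(np.float32)
pts2 = pts1 + np.float32([5.0, 0.0])           # fake matches: a uniform shift

# Five-point solver inside RANSAC -> essential matrix + inlier mask.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                  prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at the origin
P2 = K @ np.hstack([R, t])                         # second camera, unit-norm t
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
base_map = (pts4d[:3] / pts4d[3]).T                # Euclidean 3D base map
print(base_map.shape, "points; scale is arbitrary")
```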
2. Keyframe insertion and epipolar search
Criteria for adding a new keyframe: tracking quality must be good, a minimum number of frames must have elapsed since the last keyframe was added, and the camera must be a minimum distance away from the nearest keyframe already in the map.
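A hedged sketch of that decision as a predicate (the 20-frame interval matches the paper's spirit; the distance threshold is purely illustrative):

```python
def should_add_keyframe(tracking_good, frames_since_last, dist_to_nearest_kf,
                        min_interval=20, min_distance=0.1):
    """All three criteria from the text must hold simultaneously."""
    return (tracking_good
            and frames_since_last > min_interval
            and dist_to_nearest_kf > min_distance)

print(should_add_keyframe(True, 25, 0.3))   # True: all criteria satisfied
```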
Depth for a new map point cannot be extracted from a single frame, so correspondences between the two views are established using epipolar search and then triangulated:
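A hedged sketch of an epipolar search (F, the images, and the zero-mean SSD scoring are my own stand-ins; the paper additionally restricts the searched segment to an expected depth range and only tests positions near detected corners):

```python
import numpy as np

def epipolar_search(img_a, img_b, pt_a, F, step=2):
    """Scan candidate positions along pt_a's epipolar line in img_b."""
    x, y = int(pt_a[0]), int(pt_a[1])
    tpl = img_a[y - 4:y + 4, x - 4:x + 4].astype(np.float32)
    tpl -= tpl.mean()                                  # zero-mean 8x8 template
    a, b, c = F @ np.array([pt_a[0], pt_a[1], 1.0])    # line: a*u + b*v + c = 0
    if abs(b) < 1e-9:                                  # (near-)vertical line
        return None
    best_score, best_pt = np.inf, None
    for u in range(4, img_b.shape[1] - 4, step):       # walk along the line
        v = int(round(-(a * u + c) / b))
        if 4 <= v < img_b.shape[0] - 4:
            cand = img_b[v - 4:v + 4, u - 4:u + 4].astype(np.float32)
            score = np.sum((cand - cand.mean() - tpl) ** 2)  # zero-mean SSD
            if score < best_score:
                best_score, best_pt = score, (u, v)
    return best_pt

# Toy call with random images and an arbitrary F, just to exercise the code:
rng = np.random.default_rng(1)
img_a = rng.integers(0, 255, (480, 640)).astype(np.uint8)
img_b = rng.integers(0, 255, (480, 640)).astype(np.uint8)
F = np.array([[0, -1e-4, 0.02], [1e-4, 0, -0.03], [-0.02, 0.03, 1.0]])
print(epipolar_search(img_a, img_b, (320, 240), F))
```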
3. Bundle adjustment
When the map has grown large, local bundle adjustment is used: only the newest keyframe and its closest neighbors are adjusted, with measurements from the other keyframes held fixed; global adjustment of the whole map runs when the system is otherwise idle.
4. Data-association refinement
Measurements made by the tracking system can be incorrect; this happens frequently in regions of the world containing repeated patterns. Such measurements are given low weights by the M-estimator used in bundle adjustment. If they lie in the zero-weight region of the Tukey estimator, they are flagged as outliers. Each outlier measurement is given a "second chance" before deletion: it is re-measured in the keyframe using the feature's predicted position and a far tighter search region than is used for tracking. If a new measurement is found, it is re-inserted into the map. Should such a measurement still be considered an outlier, it is permanently removed from the map.
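For concreteness, a sketch of the Tukey biweight referred to above (the cutoff c shown is the standard tuning constant, not necessarily the paper's):

```python
import numpy as np

def tukey_weight(residual, c=4.685):
    """Tukey biweight: zero weight where |residual| > c (the outlier region)."""
    r = np.abs(residual) / c
    return np.where(r < 1.0, (1.0 - r ** 2) ** 2, 0.0)

print(tukey_weight(np.array([0.0, 2.0, 10.0])))   # [1.0, ~0.67, 0.0]
```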
In this way, a questionable measurement must be validated twice before it is retained in the map.

V. Limitations and future work
1. The system depends on point features: fast camera motion blurs the image, and feature extraction becomes unreliable.
2. The optimization can fall into local minima, and incorrect insertions affect the global map.
3. The initial stereo algorithm can fail, inserting incorrect information into the map.
My understanding currently only reaches this level; I will update this post as my understanding improves. Corrections are welcome.
Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.
"Paper Learning record" Ptam:parallel Tracking and Mapping for Small AR Workspaces