ORB-SLAM is a monocular SLAM system, so its accuracy is determined by how accurately the pose (position and orientation) between frames is optimized. Optimization therefore plays a very important role in ORB-SLAM. This section explores the optimizations used in ORB-SLAM.
ORB-SLAM uses g2o for graph optimization. For an introduction to g2o, see http://www.cnblogs.com/gaoxiang12/p/5304272.html.
First, why optimize
Because neither camera calibration nor tracking is accurate enough. Camera calibration error shows up in the reconstruction (for example, when points are triangulated), while tracking error shows up in the relative poses between keyframes and, in the monocular case, in the reconstruction as well. As errors accumulate, the poses of later frames drift further and further from the true poses, which ultimately limits the overall accuracy of the system.
1.1 Camera Calibration
The monocular SLAM literature generally assumes that the camera calibration is accurate and does not account for the error it introduces (presumably because work is usually evaluated on standard datasets, where the calibration error is assumed to be similar across methods). For a product, however, calibration errors differ between sensor types and may vary greatly. Therefore, to evaluate the accuracy of the whole system, this error must be taken into account.
1.2 Tracking
Whether in monocular, stereo, or RGB-D SLAM, the tracked pose contains error. In monocular SLAM, if there are enough correspondences between two frames, the relative pose can either be computed directly (as in initialization) or obtained by solving an optimization problem (such as solvePnP). Because scale is unobservable in the monocular case, scale error is introduced as well. Since tracking always estimates a pose relative to earlier frames, the error of one frame is passed on to the following frames, so the pose error at the end of tracking can become very large. To improve tracking accuracy, one can: 1. optimize the poses locally and globally; 2. use loop closure detection to optimize the poses.
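The way relative tracking passes error forward can be illustrated with a toy example. The sketch below is not ORB-SLAM code: it assumes a hypothetical 2D camera that truly moves one unit along the x axis per frame, while each estimated relative motion carries a small constant heading bias (0.5 degrees, an arbitrary value). The function name `dead_reckon` is made up for this illustration.

```python
import numpy as np

def dead_reckon(n_steps, heading_bias_rad, step=1.0):
    """Chain biased relative motions and return the position error per frame."""
    theta = 0.0                 # estimated heading, accumulates the bias
    est = np.zeros(2)           # estimated position
    errors = []
    for k in range(1, n_steps + 1):
        theta += heading_bias_rad                      # small error in each relative pose
        est = est + step * np.array([np.cos(theta), np.sin(theta)])
        true = np.array([k * step, 0.0])               # ground truth: straight line along x
        errors.append(np.linalg.norm(est - true))
    return np.array(errors)

# Hypothetical bias of 0.5 degrees per frame over 100 frames.
errors = dead_reckon(100, np.deg2rad(0.5))
print(errors[0], errors[9], errors[99])
```

Even though every individual step is only slightly wrong, the position error keeps growing with the number of frames, which is why local and global optimization and loop closure are needed.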
Second, how to optimize
2.1 Objective function of the optimization
In the SLAM problem, the common constraints are: 1. the mapping from a three-dimensional point to a two-dimensional feature (through the projection matrix); 2. the transformation between one pose and another (through a three-dimensional rigid-body transformation); 3. the matching between two-dimensional features (through the fundamental matrix F); 4. other relationships (such as the similarity transformation in the monocular case). If some of these relationships are known to be accurate, we can define them and their corresponding residuals in g2o, and iteratively reduce the residuals, thereby optimizing the poses and positions.
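As an illustration of the first constraint, the residual between a measured feature and the projection of its three-dimensional point might be written as follows. This is a minimal NumPy sketch rather than g2o code, assuming a pinhole camera without distortion. The intrinsics `K`, the pose `(R, t)`, and the point `X` are hypothetical values chosen only for the example.

```python
import numpy as np

def project(K, R, t, X):
    """Project a 3D world point X into pixel coordinates (u, v)."""
    Xc = R @ X + t            # world frame -> camera frame
    uvw = K @ Xc              # camera frame -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]   # perspective division

def reprojection_residual(K, R, t, X, uv_measured):
    """Residual of constraint 1: measured feature minus projected point."""
    return uv_measured - project(K, R, t, X)

# Hypothetical intrinsics, pose, and point, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
X = np.array([0.2, -0.1, 2.0])

uv_exact = project(K, R, t, X)
# A measurement that is off by (1.0, -0.5) pixels gives exactly that residual.
residual = reprojection_residual(K, R, t, X, uv_exact + np.array([1.0, -0.5]))
print(uv_exact, residual)
```

An optimizer such as g2o adjusts `R`, `t`, and `X` so that the sum of squared residuals over all observations becomes as small as possible.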
2.2 Local optimization
When a new keyframe is added to the covisibility graph, the authors run a local optimization around that keyframe, as shown in the figure. Pos3 is the newly added keyframe, and its initial pose estimate has already been obtained. Pos2 is a keyframe connected to Pos3, X2 is the set of three-dimensional points seen by Pos3, and X1 is the set of three-dimensional points seen by Pos2. These form the local information and are optimized in the bundle adjustment. Pos1 also sees X1 but has no direct connection to Pos3. It is associated with the local information of Pos3, so it takes part in the bundle adjustment, but its value is held fixed. Pos0 and X0 do not take part in the bundle adjustment.
Therefore, the part that takes part in the optimization is the region inside the red ellipse in the middle of the figure, where red marks values that are optimized and gray marks values that are held fixed. (u,v) is the two-dimensional projection of X under Pos, that is, the measurement of X under Pos. The goal of the optimization is to minimize the reprojection error.
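The selection of optimized and fixed variables described above can be sketched with plain Python sets. This is only an illustration of the rule, not the ORB-SLAM implementation. The function `local_ba_sets` is hypothetical, and the small graph is an assumed reconstruction of the figure (in particular, Pos2 is assumed to also see X2, which is what would connect it to Pos3).

```python
def local_ba_sets(new_kf, covisible, observations):
    """Split keyframes and points into optimized / fixed sets for local BA.

    covisible:    dict keyframe -> set of keyframes connected to it
    observations: dict keyframe -> set of 3D point ids it observes
    """
    # Poses that are optimized: the new keyframe and its connected neighbours.
    optimized_kfs = {new_kf} | covisible[new_kf]
    # Points that are optimized: everything seen by those keyframes.
    optimized_pts = set()
    for kf in optimized_kfs:
        optimized_pts |= observations[kf]
    # Fixed poses: keyframes that see an optimized point
    # but are not connected to the new keyframe.
    fixed_kfs = {kf for kf, pts in observations.items()
                 if kf not in optimized_kfs and pts & optimized_pts}
    return optimized_kfs, optimized_pts, fixed_kfs

# Assumed toy graph mirroring the figure.
covisible = {"Pos0": set(), "Pos1": {"Pos2"},
             "Pos2": {"Pos1", "Pos3"}, "Pos3": {"Pos2"}}
observations = {"Pos0": {"X0"}, "Pos1": {"X1"},
                "Pos2": {"X1", "X2"}, "Pos3": {"X2"}}

optimized_kfs, optimized_pts, fixed_kfs = local_ba_sets("Pos3", covisible, observations)
print(sorted(optimized_kfs), sorted(optimized_pts), sorted(fixed_kfs))
```

With this toy graph, Pos2 and Pos3 are optimized together with X1 and X2, Pos1 contributes its observations of X1 with a fixed pose, and Pos0 and X0 are left out entirely.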
2.3 Global optimization
In global optimization, all keyframes (except the first, which serves as the fixed reference) and all three-dimensional points are optimized.
2.4 Sim3 pose optimization at loop closure
When a loop closure is detected, the relative pose between the two keyframes joined by the loop needs to be optimized on Sim3 (to make their scales consistent). The similarity transformation between the two frames is optimized by minimizing the reprojection error of the corresponding two-dimensional feature points.
As shown in the figure, Pos6 and Pos2 form a candidate loop closure. $S_{6,2}$ is optimized by minimizing the reprojection errors between $(u_{4,2},v_{4,2})$ and $(u_{4,6},v_{4,6})$.
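One way to picture this error is to transfer the shared point between the two camera frames with the similarity transformation and compare each projection with the feature measured in that frame. The NumPy sketch below assumes that the error is evaluated in both directions and that $S_{6,2}$ maps points from the Pos2 camera frame into the Pos6 camera frame. Both are assumptions made for the illustration, and all numeric values are hypothetical.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def sim3_apply(s, R, t, X):
    """Apply the similarity transformation X -> s * R * X + t."""
    return s * (R @ X) + t

def sim3_inverse(s, R, t):
    """Inverse of (s, R, t): X -> (1/s) * R^T * (X - t)."""
    return 1.0 / s, R.T, -(R.T @ t) / s

def project(K, Xc):
    """Pinhole projection of a point given in camera coordinates."""
    uvw = K @ Xc
    return uvw[:2] / uvw[2]

def sim3_reprojection_errors(K, sim3_62, X_in_2, X_in_6, uv_2, uv_6):
    """Errors in frame 6 (forward transfer) and frame 2 (inverse transfer)."""
    s, R, t = sim3_62
    e_6 = uv_6 - project(K, sim3_apply(s, R, t, X_in_2))
    s_i, R_i, t_i = sim3_inverse(s, R, t)
    e_2 = uv_2 - project(K, sim3_apply(s_i, R_i, t_i, X_in_6))
    return e_6, e_2

# Hypothetical intrinsics, similarity transformation, and point.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
sim3_62 = (2.0, rot_z(0.1), np.array([0.0, 0.0, 0.5]))
X_in_2 = np.array([0.1, 0.2, 1.0])
X_in_6 = sim3_apply(*sim3_62, X_in_2)
uv_2, uv_6 = project(K, X_in_2), project(K, X_in_6)

e_6, e_2 = sim3_reprojection_errors(K, sim3_62, X_in_2, X_in_6, uv_2, uv_6)
# With a wrong scale (1.0 instead of 2.0) the forward error is no longer zero.
e_6_bad, _ = sim3_reprojection_errors(K, (1.0, sim3_62[1], sim3_62[2]),
                                      X_in_2, X_in_6, uv_2, uv_6)
print(e_6, e_2, e_6_bad)
```

Both errors vanish when the similarity transformation is consistent with the measurements, and a wrong scale produces a nonzero error that the optimizer can reduce.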
2.5 Pose optimization on Sim3
Scale drift is common in monocular SLAM, so optimization on Sim3 is necessary. Compared with SE3, Sim3 has one more degree of freedom (the scale). Because the goal of this optimization is to correct the scale factor, it does not include additional variables (such as three-dimensional points).
When a loop closure is detected, the authors optimize all poses on Sim3. The residual on Sim3 is defined as follows:
$e_{i,j} = \log_{Sim3}(S_{i,j}\, s_{jw}\, s_{iw}^{-1})$
The initial value of $s_{iw}$ is the transformation of Pos i relative to the world coordinate system, with a scale of 1. $S_{i,j}$ is the relative pose between Pos i and Pos j (before the Sim3 optimization) and acts as the measurement between $s_{iw}$ and $s_{jw}$. This amounts to treating the local relative poses as accurate, while the global poses have accumulated error and are not accurate.
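To see what this residual measures, the product $S_{i,j}\, s_{jw}\, s_{iw}^{-1}$ can be formed with 4x4 similarity matrices. The NumPy sketch below does not implement the exact Sim3 logarithm. As a simplification, it only reports the log of the scale, the rotation angle, and the translation of the product, all of which are zero when the product is the identity. The helper names and all numeric values are hypothetical.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def sim3_matrix(scale, R, t):
    """4x4 matrix [[scale * R, t], [0, 1]] of a similarity transformation."""
    S = np.eye(4)
    S[:3, :3] = scale * R
    S[:3, 3] = t
    return S

def residual_summary(S_ij, s_jw, s_iw):
    """Deviation of S_ij * s_jw * s_iw^-1 from the identity (not the full log)."""
    E = S_ij @ s_jw @ np.linalg.inv(s_iw)
    scale = np.cbrt(np.linalg.det(E[:3, :3]))          # det(scale * R) = scale^3
    R = E[:3, :3] / scale
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.log(scale), np.arccos(cos_angle), E[:3, 3]

# Hypothetical poses with scale 1, and the relative measurement they imply.
s_iw = sim3_matrix(1.0, rot_z(0.2), [1.0, 0.0, 0.0])
s_jw = sim3_matrix(1.0, rot_z(-0.1), [0.0, 2.0, 0.0])
S_ij = s_iw @ np.linalg.inv(s_jw)

log_s0, angle0, t0 = residual_summary(S_ij, s_jw, s_iw)

# Simulate scale drift: the global estimate of pose j is 10% too large in scale.
s_jw_drift = sim3_matrix(1.1, rot_z(-0.1), [0.0, 2.0, 0.0])
log_s1, angle1, t1 = residual_summary(S_ij, s_jw_drift, s_iw)
print(log_s0, angle0, t0)
print(log_s1, angle1, t1)
```

When the global poses agree with the relative measurement the product is the identity, and when pose j has drifted in scale the deviation appears directly in the scale term, which is what the optimization on Sim3 corrects.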
Third, summary
In my personal understanding, optimization in monocular SLAM demands considerable skill: one needs a clear optimization goal and must carefully weigh the choice of parameters, the degrees of freedom, speed, and stability.
ORB-SLAM (V): Optimization