Reprint: http://blog.csdn.net/gdut2015go/article/details/48250757
In computer vision, a planar homography is defined as a projective mapping from one plane to another. The mapping of points on a two-dimensional plane onto the camera imager is therefore an example of a planar homography. If the mapping of a point Q on the object plane to the point q on the imager is expressed in homogeneous coordinates, it can be represented as a matrix multiplication. With the definitions

    q = [x, y, 1]^T,   Q = [X, Y, Z, 1]^T,

the mapping can be written as

    q = s·H·Q.
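As an illustration of this homogeneous-coordinate mapping, here is a minimal Python sketch (the matrix H below is an arbitrary example, not a calibrated homography): multiplying H by the homogeneous point and dividing by the resulting scale recovers the pixel coordinates.

```python
# Apply a planar homography to a 2D point using homogeneous coordinates.
# The matrix H is a made-up example, not a real calibration result.
def apply_homography(H, x, y):
    """Map (x, y) on the source plane through H; return pixel coordinates."""
    # Homogeneous multiply: [u, v, w]^T = H * [x, y, 1]^T
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the scale w to return to inhomogeneous (pixel) coordinates.
    return u / w, v / w

H = [[1.0,   0.2,   5.0],
     [0.1,   1.1,   3.0],
     [0.001, 0.002, 1.0]]

px, py = apply_homography(H, 10.0, 20.0)
```

The division by w at the end is exactly why the mapping is only defined up to the scale factor s.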
Here we introduce the parameter s, an arbitrary scale factor (its purpose is to make explicit that the homography is only defined up to scale). By convention it is factored out of H.
H is composed of two parts: the physical transformation locating the object plane being observed, and the projection using the camera's intrinsic parameter matrix.
The physical transformation part is the combined effect of the rotation R and the translation t that relate the observed object plane to the image plane, written as

    W = [R t]

so that, with M denoting the camera intrinsic matrix,

    q = s·M·W·Q.
Since we are studying a mapping from one plane to another, we can place the object plane at Z = 0, which simplifies the expression. A point on the object plane is then represented by (X, Y), and the corresponding point on the camera plane is likewise a two-dimensional point. Because the Z coordinate has been removed, we can decompose the rotation matrix into its columns, R = [r1 r2 r3]; the column r3 is multiplied by Z = 0 and drops out, as the following derivation shows:

    q = s·M·[r1 r2 r3 t]·[X, Y, 0, 1]^T = s·M·[r1 r2 t]·[X, Y, 1]^T

so the homography matrix is H = s·M·[r1 r2 t].
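The derivation above can be sketched numerically. In the following Python snippet the intrinsic matrix M, the rotation angle, and the translation t are all made-up example values; the point is only to show how the third column of R is dropped when forming H = M·[r1 r2 t] (the overall scale s is omitted here):

```python
import math

def rotation_z(theta):
    """3x3 rotation about the camera z-axis (an example pose only)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Plain dense matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Made-up intrinsic matrix: fx = fy = 800, principal point (320, 240).
M = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0,   0.0,   1.0]]

R = rotation_z(0.1)          # example rotation
t = [0.5, -0.2, 2.0]         # example translation

# Because Z = 0 on the object plane, the column r3 drops out:
# keep only columns r1, r2 of R and append t.
W = [[R[i][0], R[i][1], t[i]] for i in range(3)]
H = matmul(M, W)             # 3x3 planar homography (up to scale)
```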
OpenCV uses the above formula to compute the homography matrix. It uses multiple images of the same planar object to compute both the rotation and translation of each view and the intrinsic parameters of the camera. The rotation and translation have 6 parameters per view, and the camera has 4 intrinsic parameters. So each new view introduces 6 new parameters to solve for, while the 4 camera parameters remain fixed. A planar object such as a chessboard provides 8 constraints per view: mapping a square onto a quadrilateral is described by the four (x, y) corner correspondences. For two views, then, we have 8 × 2 = 16 constraints against 2 × 6 + 4 = 16 unknowns, so at least two views are required to solve for all the parameters.
Why does mapping the four corners of a square onto a quadrilateral determine 8 equations? Suppose a vertex of the square on the object plane has coordinates (u, v), the corresponding point on the imager has coordinates (x, y), and the relationship between them is as follows:

    x = (h11·u + h12·v + h13) / (h31·u + h32·v + h33)
    y = (h21·u + h22·v + h23) / (h31·u + h32·v + h33)

Each point correspondence therefore contributes two equations, so the four vertices contribute eight.
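Those eight equations can be solved directly when exactly four correspondences are given. The following Python sketch (plain Gaussian elimination, no OpenCV; the point coordinates are made-up examples) fixes h33 = 1 and solves the resulting 8×8 linear system:

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def solve_homography(src, dst):
    """Estimate H from exactly four point correspondences.

    src, dst: lists of four (u, v) / (x, y) tuples. Fixing h33 = 1,
    each correspondence gives two linear equations in the remaining
    eight entries of H.
    """
    A, b = [], []
    for (u, v), (x, y) in zip(src, dst):
        # x*(h31*u + h32*v + 1) = h11*u + h12*v + h13, rearranged:
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v]); b.append(y)
    h = gauss_solve(A, b)
    return [[h[0], h[1], h[2]],
            [h[3], h[4], h[5]],
            [h[6], h[7], 1.0]]

# Map the unit square onto a made-up quadrilateral.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 10), (110, 20), (100, 120), (5, 100)]
H = solve_homography(src, dst)
```

Real implementations use more than four points with least squares (and robust methods, discussed below); this four-point case is just the minimal one that pins H down.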
Here the four vertex coordinates of a square on the object plane can be understood as the board's corner points, with the overall scale absorbed by s. The positions of the corner points on the image plane can be located by corner detection. As for the exact implementation, not having read the code and the underlying details yet, I can only offer a rough guess here, to be corrected after further study.
The homography matrix H relates the positions of the points on the source image plane to the positions of the corresponding points on the destination image plane (usually the imager plane):

    p_dst = H·p_src
OpenCV solves for the camera's intrinsic parameters by computing a homography matrix for each of multiple views.
OpenCV provides a convenient C function, cvFindHomography(); its interface is as follows:

void cvFindHomography(const CvMat* src_points, const CvMat* dst_points, CvMat* homography);
1. src_points, dst_points: N×2 or N×3 matrices. N×2 means the points are given in pixel coordinates; N×3 means they are given in homogeneous coordinates.
2. homography: a 3×3 matrix used to store the output.
The C++ function interface:

Mat findHomography(const Mat& srcPoints, const Mat& dstPoints,
                   Mat& status, int method=0,
                   double ransacReprojThreshold=3);
Mat findHomography(const Mat& srcPoints, const Mat& dstPoints,
                   vector<uchar>& status, int method=0,
                   double ransacReprojThreshold=3);
Mat findHomography(const Mat& srcPoints, const Mat& dstPoints,
                   int method=0, double ransacReprojThreshold=3);
1. srcPoints, dstPoints: of type CV_32FC2 or vector<Point2f>.
2. method: 0 means the conventional method using all the points; CV_RANSAC is the robust method based on RANSAC; CV_LMEDS is the least-median-of-squares robust method.
3. ransacReprojThreshold: used only with the RANSAC method; it is the maximum reprojection error allowed for a point pair to be counted as an inlier (rather than an outlier). That is, if

    ||dstPoints_i − H·srcPoints_i|| > ransacReprojThreshold

then point i is considered an outlier. If srcPoints and dstPoints are measured in pixels, it usually makes sense to set this parameter somewhere in the range 1 to 10.
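A sketch, in Python, of how such an inlier/outlier mask could be computed from the reprojection error (the homography and point pairs below are made-up examples; this is not OpenCV's internal code):

```python
import math

def reproj_error(H, src_pt, dst_pt):
    """Euclidean distance between H*src (dehomogenized) and dst."""
    u, v = src_pt
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    x = (H[0][0] * u + H[0][1] * v + H[0][2]) / w
    y = (H[1][0] * u + H[1][1] * v + H[1][2]) / w
    return math.hypot(x - dst_pt[0], y - dst_pt[1])

def classify_inliers(H, src, dst, ransac_reproj_threshold=3.0):
    """Return a status mask: 1 for inliers, 0 for outliers."""
    return [1 if reproj_error(H, s, d) <= ransac_reproj_threshold else 0
            for s, d in zip(src, dst)]

# Identity homography; the third pair is an obvious outlier.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
src = [(0, 0), (5, 5), (10, 10)]
dst = [(0.5, 0.0), (5.0, 5.5), (40.0, 40.0)]
mask = classify_inliers(H, src, dst)
```

The mask here plays the same role as the status output parameter described below.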
4. status: an optional output mask, set by the CV_RANSAC or CV_LMEDS methods. Note that the input mask is ignored.
This function finds and returns the perspective transformation matrix H between the source image plane and the destination image plane, i.e. the H that minimizes the back-projection error:

    Σ_i [ x'_i − (h11·x_i + h12·y_i + h13)/(h31·x_i + h32·y_i + h33) ]² + [ y'_i − (h21·x_i + h22·y_i + h23)/(h31·x_i + h32·y_i + h33) ]²
If the parameter method is set to the default value 0, the function uses a simple least-squares scheme to compute the initial homography estimate.
However, if not all of the point pairs (srcPoints_i, dstPoints_i) fit this strict perspective transformation (that is, there are some outliers), the initial estimate will be poor. In this case one of the two robust methods can be used. Both the RANSAC and LMeDS methods try many different random subsets of the corresponding point pairs, four pairs each, estimate the homography matrix from each subset using the simple least-squares algorithm, and then compute the quality/goodness of that homography (the number of inliers for RANSAC, the median reprojection error for LMeDS). The best subset is then used to produce the initial estimate of the homography matrix and the inlier/outlier mask.
Regardless of the method, robust or not, the computed homography matrix is refined further (using only the inliers in the robust case) with the Levenberg-Marquardt method, to reduce the reprojection error even more.
The RANSAC method can handle practically any ratio of outliers, but it needs a threshold to distinguish inliers from outliers. The LMeDS method does not need any threshold, but it works correctly only when more than 50% of the points are inliers. Finally, if you are sure that the computed feature points contain only small noise and no outliers, the default method may be the best choice. (Therefore, when computing camera parameters, we probably use only the default method.)
This function is used to find the initial intrinsic and extrinsic parameter matrices. The homography matrix is defined only up to a scale, so it is usually normalized so that h33 = 1.
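The normalization can be sketched as simply dividing every entry of H by h33 (the matrix below uses example values only):

```python
def normalize_h33(H):
    """Scale H so that h33 == 1 (H is only defined up to scale)."""
    s = H[2][2]
    return [[e / s for e in row] for row in H]

# Example matrix with h33 = 2; dividing by 2 fixes the scale.
H = [[2.0, 0.0, 4.0],
     [0.0, 2.0, 6.0],
     [0.0, 0.0, 2.0]]
Hn = normalize_h33(H)
```

Both H and Hn describe the same projective mapping; the division only fixes a convention for reporting the eight remaining free parameters.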