The RANSAC Algorithm in Detail


Original address: http://grunt1223.iteye.com/blog/961063

Another reference: http://www.cnblogs.com/xrwang/archive/2011/03/09/ransac-1.html
Given the coordinates of two points P1 and P2, the straight line through them is determined; for any point P3 in the input, we must judge whether it lies on that line. Secondary-school analytic geometry tells us that a point lies on a line exactly when the slopes it forms with any two points of the line are equal. In practice, the equation of the line is computed from the two known points (point-slope form, intercept form, and so on), after which it is convenient to judge whether P3 lies on the line by a vector calculation.
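
For instance, a minimal sketch of such a check (the Point struct, the name onLine, and the tolerance eps are illustrative assumptions, not taken from the referenced post):

C++ code
#include <cmath>

struct Point { double x, y; };

// P3 lies on the line through P1 and P2 exactly when the vectors P1->P2 and
// P1->P3 are parallel, i.e. their cross product is (near) zero.
bool onLine(const Point &p1, const Point &p2, const Point &p3, double eps = 1e-9)
{
    double cross = (p2.x - p1.x) * (p3.y - p1.y) - (p2.y - p1.y) * (p3.x - p1.x);
    return std::fabs(cross) < eps;
}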

Data from real practice, however, usually deviates by some amount. Suppose, for example, that we know two variables x and y satisfy the linear relationship y = ax + b, and we want to determine the specific values of the parameters a and b. Through experiments we can obtain a set of measured values of x and y. In theory, two unknowns require only two pairs of values, but because of systematic error, any two points yield different values of a and b. We would like the final model to have the smallest possible error against the measured values. The method of least squares, expounded in detail in university calculus courses, does exactly this: it finds the values of a and b that minimize the mean squared error, by setting the partial derivatives with respect to a and b to zero. In fact, in many contexts least squares is effectively synonymous with linear regression.
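
A minimal sketch of that closed-form least-squares solution (the function name fitLine is illustrative; the data would come from the experiment):

C++ code
#include <vector>
#include <cstddef>

// Fit y = a*x + b by minimizing the sum of squared residuals. Setting the
// partial derivatives with respect to a and b to zero yields:
//   a = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x*x) - sum(x)^2)
//   b = (sum(y) - a*sum(x)) / n
void fitLine(const std::vector<double> &x, const std::vector<double> &y,
             double &a, double &b)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    std::size_t n = x.size();
    for (std::size_t i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    b = (sy - a * sx) / n;
}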

Unfortunately, least squares is suitable only when the errors are small. Imagine extracting a model from a data set dominated by noise, for example one in which only 20% of the points actually conform to the model; there, least squares becomes inadequate. In the figure below, the straight line (the pattern) is easily seen by the naked eye, but the fit the algorithm produces is wrong.

[Figure: noisy point data in which the dominant straight line is obvious to the eye, but a direct least-squares fit goes wrong]

The input to the RANSAC algorithm is a set of observations (often containing heavy noise or invalid points), a parameterized model that explains the observations, and some confidence parameters. RANSAC achieves its goal by repeatedly selecting random subsets of the data. Each selected subset is assumed to consist of inliers, and that assumption is validated as follows:

    • A model is fitted to the hypothetical inliers; all of its unknown parameters can be computed from this subset.
    • The model obtained in step 1 is used to test all the other data; any point that fits the estimated model well is also counted as a hypothetical inlier.
    • If enough points have been classified as hypothetical inliers, the estimated model is considered reasonable.
    • The model is then re-estimated from all the hypothetical inliers (for example, by least squares), since so far it has been estimated only from the initial minimal subset.
    • Finally, the model is evaluated by estimating the error of the inliers with respect to it.
    • This process is repeated a fixed number of times; each candidate model is either discarded because its consensus set is too small, or kept because it is better than the best model found so far. (A minimal sketch of this loop follows below.)
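
Here is a minimal self-contained sketch of that loop for the 2-D line case (all names here are illustrative; the post's own implementation is quoted further below):

C++ code
#include <cmath>
#include <cstdlib>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// One RANSAC pass for a 2-D line, following the steps listed above.
bool ransacLine(const std::vector<Pt> &data, int iterations, double threshold,
                double &nx, double &ny, double &px, double &py)
{
    if (data.size() < 2) return false;
    int bestSupport = -1;
    for (int it = 0; it < iterations; it++) {
        // 1. pick a random minimal subset: two distinct points
        int i = std::rand() % (int)data.size();
        int j = std::rand() % (int)data.size();
        if (i == j) continue;
        // 2. exact model from the subset: unit normal plus a point on the line
        double cnx = data[j].y - data[i].y;
        double cny = data[i].x - data[j].x;
        double norm = std::sqrt(cnx*cnx + cny*cny);
        if (norm < 1e-12) continue;
        cnx /= norm; cny /= norm;
        // 3. count the points whose distance to the line is below the threshold
        int support = 0;
        for (std::size_t k = 0; k < data.size(); k++) {
            double d = cnx*(data[k].x - data[i].x) + cny*(data[k].y - data[i].y);
            if (std::fabs(d) < threshold) support++;
        }
        // 4. keep the model with the largest consensus set
        if (support > bestSupport) {
            bestSupport = support;
            nx = cnx; ny = cny; px = data[i].x; py = data[i].y;
        }
    }
    // A full implementation would now refit the best model to its inliers
    // with least squares, as described above.
    return bestSupport >= 2;
}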


The whole process can be illustrated by the following flowchart:

[Figure: flowchart of the RANSAC process]

As for source code for the algorithm, Ziv Yaniv once wrote a good C++ version; I have added comments at the key points:
C++ code
#include <math.h>
#include "LineParamEstimator.h"

LineParamEstimator::LineParamEstimator(double delta) : m_deltaSquared(delta*delta) {}

// Exact fit: compute the line parameters [n_x, n_y, a_x, a_y] (unit normal
// plus a point on the line) from exactly two points.
void LineParamEstimator::estimate(std::vector<Point2D *> &data,
                                  std::vector<double> &parameters)
{
    parameters.clear();
    if (data.size() < 2)
        return;
    double nx = data[1]->y - data[0]->y;
    double ny = data[0]->x - data[1]->x; // if the slope of the line is k, the slope of the normal is -1/k
    double norm = sqrt(nx*nx + ny*ny);

    parameters.push_back(nx/norm);
    parameters.push_back(ny/norm);
    parameters.push_back(data[0]->x);
    parameters.push_back(data[0]->y);
}

// Least-squares fit: the line passes through the mean of the points, and its
// normal is the eigenvector of the covariance matrix corresponding to the
// smallest eigenvalue.
void LineParamEstimator::leastSquaresEstimate(std::vector<Point2D *> &data,
                                              std::vector<double> &parameters)
{
    double meanX, meanY, nx, ny, norm;
    double covMat11, covMat12, covMat21, covMat22; // the entries of the symmetric covariance matrix
    int i, dataSize = data.size();

    parameters.clear();
    if (data.size() < 2)
        return;

    meanX = meanY = 0.0;
    covMat11 = covMat12 = covMat21 = covMat22 = 0;
    for (i = 0; i < dataSize; i++) {
        meanX += data[i]->x;
        meanY += data[i]->y;

        covMat11 += data[i]->x * data[i]->x;
        covMat12 += data[i]->x * data[i]->y;
        covMat22 += data[i]->y * data[i]->y;
    }

    meanX /= dataSize;
    meanY /= dataSize;

    covMat11 -= dataSize*meanX*meanX;
    covMat12 -= dataSize*meanX*meanY;
    covMat22 -= dataSize*meanY*meanY;
    covMat21 = covMat12;

    if (covMat11 < 1e-12) { // the points lie on a vertical line
        nx = 1.0;
        ny = 0.0;
    }
    else { // lamda1 is the largest eigenvalue of the covariance matrix
           // and is used to compute the eigenvector corresponding to the
           // smallest eigenvalue, which isn't computed explicitly.
        double lamda1 = (covMat11 + covMat22 +
                         sqrt((covMat11-covMat22)*(covMat11-covMat22) + 4*covMat12*covMat12)) / 2.0;
        nx = -covMat12;
        ny = lamda1 - covMat22;
        norm = sqrt(nx*nx + ny*ny);
        nx /= norm;
        ny /= norm;
    }
    parameters.push_back(nx);
    parameters.push_back(ny);
    parameters.push_back(meanX);
    parameters.push_back(meanY);
}

// A point agrees with the model when its squared signed distance to the
// line is below the threshold.
bool LineParamEstimator::agree(std::vector<double> &parameters, Point2D &data)
{
    double signedDistance = parameters[0]*(data.x - parameters[2]) + parameters[1]*(data.y - parameters[3]);
    return ((signedDistance*signedDistance) < m_deltaSquared);
}
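
A hypothetical call site for this estimator might look as follows (assuming Point2D is default-constructible with public x and y members, which the code above suggests but does not show):

C++ code
#include <vector>
#include "LineParamEstimator.h"

int main()
{
    Point2D a, b;                       // assumed: public members x, y
    a.x = 0.0; a.y = 1.0;
    b.x = 2.0; b.y = 5.0;
    std::vector<Point2D *> points;
    points.push_back(&a);
    points.push_back(&b);

    std::vector<double> lineParams;     // receives [n_x, n_y, a_x, a_y]
    LineParamEstimator estimator(0.5);  // delta: inlier distance threshold
    estimator.estimate(points, lineParams);
    // The fitted line is the set of points P with n . (P - A) = 0, where
    // n = (lineParams[0], lineParams[1]) and A = (lineParams[2], lineParams[3]).
    return 0;
}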


The RANSAC routine that searches for the model is as follows:
C++ code
template<class T, class S>
double RANSAC<T,S>::compute(std::vector<S> &parameters,
                            ParameterEstimator<T,S> *paramEstimator,
                            std::vector<T> &data,
                            int numForEstimate)
{
    std::vector<T *> leastSquaresEstimateData;
    int numDataObjects = data.size();
    int numVotesForBest = -1;
    int *arr = new int[numForEstimate]; // numForEstimate is the minimum number of points required to fit the model, which is 2 for the line in this example
    short *curVotes = new short[numDataObjects];  // one if data[i] agrees with the current model, otherwise zero
    short *bestVotes = new short[numDataObjects]; // one if data[i] agrees with the best model, otherwise zero

    // there are fewer data objects than the minimum required for an exact fit
    if (numDataObjects < numForEstimate)
        return 0;

    // Enumerate all possible minimal subsets and keep the one with the most
    // support. For a line fitted through 100 points this means roughly
    // 100*99*0.5 = 4950 candidate models, so in practice a fixed number of
    // randomly selected subsets is generally used instead.
    computeAllChoices(paramEstimator, data, numForEstimate,
                      bestVotes, curVotes, numVotesForBest, 0, data.size(), numForEstimate, 0, arr);

    // compute the least squares estimate using the largest subset
    for (int j = 0; j < numDataObjects; j++) {
        if (bestVotes[j])
            leastSquaresEstimateData.push_back(&(data[j]));
    }
    // refit the model to the inlier points with the least squares method
    paramEstimator->leastSquaresEstimate(leastSquaresEstimateData, parameters);

    delete [] arr;
    delete [] bestVotes;
    delete [] curVotes;

    return (double)leastSquaresEstimateData.size() / (double)numDataObjects;
}
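
A hypothetical call site, assuming compute can be invoked as a static member and that LineParamEstimator derives from ParameterEstimator<Point2D, double> (neither is shown in the excerpt; loadNoisyPoints is a placeholder):

C++ code
std::vector<Point2D> data = loadNoisyPoints(); // placeholder for the observations
std::vector<double> lineParams;                // receives the fitted line parameters
LineParamEstimator estimator(0.5);             // inlier distance threshold delta
// 2 is the minimum number of points that determine a line; the return value
// is the fraction of the data classified as inliers.
double inlierRate = RANSAC<Point2D, double>::compute(lineParams, &estimator, data, 2);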


Given an appropriate model and enough allowed iterations, RANSAC finds a good solution with high probability. In my experiment, on a data set containing 80% outliers, RANSAC performed far better than direct least squares.
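
The number of iterations needed can be estimated from the expected inlier ratio. The following standard formula is not in the original post, but is commonly used when configuring RANSAC:

C++ code
#include <cmath>

// Number of iterations k such that, with probability p, at least one random
// sample of n points consists purely of inliers, where w is the inlier
// fraction of the data:  k = log(1 - p) / log(1 - w^n)
int requiredIterations(double p, double w, int n)
{
    return (int)std::ceil(std::log(1.0 - p) / std::log(1.0 - std::pow(w, n)));
}
// Example: w = 0.2 (80% outliers), n = 2, p = 0.99  ->  about 113 iterations.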

In what scenarios can RANSAC be used? The most famous is image stitching. Constrained by the limits of the lens, photographing a vast scene often takes more than one shot. When multiple images are composited, key feature points are first extracted from the images to be stitched. Computer vision research shows that, across different viewing angles, an object's appearance is related by a perspective transformation matrix (3x3, or 2x2 in simplified cases). RANSAC is used to fit the parameters of this model (the entries of the matrix), thereby identifying the same objects in different photographs. Refer to the figure below:

[Figure: image stitching, with feature-point matches between overlapping photographs]

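As a simple illustration of the model being fitted (a sketch of the transformation only, not of the stitching pipeline):

C++ code
// Apply a 3x3 homography H (row-major) to an image point. The eight free
// parameters of H are what RANSAC estimates from matched feature pairs.
void applyHomography(const double H[9], double x, double y,
                     double &xOut, double &yOut)
{
    double u = H[0]*x + H[1]*y + H[2];
    double v = H[3]*x + H[4]*y + H[5];
    double w = H[6]*x + H[7]*y + H[8];
    xOut = u / w; // back from homogeneous to image coordinates
    yOut = v / w;
}
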
In addition, RANSAC can be used for error correction and object recognition in image search. In the figure below, the straight lines are matches produced by the SIFT matching algorithm; RANSAC effectively identifies the correct ones and marks the correctly recognized model (a book) with a wireframe:

[Figure: SIFT feature matches filtered by RANSAC, with the recognized book outlined by a wireframe]

