Author: Wang Xianrong
This article is translated from Wikipedia. The original English article is at http://en.wikipedia.org/wiki/ransac. If you read English comfortably, I suggest referring to the original article directly.
RANSAC is short for "RANdom SAmple Consensus". From a set of observed data containing "outliers", it can iteratively estimate the parameters of a mathematical model. It is a non-deterministic algorithm: it produces a reasonable result only with a certain probability, and that probability rises as the number of iterations increases. The algorithm was first proposed by Fischler and Bolles in 1981.
The basic assumptions of RANSAC are:
(1) the data consist of "inliers", i.e. data whose distribution can be explained by some set of model parameters;
(2) "outliers" are data that cannot be fitted by the model;
(3) the remaining data are noise.
Causes of outliers include extreme noise values, incorrect measurement methods, and incorrect assumptions about the data.
RANSAC also makes the following assumption: given a (usually small) set of inliers, there exists a procedure that can estimate model parameters that explain or fit those inliers.
Content
1 Example
2 Overview
3 Algorithm
4 Parameters
5 Advantages and Disadvantages
6 Applications
7 References
8 External links
I. Example
A simple example is fitting a 2-dimensional line to a set of observations. Assume the observations contain both inliers and outliers: the inliers lie approximately on a line, while the outliers lie far from it. Simple least squares cannot find a line that fits the inliers well, because it tries to fit all the points, outliers included. RANSAC, by contrast, can produce a model computed only from the inliers, provided the probability of drawing an all-inlier sample is high enough. RANSAC does not, however, guarantee a correct result; to give the algorithm a reasonable probability of success, its parameters must be chosen carefully.
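To make the least-squares failure concrete, here is a small sketch: six hypothetical points, five exactly on y = x plus one gross outlier. The dataset and all numbers are illustrative, not from the original article; ordinary least squares fits all six points and is dragged far from the true slope of 1.

```python
# Least squares fits ALL points, so a single gross outlier drags the line away.
# Points (0,0)..(4,4) lie exactly on y = x; (5, 100) is an outlier.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 1, 2, 3, 4, 100]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Closed-form simple linear regression: slope = cov(x, y) / var(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # 14.57 -18.1: far from slope 1
```

A robust estimator such as RANSAC, described below, would instead recover a line close to y = x by ignoring the outlier.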
Left: a dataset containing many outliers. Right: the line found by RANSAC (the outliers do not affect the result).
II. Overview
The input to the RANSAC algorithm is a set of observed data, a parameterized model that can explain or fit the observations, and some confidence parameters.
RANSAC achieves its goal by repeatedly selecting a random subset of the data. The selected subset is assumed to consist of inliers and is verified as follows:
1. A model is fitted to the hypothetical inliers, i.e. all unknown model parameters are computed from them.
2. All other data are then tested against the fitted model; any point that fits the estimated model well is also counted as a hypothetical inlier.
3. The estimated model is considered reasonable if sufficiently many points have been classified as hypothetical inliers.
4. The model is then re-estimated from all hypothetical inliers, since it was initially estimated only from the first small set of them.
5. Finally, the model is evaluated by its error with respect to the inliers.
This procedure is repeated a fixed number of times; each time, the resulting model is either rejected because too few points are classified as inliers, or kept because it is better than the current best model.
III. Algorithm
The algorithm in pseudocode:
Input:
    data -- a set of observations
    model -- a model that can be fitted to data
    n -- the minimum number of data points required to fit the model
    k -- the number of iterations of the algorithm
    t -- a threshold for deciding whether a point fits the model
    d -- the number of points that must fit a model for it to be accepted
Output:
    best_model -- the model parameters that best fit the data (or null if no good model is found)
    best_consensus_set -- the data points from which this model was estimated
    best_error -- the error of this model with respect to the data
iterations = 0
best_model = null
best_consensus_set = null
best_error = infinity
while iterations < k
    maybe_inliers = n randomly selected points from data
    maybe_model = model parameters fitted to maybe_inliers
    consensus_set = maybe_inliers
    for every point in data not in maybe_inliers
        if the point fits maybe_model with an error smaller than t
            add the point to consensus_set
    if the number of elements in consensus_set is greater than d
        (this implies that we may have found a good model;
        now test how good it is)
        better_model = model parameters fitted to all points in consensus_set
        this_error = a measure of how well better_model fits these points
        if this_error < best_error
            (we have found a model which is better than any previous one;
        keep it until a better one is found)
            best_model = better_model
            best_consensus_set = consensus_set
            best_error = this_error
    increment iterations
return best_model, best_consensus_set, best_error
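The pseudocode above can be turned into a short runnable sketch. Below is a minimal Python implementation for the 2-D line-fitting example; the function names, parameter values (k, t, d), and the sample dataset are all illustrative choices, not part of the original article. For a line, n = 2 points are sampled per iteration.

```python
import random

def fit_line(points):
    """Least-squares line y = a*x + b through the given points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((x - mx) * (y - my) for x, y in points) / \
        sum((x - mx) ** 2 for x, _ in points)
    return a, my - a * mx

def ransac_line(data, k=100, t=0.5, d=8, seed=0):
    """RANSAC for a 2-D line (n = 2 sample points per iteration).
    k: iterations, t: inlier distance threshold, d: minimum consensus size."""
    rng = random.Random(seed)
    best_model, best_set, best_error = None, None, float("inf")
    for _ in range(k):
        maybe_inliers = rng.sample(data, 2)
        (x1, y1), (x2, y2) = maybe_inliers
        if x1 == x2:            # skip degenerate (vertical) samples in this sketch
            continue
        a = (y2 - y1) / (x2 - x1)          # maybe_model: line through the sample
        b = y1 - a * x1
        consensus = [p for p in data if abs(p[1] - (a * p[0] + b)) < t]
        if len(consensus) > d:
            better = fit_line(consensus)   # re-fit on the whole consensus set
            err = sum((y - (better[0] * x + better[1])) ** 2
                      for x, y in consensus) / len(consensus)
            if err < best_error:
                best_model, best_set, best_error = better, consensus, err
    return best_model, best_set, best_error

# Ten points on y = 2x + 1 plus three gross outliers.
pts = [(x, 2 * x + 1) for x in range(10)] + [(1, 40), (3, -25), (7, 60)]
model, inliers, err = ransac_line(pts)
print(model)  # close to (2.0, 1.0): the outliers are ignored
```

Note how the structure mirrors the pseudocode: sample, hypothesize a model, grow the consensus set, re-estimate, and keep the best model found so far.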
Possible variants of the RANSAC algorithm include:
(1) Break out of the main loop as soon as a sufficiently good model is found (one with a sufficiently small error). This may save the time spent on the remaining iterations.
(2) Compute this_error directly from maybe_model, without re-estimating the model from consensus_set. This saves the time of fitting a second model per iteration, but may be more sensitive to noise.
IV. Parameters
The parameters t and d have to be determined experimentally for the specific problem and dataset. The parameter k (the number of iterations), however, can be derived from a theoretical result. Let p be the probability that, in some iteration, the algorithm selects only inliers when it picks the n points from which the model parameters are estimated; when this happens the resulting model is likely to be useful, so p also gives the probability that the algorithm produces a useful result. Let w be the probability of choosing an inlier each time a single point is selected, that is:
w = number of inliers / number of points in the dataset
Often w is not known in advance, but some rough value can be assumed. If n points are needed to estimate the model, then w^n is the probability that all n selected points are inliers, and 1 − w^n is the probability that at least one of them is an outlier, in which case a bad model is estimated from this sample. (1 − w^n)^k is then the probability that the algorithm never selects a sample of n points that are all inliers, which must equal 1 − p. Therefore,
1 − p = (1 − w^n)^k
Taking the logarithm of both sides gives
k = log(1 − p) / log(1 − w^n)
Note that this result assumes the n points are selected independently, i.e. a point selected once could be selected again in the same iteration. Implementations usually do not work this way, so the derived k should be taken as an upper limit for the case where points are selected without replacement. For example, when fitting a line to the dataset in the figure above, RANSAC typically selects two points per iteration and computes maybe_model as the line through them, and these two points must be distinct.
To gain additional confidence, the standard deviation of k, or a multiple of it, can be added to the derived value. The standard deviation of k is
SD(k) = sqrt(1 − w^n) / w^n
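Plugging illustrative numbers into the two formulas above: for line fitting, n = 2 points per sample; suppose half the data are inliers (w = 0.5) and we want a useful result with probability p = 0.99. The function names below are my own, not from the article.

```python
import math

def ransac_iterations(p, w, n):
    """k = log(1 - p) / log(1 - w**n): iterations needed so that, with
    probability p, at least one sample of n points is all inliers."""
    return math.log(1 - p) / math.log(1 - w ** n)

def ransac_iterations_sd(w, n):
    """Standard deviation of k: sqrt(1 - w**n) / w**n."""
    return math.sqrt(1 - w ** n) / w ** n

# Line fitting: n = 2 sample points, half the data are inliers, 99% confidence.
k = ransac_iterations(p=0.99, w=0.5, n=2)
print(math.ceil(k))                              # 17 iterations
print(round(ransac_iterations_sd(w=0.5, n=2), 2))  # 3.46
```

So about 17 iterations suffice in this scenario; adding one standard deviation (about 3.5) would suggest running roughly 20 to be safer.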
V. Advantages and Disadvantages
The advantage of RANSAC is its robust estimation of model parameters: it can estimate parameters with high accuracy even when the dataset contains a large number of outliers. A disadvantage of RANSAC is that there is no upper bound on the number of iterations needed to compute the parameters; if an upper bound is imposed, the result obtained may not be optimal and may even be wrong. RANSAC thus offers a trade-off: the more iterations it runs, the higher the probability of obtaining a reasonable model. Another disadvantage is that it requires problem-specific thresholds to be set.
RANSAC can estimate only one model for a given dataset; when two (or more) models exist, RANSAC may fail to find either of them.
VI. Applications
The RANSAC algorithm is often used in computer vision, for example to simultaneously solve the correspondence problem and estimate the fundamental matrix of a stereo camera pair.
VII. References
- Martin A. Fischler and Robert C. Bolles (June 1981). "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography". Comm. of the ACM 24: 381-395. doi:10.1145/358669.358692.
- David A. Forsyth and Jean Ponce (2003). Computer Vision, a Modern Approach. Prentice Hall. ISBN 0-13-085198-1.
- Richard Hartley and Andrew Zisserman (2003). Multiple View Geometry in Computer Vision (2nd ed.). Cambridge University Press.
- P. H. S. Torr and D. W. Murray (1997). "The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix". International Journal of Computer Vision 24: 271-300. doi:10.1023/A:1007927408552.
- Ondrej Chum (2005). "Two-View Geometry Estimation by Random Sample and Consensus". PhD thesis. http://cmp.felk.cvut.cz/~chum/teze/Chum-PhD.pdf
- Sunglok Choi, Taemin Kim, and Wonpil Yu (2009). "Performance Evaluation of RANSAC Family". In Proceedings of the British Machine Vision Conference (BMVC). http://www.bmva.org/bmvc/2009/Papers/Paper355/Paper355.pdf
VIII. External links
- RANSAC Toolbox for MATLAB. A research (and didactic) oriented toolbox for exploring the RANSAC algorithm in MATLAB. It is highly customizable and contains routines for solving a few relevant estimation problems.
- Implementation in C++ as a generic template.
- RANSAC for Dummies: a simple tutorial with many examples that uses the RANSAC Toolbox for MATLAB.
- 25 Years of RANSAC Workshop
IX. Remarks
This article also draws on Shen Lejun's article on the RANSAC (random sample consensus) algorithm, with source code and a tutorial. Ziv Yaniv has implemented RANSAC in C++; you can click here to download the source program.
However, if time permits, I plan to implement the RANSAC algorithm in C#, for two reasons:
(1) the best way to become familiar with an algorithm is to implement it;
(2) it would make RANSAC convenient for .NET users.
Thank you for your patience.