While organizing my blog recently, I found that I had written far too little about stereo matching, and that is something I spent a whole quarter researching! I read a large number of papers, ran basically every one that came with source code myself, and also improved several of the algorithms. Not writing any of this down would be a regret, so I plan to write more posts on stereo matching: first to share, second to ask for advice, and third as a memo for myself. This post introduces the CVPR 2013 paper "Segment-Tree based Cost Aggregation for Stereo Matching". I chose it for the following reasons:
1. It is a variant of NLCA.
2. It is a CVPR paper.
As before, this post analyzes Segment-Tree (ST) from three aspects: the idea behind the algorithm, the core of the algorithm, and the results of the algorithm. The link to the paper's source is (http://www.cv-foundation.org/openaccess/content_cvpr_2013)
1. Algorithmic thinking
Stereo matching is the problem I have been studying recently. I focused mainly on semi-global algorithms and on local algorithms with a non-global nature, such as the non-local cost aggregation (NLCA) introduced in the previous post. This post introduces a derivative of NLCA. No offense meant to the authors, ha! But Segment-Tree really is an improved version of NLCA. The idea of the paper is: segment the image, build an NLCA subtree for each segment, and then merge the subtrees of the segments with a greedy strategy. The core of the algorithm is its complex merging process.
However, this is not a plain image segmentation. It also uses the minimum spanning tree (MST) idea: while the image is being segmented, the MST structure of each segment is produced as well. Each subtree is then treated as a node, and another MST is built on top of those nodes, so "layered MST" might be the more fitting name! Since the algorithm is based on NLCA, in what way is ST better than NLCA? The authors' explanation is that NLCA builds only one MST over the whole graph, with edge weights derived simply from the gray-level difference, which is not principled enough. In texture-rich regions, for example, the MST can be built wrongly, and if you think about it, a poorly built MST naturally leads to inaccurate disparity estimates. ST uses a layered MST instead, which has a bit of a "coarse to fine" flavor. The paper illustrates this with a figure:
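To make the NLCA-style graph concrete, here is a minimal sketch of a 4-connected pixel grid whose edge weights are absolute gray-level differences, as described above. The helper name `grid_edges`, the row-major pixel indexing, and the tiny example image are my own assumptions for illustration, not the paper's code.

```python
def grid_edges(image):
    """Return (weight, p, q) edges of a 4-connected grid graph.

    `image` is a list of rows of gray levels; pixels are indexed row-major.
    The weight is the absolute gray-level difference between neighbours.
    """
    height, width = len(image), len(image[0])
    edges = []
    for y in range(height):
        for x in range(width):
            p = y * width + x
            if x + 1 < width:   # edge to the right neighbour
                edges.append((abs(image[y][x] - image[y][x + 1]), p, p + 1))
            if y + 1 < height:  # edge to the bottom neighbour
                edges.append((abs(image[y][x] - image[y + 1][x]), p, p + width))
    return edges


# Made-up 2x3 image: a dark area on the left, a bright column on the right.
image = [
    [10, 12, 200],
    [11, 13, 205],
]
edges = grid_edges(image)
```

Edges inside the dark area get small weights, while edges crossing into the bright column get large ones, which is the only information a single MST built on these weights has to work with.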
(a) is the original image, (b) is a locally enlarged patch, (c) is the ST weight map, and (d) is the NLCA weight map. A weight map shows the contribution of the surrounding points to the red point; the brighter a point, the higher its weight. It is clear that ST handles details significantly better than NLCA: in ST, point P1 lies in a low-texture region and its contribution to the high-texture region is very low, which is close to the ideal state. But ST has a flaw too, and the green triangle point shows where it lies. The reason is that ST, while generating the MST, actually generates an MST for each region, so a point's weights to other points are larger inside its own region and tend to be small across regions. How ST does this is explained in the next section.
2. Algorithm Core
The core of the paper is a single flowchart, and the algorithm flow is also the most puzzling part, so this section focuses on my understanding of it. The process is divided into three parts: initialization, aggregation, and connection.
In fact, the algorithm directly uses the global segmentation method from "Efficient Graph-Based Image Segmentation". The paper does not explain it much; it just keeps saying that you can consult that reference. Here is what each step means.
1. Initialization is very simple. Ignoring the edges for now, each pixel forms its own set T, which contains only that one pixel.
2. Aggregation is more complex. The edges are sorted by weight in ascending order and then traversed one by one. The two sets that an edge connects are either merged or not, and the criterion is whether the edge weight w_e satisfies the following condition:
w_e <= min(Int(Tp) + k/|Tp|, Int(Tq) + k/|Tq|)
If the two sets are merged, the edge is added to the set E'. In the end, E' holds the edges of the small trees. Because the next step treats each small tree as a node and reconnects the small trees into one tree, the edges in E' must be removed from E.
3. What remains is the connection step. As mentioned above, it treats each small tree as a node and joins them into a single tree. The method is similar to the previous step: the edges left in E are traversed, this time without any condition, and two small trees are merged whenever they are different. This continues until the number of edges in E' is exactly one less than the number of pixels. A tree has one edge fewer than it has nodes, so at that point the whole image corresponds to a single tree. Note that E' is not emptied before this step.
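The three steps above can be sketched as follows. This is a minimal illustration under my own assumptions, not the authors' implementation: the function name `build_segment_tree`, the union-find bookkeeping, and the 1-D example are hypothetical, and Int(T) is tracked as the largest edge weight merged into a tree so far.

```python
def build_segment_tree(num_pixels, edges, k):
    """Return (tree_edges, num_groups) for edges given as (weight, p, q).

    tree_edges plays the role of E'; num_groups is the number of small
    trees left after the aggregation step. k is the constant in k/|T|.
    """
    # Step 1, initialization: every pixel is its own one-node tree.
    parent = list(range(num_pixels))
    size = [1] * num_pixels          # |T| stored at each tree root
    internal = [0.0] * num_pixels    # Int(T) stored at each tree root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def merge(a, b, w):
        if size[a] < size[b]:
            a, b = b, a
        parent[b] = a                # the "parent(Vq) = parent(Vp)" step
        size[a] += size[b]
        internal[a] = max(internal[a], internal[b], w)

    tree_edges, leftover = [], []

    # Step 2, aggregation: Kruskal order plus the extra merging condition.
    for w, p, q in sorted(edges):
        a, b = find(p), find(q)
        if a == b:
            continue
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            merge(a, b, w)
            tree_edges.append((w, p, q))
        else:
            leftover.append((w, p, q))   # stays in E for the next step
    num_groups = num_pixels - len(tree_edges)

    # Step 3, connection: link the small trees without any condition.
    for w, p, q in leftover:
        if len(tree_edges) == num_pixels - 1:
            break
        a, b = find(p), find(q)
        if a != b:
            merge(a, b, w)
            tree_edges.append((w, p, q))
    return tree_edges, num_groups


# Made-up 1-D "image": two flat runs separated by a large jump.
gray = [10, 11, 12, 100, 101, 102]
edges = [(abs(gray[i] - gray[i + 1]), i, i + 1) for i in range(len(gray) - 1)]
tree_edges, num_groups = build_segment_tree(len(gray), edges, k=2.0)
```

With these numbers the aggregation step leaves two small trees (the dark run and the bright run), and the connection step then joins them with the single high-weight edge. If k is made large enough that the condition always holds, the aggregation step alone produces the ordinary Kruskal MST.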
That is the whole process. It raises a few questions:
1) The flowchart always merges Vp and Vq, so what happens to the merged set Vp,q?
2) Why is the image segmented after the aggregation step?
3) What do the terms in the judging condition mean?
4) What is the meaning of the threshold that the edge weight has to satisfy?
Here are my answers:
1). To answer this, think about how an MST is built. You will find only one difference between this flowchart and MST construction: there is one extra judging condition when an edge is chosen. Everything else is exactly the same. The authors deliberately split the flowchart into three parts to make it look complex, when they could simply have said "our algorithm is the normal Kruskal-based MST construction plus a judging condition on the edges". It is a bit funny, and it is a habit you see everywhere in domestic papers. Back to the point: Vp,q simply becomes one connected component, that's all. In code you just write parent(Vp) = parent(Vq). Put plainly, the authors forgot to mention the step Vp = Vp,q.
2). When a plain MST is built, as long as the two endpoints of an edge belong to different connected components, the two components are merged into one. A component here can be understood as a subtree. ST additionally looks at the difference between the two components that the edge connects: if the difference is small they are merged, and if it is large they are not. See? This naturally divides the image into regions, because some regions are never merged! To be honest, though, the authors keep stressing that ST has these excellent properties without giving any rigorous proof. It rests on intuition alone, which is a bit unreliable. At least when I computed disparity maps, I found that they did not reflect any advantage from the region segmentation. So I suspect there is some overselling here.
3). Int(Tp) in the condition is the definition of the intra-region distance, and k/|Tp| is an adjustment factor. This is borrowed directly from a class of image segmentation methods (graph-based image segmentation), which defines two distance measures: the inter-region distance and the intra-region distance. If the inter-region distance is greater than the intra-region distance, the two regions cannot be merged; otherwise they can. Because the edges are processed in ascending order of weight, the edge weight w in the formula is exactly the inter-region distance.
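As a small numeric illustration of this condition, the sketch below evaluates it for a few made-up regions. The function names `merge_threshold` and `should_merge` and all the numbers are assumptions chosen for the example only.

```python
def merge_threshold(internal, size, k):
    """Int(T) + k/|T|: intra-region distance plus the adjustment factor."""
    return internal + k / size


def should_merge(w, int_p, size_p, int_q, size_q, k):
    """True when the edge weight w (the inter-region distance) does not
    exceed the smaller of the two regions' thresholds."""
    return w <= min(merge_threshold(int_p, size_p, k),
                    merge_threshold(int_q, size_q, k))


# Two single pixels: k/|T| = 10 dominates, so a weight of 5 is accepted.
tiny = should_merge(5, 0, 1, 0, 1, k=10)

# Two large regions: thresholds are 2.1 and 3.2, so a weight of 5 is rejected.
large_far = should_merge(5, 2, 100, 3, 50, k=10)

# Same regions, but the connecting edge is as weak as their internal edges.
large_close = should_merge(2, 2, 100, 3, 50, k=10)
```

The adjustment factor k/|T| matters mostly for small regions; as a region grows, the decision is driven almost entirely by Int(T).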
(There are many blog posts online about graph-based image segmentation, so you can read up on it there. I found one titled "efficient graph-based Image segmentation paper Ideas" that gives a good introduction.)
4). A plain MST has no ability to tell image regions apart. The gap between some regions is obvious and between others it is not, but the MST treats them all the same. ST does not: it introduces a judging condition, so that regions which satisfy it are merged and regions which do not are kept apart. That is what the threshold is for.
3. Algorithm effect
The paper compares the disparity maps of the algorithm against NLCA and against filter-based methods such as the bilateral filter and the guided filter. On the standard dataset, the gap between ST and NLCA is really not very big. Comparisons on standard datasets also do not relate strongly to practice: an algorithm that scores well on Middlebury often performs poorly in a real application scenario... But ST is a cost aggregation method. Many global algorithms actually get their accuracy from the work they do in the disparity refinement stage, and the disparity produced by their cost aggregation step alone is often not good. ST can then be used for disparity refinement, and that is the biggest value of ST!
4. Conclusion
I studied ST mainly because it is an extension of NLCA and a non-traditional global algorithm. Its only difference from NLCA is that ST introduces a judging condition while building the MST, so that the tree can take the region information of the image into account. This is novel, and it shows that the authors have read a great many papers and have an impressive ability to combine ideas, here the MST and graph-based image segmentation. Although it runs longer than NLCA, it is still very fast compared with global algorithms. But the algorithm also has shortcomings. First, its handling of image region information is not very rigorous: the whole image is segmented with one and the same judging condition, so the segmentation cannot be very good. Tests on real data also always show "white hole" defects, which are likewise caused by the poorly introduced region information. Details are slightly better than with NLCA, but the running time has gone up. These are also the reasons why the paper is not cited much.
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.