1. Introduction
Stereo matching is the key step in a three-dimensional reconstruction system and, as a non-contact measurement method, has important applications in industry and scientific research. To complete the matching and obtain a dense disparity map of the scene, we can construct an energy function that encodes the constraints of stereo matching. Finding the global optimum of a complex energy function is usually an NP-hard problem. Compared with other optimization approaches such as simulated annealing, gradient descent, and dynamic programming, the graph cut algorithm offers high accuracy and fast convergence, and it also matches better in regions affected by illumination variation, weak texture, and similar difficulties.
2. Graph Cut Algorithm
Many problems in computer vision can be cast as labeling problems. In stereo matching, computing the disparity amounts to assigning each pixel a discrete label within the search range. The optimal discrete labeling can be found by minimizing an energy function. Graph cuts are a class of algorithms for this kind of energy minimization and are widely used in computer vision, for example in segmentation, image restoration, and stereo matching.
Kolmogorov showed how energy minimization can be applied to the computation of stereo disparity. Stereo matching with the graph cut algorithm is usually divided into three steps: building the network graph, solving the graph cut, and generating the disparity map. Because of its global optimization property, the graph cut algorithm yields good dense disparity maps, but for high-resolution images its computational cost is too large. The usual way to reduce the computation is to segment the image first, which shrinks the network graph. However, automatic, non-interactive color image segmentation may split a region of constant disparity or hide some details of the image, which causes segmentation errors. Eliminating these errors requires additional methods, such as an initial disparity estimate, but such methods increase the overall complexity of the stereo matching algorithm and still make no effective use of the segmentation information.
To obtain a fine disparity map of the region of interest in practical applications, this paper proposes a stereo matching method based on image segmentation. It addresses the shortcomings of previous segmentation-based stereo matching algorithms, namely their large computational cost and their failure to make full use of the segmentation results. The method extracts the object of interest with interactive graph cut segmentation and performs stereo matching only for that object, so the computation is greatly reduced while the global optimality of the original graph cut algorithm is preserved.
2.1 Energy Function
When stereo matching is solved with the graph cut algorithm, the network graph must be put in correspondence with the energy function. The energy function is defined as:
$$E(f) = E_{\mathrm{data}}(f) + E_{\mathrm{occ}}(f) + E_{\mathrm{smooth}}(f) + E_{\mathrm{unique}}(f) \qquad (1)$$
The four terms of the energy function are the data term, the occlusion term, the smoothness term, and the uniqueness term. Each term captures one aspect of the matching problem.
- 1) Data term
The data term drives the algorithm toward the best pixel matches: the higher the color similarity between matched pixels, the smaller the value of the data term.
where the function D(a) measures the similarity between the matched pixels p and q, and a = (p, q) denotes an assignment; the data term constraint takes effect only when p and q are matched. D(a) can be expanded by the following formula:
- 2) Occlusion term
The occlusion term also serves to maximize the number of matched pixels. Each occluded pixel is charged a penalty coefficient, so the fewer occluded pixels there are, the lower the value of the occlusion term.
- 3) Smoothness term
The smoothness term mainly measures the similarity of neighboring pixels, especially their color similarity: for a pixel p, its adjacent pixels p1 and p2 should have the same disparity.
The smoothness term is generally a piecewise function, which can be built from a distance measure. The lower the value of the smoothness term, the closer the disparities of neighboring pixels.
- 4) Uniqueness term
The uniqueness term enforces the uniqueness constraint of stereo matching: if a point has more than one match, the penalty is set to infinity. Its mathematical expression is given by the following formula:
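As a minimal sketch of how the four terms combine, the Python below evaluates the energy of a candidate matching on a single scanline pair. The concrete forms are assumptions made for illustration only: a squared intensity difference for D(a), a constant penalty `K_OCC` per unmatched left pixel, a truncated linear (piecewise) penalty `LAMBDA * min(|d1 - d2|, T)` on the disparities of adjacent pixels, and an infinite penalty when a right pixel is matched more than once. The constants and the function name `energy` are hypothetical.

```python
import math

K_OCC = 10.0   # assumed occlusion penalty per unmatched left pixel
LAMBDA = 2.0   # assumed smoothness weight
T = 2          # assumed truncation of the piecewise smoothness penalty


def energy(left, right, match):
    """Energy of `match`, where match[p] is the right-image index matched
    to left pixel p, or None if p is occluded."""
    matched = [(p, q) for p, q in enumerate(match) if q is not None]

    # Uniqueness term: infinite if any right pixel is used more than once.
    targets = [q for _, q in matched]
    if len(targets) != len(set(targets)):
        return math.inf

    # Data term: squared intensity difference of each matched pair.
    e_data = sum((left[p] - right[q]) ** 2 for p, q in matched)

    # Occlusion term: constant penalty for every unmatched left pixel.
    e_occ = K_OCC * sum(1 for q in match if q is None)

    # Smoothness term: truncated linear penalty on neighbouring disparities.
    e_smooth = 0.0
    for (p1, q1), (p2, q2) in zip(matched, matched[1:]):
        if p2 == p1 + 1:  # only adjacent left pixels interact
            d1, d2 = p1 - q1, p2 - q2
            e_smooth += LAMBDA * min(abs(d1 - d2), T)

    return e_data + e_occ + e_smooth
```

For example, with `left = [10, 20, 30, 40]` and `right = [20, 30, 40, 50]`, the matching `[None, 0, 1, 2]` pays only one occlusion penalty, while `[0, 0, 1, 2]` violates uniqueness and has infinite energy.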
2.2 Network Flow
(i) Maximum flow
A directed graph with a source s and a sink t is called a network. Let f be a non-negative function defined on the edge set E, and let $f_{ij}$ denote the value of f on the arc $e = (v_i, v_j)$, that is, the flow from $v_i$ to $v_j$ along e; f is then called a network flow. A network flow satisfies the following two conditions:
1. The flow $f_{ij}$ does not exceed the capacity $c_{ij}$ of the arc: $0 \le f_{ij} \le c_{ij}$.
2. For any vertex $v_i$ other than s and t, the flow into $v_i$ equals the flow out of $v_i$, that is, $\sum_j f_{ji} = \sum_j f_{ij}$.
Among all network flows that satisfy these conditions, one with the largest value is called a maximum flow.
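The two conditions can be checked mechanically. The sketch below is illustrative only and assumes a network stored as a dictionary from arcs `(u, v)` to numbers; the function names `is_feasible_flow` and `flow_value` are hypothetical.

```python
def is_feasible_flow(capacity, flow, source, sink):
    """capacity, flow: dicts mapping (u, v) arcs to numbers."""
    # Condition 1: 0 <= f_ij <= c_ij on every arc.
    for arc, f in flow.items():
        if f < 0 or f > capacity.get(arc, 0):
            return False
    # Condition 2: inflow equals outflow at every vertex except s and t.
    balance = {}
    for (u, v), f in flow.items():
        balance[u] = balance.get(u, 0) - f
        balance[v] = balance.get(v, 0) + f
    return all(b == 0 for node, b in balance.items()
               if node not in (source, sink))


def flow_value(flow, source):
    """Net flow leaving the source; a maximum flow maximizes this value."""
    out = sum(f for (u, _), f in flow.items() if u == source)
    back = sum(f for (_, v), f in flow.items() if v == source)
    return out - back
```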
(ii) Minimum cut
An s-t cut of the network partitions the vertex set into two parts, one containing s and the other containing t. The cost of a cut is the sum of the capacities of all edges it severs, and the cut with the smallest cost is called the minimum cut. Let X and Y be the two parts of the vertex set V, let (x, y) denote an edge from a vertex x in X to a vertex y in Y, and let c(x, y) denote its weight. Then for a graph G = (V, E), the cost of a cut can be expressed as:
$$C(X, Y) = \sum_{x \in X,\; y \in Y,\; (x, y) \in E} c(x, y)$$
Ford and Fulkerson proved the equivalence of maximum flow and minimum cut as early as 1962. By computing the maximum flow of the network, which is equivalent to finding its minimum cut, we obtain the global minimum of the energy function that corresponds to this cut. A notable contribution is the energy minimization method proposed by Boykov, which is based on graph cut theory. It introduces two kinds of large moves on the labeling, expansion moves and swap moves, and proves that the local minimum found by the expansion algorithm is within a known constant factor of the global minimum, while the swap algorithm can handle more general forms of energy function. This paper uses the expansion move algorithm.
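The equivalence of maximum flow and minimum cut can be checked on a small network. The sketch below uses the Edmonds-Karp variant of the augmenting-path method (breadth-first search for each augmenting path), chosen here for brevity; it is not claimed to be the solver used in this paper, and the function name `max_flow_min_cut` is hypothetical.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """capacity: dict {(u, v): c}. Returns (flow value, list of cut edges)."""
    # Residual capacities, with a zero-capacity reverse arc for each arc.
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def bfs():
        """Parents of all vertices reachable from the source in the
        residual graph."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    total = 0
    while True:
        parent = bfs()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        # Walk back from the sink to recover the augmenting path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[e] for e in path)  # bottleneck capacity
        for u, v in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        total += push

    # Vertices still reachable from the source form one side of a minimum cut.
    reachable = set(parent)
    cut = [(u, v) for (u, v) in capacity
           if u in reachable and v not in reachable]
    return total, cut
```

On the returned cut, the sum of the original capacities equals the flow value, which is the statement of the max-flow/min-cut theorem.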
3. Construction of the Stereo Matching Network Graph
To perform stereo matching with the graph cut algorithm, we first construct the network graph, which consists of the nodes of a grid and the directed edges connecting them. The source s and the sink t are two special nodes. The edges are of two kinds: disparity edges and smoothness edges. The disparity edges correspond to the data term of the energy function (equation (1)), and the smoothness edges correspond to its smoothness term.
The grid is constructed as follows:
1. Establish a three-dimensional coordinate system O-xyz and place the image in the Oxy plane, so that the origin and the x- and y-axes of the image coincide with the origin and the corresponding axes of the Oxy plane.
2. On the positive half of the z-axis, place the points l1, l2, ..., ln at equal spacing, with l1 at the origin O. Place q0 at l1; for i = 1, 2, ..., n-1, place qi at the midpoint of li and li+1; and finally place qn at ln.
At this point, the pixels p = (px, py) in the Oxy plane together with the points q0, q1, ..., qn on the positive z-axis form a cubic grid. For i = 1, 2, ..., n-1, each interval [qi, qi+1] on the z-axis contains exactly one point, li+1. A node of the grid is written (p, qi) := (px, py, qi), and N(p) denotes the neighborhood of the pixel p. Two nodes are added at the two ends of the network, namely the source s and the sink t. An edge is added from s to each pixel of the left image I1 that is marked as foreground in the left-view segmentation template (Figure 1), and an edge to t is added from each corresponding node on the face of the grid opposite the Oxy plane. This yields a graph G = <V, E>, that is:
The capacities of the edges in the network graph are as follows:
- (1) The capacity of the edges connected to the source and the sink is:
- (2) The capacity of a disparity edge is:
The disparity edges correspond to the data term of the energy function, that is, the first term of equation (1). For a color image, the three RGB channels are processed separately and then combined by a weighted average, which retains the color information and makes the result more accurate. To improve accuracy further, this paper adds sub-pixel information using linear nearest-neighbor interpolation. The above formula can then be extended to:
The weights of the color channels are preferably 0.29, 0.11, and 0.58, or 0.33 for each channel.
- (3) The capacity of a smoothness edge, where p and q are two adjacent pixels of the image, is:
This completes the construction of the network graph.
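The grid construction described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the capacities are supplied by a hypothetical callback `data_cost(p, label)` and a constant `smooth_cost`, source and sink links are given infinite capacity so that they are never cut, and smoothness edges connect 4-neighbours at the same level. Only pixels marked as foreground in the segmentation mask receive nodes.

```python
import math


def build_network(mask, n_labels, data_cost, smooth_cost):
    """Return {(u, v): capacity} for the grid restricted to mask pixels.

    mask[y][x] is True for foreground pixels. Nodes are (x, y, i) with
    i = 0..n_labels indexing the levels q_0..q_n, plus "s" and "t".
    """
    edges = {}
    height, width = len(mask), len(mask[0])
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # background pixels contribute no nodes
            # Source and sink links (assumed uncuttable).
            edges[("s", (x, y, 0))] = math.inf
            edges[((x, y, n_labels), "t")] = math.inf
            # Disparity edges along z: the edge from level i to level i + 1
            # spans the interval that contains label i + 1.
            for i in range(n_labels):
                edges[((x, y, i), (x, y, i + 1))] = data_cost((x, y), i + 1)
            # Smoothness edges to the right and lower foreground neighbours,
            # in both directions, at every level.
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height and mask[ny][nx]:
                    for i in range(n_labels + 1):
                        edges[((x, y, i), (nx, ny, i))] = smooth_cost
                        edges[((nx, ny, i), (x, y, i))] = smooth_cost
    return edges
```

With a 2 x 2 image and three labels, a mask containing three foreground pixels produces 31 edges, against 52 for the full image, which illustrates how restricting the graph to the object of interest shrinks the problem.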
4. Image Segmentation Based on the Graph Cut Algorithm
This paper uses the graph cut algorithm as its basic framework and adopts a segmentation-based method to perform stereo matching on objects of interest. Because the color image segmentation algorithm affects the results of the subsequent stereo matching, choosing an appropriate segmentation algorithm is very important.
Automatic, non-interactive segmentation methods may split a region of constant disparity or hide some details of the image, which causes errors. Eliminating these errors requires additional methods, such as a local matching algorithm that provides an initial disparity estimate for the segmentation template, but such methods increase the overall complexity of the stereo matching algorithm and do not use the segmentation information effectively. This paper therefore uses graph-cut-based image segmentation both to segment the image and to construct the stereo matching network graph.
For the image segmentation problem, we define an energy function of the following form:
Traditional graph-cut-based image segmentation maps the problem to a maximum flow/minimum cut problem on the corresponding weighted graph. This works well for interactive segmentation of simple, low-resolution images, but its computational complexity is high and its memory cost is large. To speed up segmentation and make it suitable for high-resolution images, the image segmentation is integrated into the stereo matching process. This paper follows the method in [22], which adds auxiliary nodes and uses a new form of energy function:
This speeds up segmentation and reduces the amount of computation. In formula (5), the data term is built from the unnormalized histograms of the foreground object and the background, and the smoothness term contains a parameter equal to the mean value of ⊿i over the whole image. This method reduces the computation time of the graph cut and produces very accurate segmentation results, as shown below (the blue seed points mark the background and the red seed points mark the foreground):
[Figure panels: Baby1 left view seed point settings; left view segmentation result; Baby1 right view seed point settings; right view segmentation result]
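Returning to the smoothness term of the segmentation energy, the sketch below shows one common way to normalize neighbour differences by their image-wide mean. It assumes the contrast-sensitive form w = exp(-Δ² / (2·mean(Δ²))) that is widely used in graph-cut segmentation; whether formula (5) takes exactly this form is an assumption, and the function name `smoothness_weights` is hypothetical.

```python
import math


def smoothness_weights(image):
    """image: 2-D list of grey levels. Returns {(p, q): weight} for
    horizontally and vertically adjacent pixels p, q = (x, y)."""
    height, width = len(image), len(image[0])
    diffs = {}
    for y in range(height):
        for x in range(width):
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height:
                    d = image[y][x] - image[ny][nx]
                    diffs[((x, y), (nx, ny))] = d * d
    if not diffs:
        return {}  # a single pixel has no neighbours
    # Mean squared difference over all neighbouring pairs in the image.
    mean_sq = sum(diffs.values()) / len(diffs)
    if mean_sq == 0:
        return {pair: 1.0 for pair in diffs}  # flat image: uniform weights
    # Similar neighbours get weights near 1, strong edges get small weights.
    return {pair: math.exp(-d2 / (2.0 * mean_sq))
            for pair, d2 in diffs.items()}
```

Cutting between two pixels is therefore cheap across a strong intensity edge and expensive inside a uniform region, which encourages the segmentation boundary to follow object contours.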
5. Stereo Matching with the Graph Cut Algorithm
In stereo matching, computing the disparity map is equivalent to minimizing a global energy function, which is usually written in the form of the Greig energy function.
In Figure 1, s denotes the source and t denotes the sink. The disparity edges correspond to the data term of the energy function in formula (1), and the smoothness edges correspond to its smoothness term. Minimizing the energy function of formula (1) is equivalent to solving the minimum cut problem on the graph, which yields the globally optimal disparity map.
To reduce the computation of stereo matching, this paper obtains the object of interest and its segmentation template from the image segmentation result, constructs the network graph from the segmentation template, and then performs stereo matching with the graph cut algorithm, so that the segmentation information is used effectively. In summary, the algorithm consists of two major steps: (1) extraction of the object of interest, and (2) stereo matching on the network graph. The flowchart of the algorithm is shown in Figure 2:
Figure 2. Flowchart of the algorithm
Unlike the traditional method, which builds the network graph over every pixel, the proposed method restricts the graph to the object of interest. After the source and the sink are added at the two ends of the graph, an edge from the source is added only to the pixels marked as target in the left-view segmentation template, and an edge to the sink is added only from the corresponding nodes on the face of the cubic grid opposite the image plane. In this way the amount of computation is greatly reduced.
To further improve the matching results, this paper processes the three RGB channels of the color image separately and adds sub-pixel information in the horizontal direction of the image using linear nearest-neighbor interpolation. That is, formula (2) is expanded as:
Here the weights are those assigned to each channel of the color image.
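The channel-weighted, sub-pixel matching cost can be sketched as follows. This is only an illustration under assumptions: the per-channel dissimilarity is taken to be a squared difference, the weights default to 0.33 per channel, and sub-pixel positions in the right image are sampled by linear interpolation between the two nearest pixels along the horizontal direction. The names `sample` and `data_cost` are hypothetical.

```python
WEIGHTS = (0.33, 0.33, 0.33)  # per-channel weights; equal weighting assumed


def sample(row, x):
    """Linearly interpolate an RGB scanline at a fractional position x."""
    x = min(max(x, 0.0), len(row) - 1.0)  # clamp to the scanline
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    t = x - x0
    return tuple((1 - t) * a + t * b for a, b in zip(row[x0], row[x1]))


def data_cost(left_row, right_row, x, d, weights=WEIGHTS):
    """Weighted squared RGB difference between left pixel x and the
    right-image position x - d, where d may be fractional."""
    lp = left_row[x]
    rp = sample(right_row, x - d)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, lp, rp))
```

Evaluating `data_cost` at half-pixel disparities such as `d = 1.5` is what adds the sub-pixel information; the resulting values would serve as capacities of the disparity edges.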
The network graph is constructed by the method above and the corresponding weight is assigned to each edge. A maximum flow algorithm based on augmenting paths is then used to obtain the global minimum, which gives the optimal disparity assignment.
References
[16] Boykov Y, Kolmogorov V. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(9): 1124-1137.
[19] Roy S, Cox I J. A maximum-flow formulation of the N-camera stereo correspondence problem[C]//IEEE International Conference on Computer Vision, January 4-7, 1998, Bombay, India: 492-499.
[20] Hong L, Chen G. Segment-based stereo matching using graph cuts[C]//IEEE Conference on Computer Vision and Pattern Recognition, June 27-July 2, 2004, Washington DC, USA: 74-81.
[23] Tang M, Gorelick L, Veksler O, et al. GrabCut in one cut[C]//IEEE International Conference on Computer Vision, December 1-8, Sydney, Australia: 1769-1776.
[24] Wang N, Fan Y, Bao W, et al. Image matching algorithm based on graph cuts[J]. Acta Electronica Sinica, 2006, 34(2): 232-236.
[25] Scharstein D, Szeliski R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms[C]//Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001). IEEE, 2001: 131-140.
[28] Deng Y, Yang Q, Lin X, et al. A symmetric patch-based correspondence model for occlusion handling[C]//Proceedings of the IEEE International Conference on Computer Vision. 2005: 1316-1322, Vol. 2.
[29] Freeman W T. Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters[C]//Proceedings of the Ninth IEEE International Conference on Computer Vision. IEEE, 2003: 900.
[30] Kolmogorov V. Graph based algorithms for scene reconstruction from two or more views[D]. Cornell University, 2004.
[31] Kolmogorov V, Zabih R. Multi-camera scene reconstruction via graph cuts[M]//Computer Vision - ECCV 2002. Springer Berlin Heidelberg, 2002: 82-96.
Stereo matching method based on image segmentation