Image segmentation (2) graph cut)

Source: Internet
Author: User

Image segmentation (2) graph cut)

Zouxy09@qq.com

Http://blog.csdn.net/zouxy09

 

In the previous article, we gave an overview of the main segmentation methods. Next, let's learn about several algorithms that are of interest. Next we will learn about grab cut in the next blog. The two are graph-based segmentation methods. In addition, opencv implements grab
Cut. For more information about the source code, see blog updates. The contact time is limited. If you have any mistakes, I hope your predecessors will correct me. Thank you.

Graph Cuts is a very useful and popular Energy Optimization algorithm. It is widely used in the computer vision field in image segmentation, stereo vision, and image
Matting.

This method associates the image segmentation problem with the min cut problem of the image. First, an undirected graph G = <V, E> is used to represent the image to be split. V and E are the vertex and edge sets respectively. The graph here is slightly different from the common graph. A common graph consists of vertices and edges. If the edges have a direction, such an image is called a directed graph. Otherwise, the edges are undirected graphs and the edges are of a valid value, different edges can have different weights, representing different physical meanings. While Graph
A cuts graph has two more vertices on the basis of a common graph. These two vertices are represented by the symbol "S" and "T" respectively, collectively referred to as the terminal vertex. All other vertices must be connected to these two vertices to form a part of the edge set. Therefore, graph
Cuts has two vertices and two edges.

The first vertex and edge are:The first common vertex corresponds to each pixel in the image. The connection between each two neighboring vertices (corresponding to each two neighboring pixels in the image) is an edge. This edge is also called N-links.

The second vertex and edge are:In addition to image pixels, there are two other terminal vertices, namely, S (Source: Source, source) and T (sink: sink, and convergence ). Each common vertex is connected to the two terminal vertices to form the second edge. This edge is also called T-links.

It is the s-t graph corresponding to an image. Each pixel corresponds to a corresponding vertex in the graph, and there are also S and T vertices. There are two kinds of edges. The solid edges represent the edge N-links connected to common vertices in every two neighboring regions. The dotted edges represent the T-links of each common vertex connected to S and T. In the forward and backward scenes division, s generally indicates the foreground target, and t generally indicates the background.

In the figure, each edge has a non-negative weight we, which can also be understood as cost (cost or cost ). A cut is a subset of E in the graph. The cut cost (represented as | c |) is the sum of the weights of all edges in the subset C.

The cuts in graph cuts refers to a set of such edges. Obviously, these edge sets include the above two edges, the disconnection of all edges in the set will lead to the separation of the residual "S" and "T" graphs, so it is called "cut ". If a cut has the smallest sum of its edge ownership values, it is called the minimum cut, that is, the result of graph cutting. The Ford-rich-ksson theorem shows that the maximum stream max of the Network
The flow is equal to the min cut. Therefore, the max-flow/min-Cut Algorithm invented by boykov and Kolmogorov can be used to obtain the minimum cut of the S-t graph. This minimal cut divides the vertex of the graph into two non-Intersecting subsets S and T, where S
ε s, tε T and S ∪ T = V
. These two subsets correspond to the foreground pixel set and background pixel set of the image, which is equivalent to completing image segmentation.


That is to say, the edge weight determines the final splitting result. How can we determine the edge weight?

Image segmentation can be seen as pixel labeling. The label of the target (S-node) is set to 1, and the label of the background (t-node) is set to 0, this process can be achieved by minimizing graph cut to minimize the energy function. Obviously, the cut that occurs at the boundary between the target and the background is what we want (which is equivalent to splitting the back scene and target connection in the image ). At the same time, the energy should be the smallest. Assume that the label of the entire image (the label of each pixel) is L =
{L1, L2, LP}, where Li is 0 (background) or 1 (target ). If the image is split to L, the energy of the image can be expressed:

E (L) = AR (l) + B (l)

R (l) is the region term, B (l) is the boundary term, and A is an important factor between the region term and the boundary term, determines the impact of these factors on energy. If a is 0, only the boundary and region are considered. E (l) indicates the weight, that is, the loss function, or the energy function. The goal of graph cutting is to optimize the energy function to minimize its value.

Region item:

In this example, RP (LP) indicates the penalty for assigning label LP to pixel P, RP (LP) the weight of the energy item can be obtained by comparing the gray scale of the pixel P and the gray histogram of the given target and foreground. In other words, the pixel P belongs to the tag LP probability, I want the pixel P to be allocated as the tag LP with the highest probability. At this time, we want the minimum energy, so we generally take the negative logarithm of the probability, so the T-link weight is as follows:

RP (1) =-ln PR (IP | 'obj '); RP (0) =-ln PR (IP | 'bkg ')

The preceding two formulas show that when the gray-scale value of pixel P belongs to the target, the probability PR (IP | 'obj ') is greater than the background PR (IP | 'bkg '), then RP (1) is smaller than RP (0). That is to say, when pixel P is more likely to belong to the target, classifying P as the target will make the energy R (l) Small. If all pixels are correctly divided into targets or backgrounds, then the minimum energy is used.

Border item:

P and q are neighboring pixels, and boundary smoothing items mainly represent the boundary attribute of dividing L. B <p, q> can be resolved as discontinuous punishments between pixels p and q, generally, if p and q are more similar (such as their gray scale), B <p, q> is larger. If they are very different, B <p, q> is close to 0. In other words, if there is a small difference between two neighboring pixels, it is very likely to belong to the same target or the same background. If they are very different, it indicates that these two pixels are likely to be at the edge of the target and the background, and the split is more likely. Therefore, when the difference between the two neighboring pixels is greater, B <p, q> the smaller the value, the smaller the energy.


Now let's sum up: Our goal is to divide an image into two different parts: the target and the background. We use the image segmentation technology to achieve this. First, an image consists of vertices and edges with the edge value. Then we need to build a graph with two types of vertices, two types of vertices and two types of weights. A common vertex is composed of each pixel of the image, and an edge exists between two neighboring pixels. Its weight is determined by the "boundary smoothing energy item" mentioned above. There are also two terminal vertex S (target) and T (background), each common vertex and S are connected, that is, edge, the edge weight is determined by the "region energy item" RP (1). the weight of each edge connected to T is determined by the "region energy item" RP (0). In this way, the weights of all edges can be determined, that is, the graph is determined. At this time, you can use min
The cut algorithm is used to find the smallest cut. This min cut is the set of weights and the smallest edge. The disconnections between these edges can split the target and the background, that is, min cut corresponds to the minimization of energy. The min cut and max flow of the graph are equivalent, so you can use Max
Flow algorithm to find the min cut of the S-t graph. The current algorithms mainly include:

1) Goldberg-Tarjan

2) Ford-Fulkerson

3) Improved Algorithms for appeal

 

Weight:

Graph cut's 3x3 image segmentation: We take two seed points (that is, manually specify two pixels that belong to the target and the background respectively), and then we create a graph, the width of the edge indicates the size of the corresponding weight value, and then finds the combination of the Weight Value and the smallest edge, that is, cut in (c), which completes the image segmentation function.

For more information, see:

Interactive graph cuts for Optimal Boundary & region segmentation of objects in N-D Images
Segmentation.

In boykov
The home pages of Kolmogorov and Kolmogorov have a lot of code. Including maxflow/min-cut and stereo algorithms:

Http://pub.ist.ac.at /~ Vnk/software.html

Http://vision.csd.uwo.ca/code/

Cornell University's graphcuts research homepage also has a lot of information:

Http://www.cs.cornell.edu /~ RDZ/graphcuts.html

Image segmentation: A Survey of graph-cut methods (faliu Yi, icsai 2012)

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.