Image Segmentation (III): From Graph Cut to GrabCut

Last Update:2016-12-01 Source: Internet

Author: User

http://blog.csdn.net/zouxy09

The previous article introduced Graph Cut. This one covers GrabCut, an improved version that applies Graph Cut iteratively. The GrabCut implementation in OpenCV follows the paper "GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts. The algorithm uses both the texture (color) information and the boundary (contrast) information in the image, so a small amount of user interaction is enough to get a good segmentation. Let's look at some of the details of the paper. A walkthrough of the OpenCV GrabCut source code follows in the next post. I have only worked with this material for a short time, so if you find errors, corrections are welcome. Thank you.

GrabCut came out of Microsoft Research. Its main use is segmentation and matting (cutting an object out of an image). As I see it, its selling points are:

(1) You only need to draw a box around the target, and it already produces a good segmentation.

(2) With additional user interaction (the user marks some pixels as belonging to the target), the result improves further.

(3) Its border matting technique makes the segmentation boundary of the target look more natural.

Of course it also has weaknesses. First, no algorithm works everywhere, and GrabCut is no exception: if the background is complex, or the background and the target look very similar, the segmentation is not very good. Second, it is somewhat slow, although many speed-up improvements have since been proposed.

Having seen what it can do, the natural questions are: how are these results achieved, and how does GrabCut differ from Graph Cut?

(1) Graph Cut models the target and the background with gray-level histograms; GrabCut replaces them with Gaussian mixture models (GMMs) over the three RGB channels;

(2) Graph Cut minimizes the energy (segments the image) in a single pass; GrabCut replaces this with an iterative process that alternates between segmentation estimation and model parameter learning;

(3) Graph Cut requires the user to specify seed points for both the target and the background, whereas GrabCut only needs a set of background pixels. In other words, you only draw a box around the target; every pixel outside the box is treated as background, and that is enough to fit the GMMs and obtain a good segmentation. That is, GrabCut allows incomplete labelling.

1. Color model

We work in RGB color space and model the target and the background each with a full-covariance GMM (Gaussian mixture model) of K Gaussian components (typically K = 5). This introduces an extra vector k = {k1, ..., kn, ..., kN}, where kn ∈ {1, ..., K} is the Gaussian component assigned to the nth pixel. Each pixel comes either from one Gaussian component of the target GMM or from one Gaussian component of the background GMM.

So the Gibbs energy for the entire image is (Formula 7):
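For reference, equations (7) to (10) of the GrabCut paper can be transcribed into LaTeX as follows, using the notation of this section: z_n is the RGB value of pixel n, α_n ∈ {0, 1} its label, and k_n its GMM component. This is a transcription from the paper, not the figure of the original post.

```latex
\begin{align}
E(\alpha, k, \theta, z) &= U(\alpha, k, \theta, z) + V(\alpha, z) \tag{7} \\
U(\alpha, k, \theta, z) &= \sum_{n} D(\alpha_n, k_n, \theta, z_n) \tag{8} \\
D(\alpha_n, k_n, \theta, z_n) &= -\log \pi(\alpha_n, k_n)
    + \tfrac{1}{2} \log\det\Sigma(\alpha_n, k_n) \notag \\
  &\quad + \tfrac{1}{2}\,[z_n - \mu(\alpha_n, k_n)]^{\top}
    \Sigma(\alpha_n, k_n)^{-1} [z_n - \mu(\alpha_n, k_n)] \tag{9} \\
\theta &= \{\pi(\alpha, k),\ \mu(\alpha, k),\ \Sigma(\alpha, k);\
    \alpha = 0, 1;\ k = 1 \dots K\} \tag{10}
\end{align}
```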

Here U is the region term. As in the previous article, it expresses the penalty for classifying a pixel as target or background, i.e. the negative logarithm of the probability that the pixel belongs to the target or to the background. Recall that a Gaussian mixture density has the following form:
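In standard textbook notation, the density of a Gaussian mixture with K components over a d-dimensional vector x (d = 3 for RGB) can be written as below. The names p and g are chosen here for illustration, and in the second line (2π)^d uses the constant π, not a mixture weight.

```latex
\begin{align}
p(x) &= \sum_{i=1}^{K} \pi_i \, g(x; \mu_i, \Sigma_i),
  \qquad \sum_{i=1}^{K} \pi_i = 1, \quad 0 \le \pi_i \le 1 \\
g(x; \mu, \Sigma) &= \frac{1}{\sqrt{(2\pi)^{d} \det\Sigma}}
  \exp\!\left( -\tfrac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
\end{align}
```

Taking the negative logarithm of a single weighted component π_k g(x; μ_k, Σ_k) and dropping the constant (d/2) log 2π leaves three terms: −log π_k, ½ log det Σ_k, and half the squared Mahalanobis distance.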

Taking the negative logarithm therefore gives the form in formula (9). The GMM parameters θ are of three kinds: the weight π of each Gaussian component, the mean vector μ of each component (a three-element vector, since there are three RGB channels), and the covariance matrix Σ of each component (a 3x3 matrix, for the same reason), as in formula (10). These three sets of parameters must be determined both for the GMM describing the target and for the GMM describing the background. Once they are known, the RGB value of a pixel can be plugged into the target GMM and the background GMM to get the probability that the pixel belongs to the target and to the background respectively. That fixes the region term of the Gibbs energy, i.e. the t-link weights of the graph. What about the n-link weights, that is, how do we compute the boundary term V?
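As a minimal sketch of the region term, the NumPy code below evaluates −log π_k + ½ log det Σ_k + ½ (z − μ_k)ᵀ Σ_k⁻¹ (z − μ_k) for every pixel and every component of one GMM. The function name `data_term`, the two-component GMMs and all numeric values are assumptions made up for illustration; they do not come from the paper or from OpenCV.

```python
import numpy as np

def data_term(pixels, weights, means, covs):
    """Return D[n, k] = -log(pi_k) + 0.5*log det(Sigma_k) + 0.5*(squared Mahalanobis
    distance) for every pixel n and every component k of one GMM.

    pixels: (N, 3) RGB values; weights: (K,); means: (K, 3); covs: (K, 3, 3)
    """
    pixels = np.asarray(pixels, dtype=float)
    n_components = len(weights)
    D = np.empty((pixels.shape[0], n_components))
    for k in range(n_components):
        diff = pixels - means[k]                          # (N, 3)
        inv = np.linalg.inv(covs[k])
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)  # squared Mahalanobis distance
        _, logdet = np.linalg.slogdet(covs[k])
        D[:, k] = -np.log(weights[k]) + 0.5 * logdet + 0.5 * maha
    return D

# Hypothetical GMMs with K = 2 for brevity (the article uses K = 5)
fg_weights = np.array([0.5, 0.5])
fg_means = np.array([[200.0, 30.0, 30.0], [220.0, 120.0, 40.0]])   # reddish, orange
bg_weights = np.array([0.7, 0.3])
bg_means = np.array([[30.0, 30.0, 200.0], [40.0, 160.0, 60.0]])    # bluish, greenish
covs = np.stack([np.eye(3) * 100.0] * 2)   # same isotropic covariance for every component

pixels = np.array([[205.0, 35.0, 28.0],    # close to the first target component
                   [35.0, 150.0, 70.0]])   # close to the second background component

D_fg = data_term(pixels, fg_weights, fg_means, covs)
D_bg = data_term(pixels, bg_weights, bg_means, covs)

# Component assignment: k_n is the component with the smallest D
k_fg = D_fg.argmin(axis=1)
k_bg = D_bg.argmin(axis=1)

# Region penalties: cost of labelling each pixel as target / as background
cost_target = D_fg.min(axis=1)
cost_background = D_bg.min(axis=1)
print(k_fg, k_bg)
print(cost_target < cost_background)   # True where the target label is cheaper
```

These two penalties per pixel are what the t-links of the graph carry. How exactly they are wired to the two terminals is an implementation choice and is not shown here.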

The boundary term is much the same as in Graph Cut. It is a penalty for discontinuity between neighboring pixels m and n. If two neighboring pixels differ very little, they most likely belong to the same target or the same background; if they differ a lot, they probably sit on the edge between target and background and are more likely to be separated. So the larger the difference between two neighboring pixels, the smaller the energy. In RGB space the dissimilarity of two pixels is measured by the Euclidean distance (the 2-norm). The parameter β is determined by the contrast of the image. In a low-contrast image, even pixels m and n that genuinely differ have a small ||zm - zn||, so a larger β is needed to amplify the difference. In a high-contrast image, even pixels m and n of the same target can have a fairly large ||zm - zn||, so a smaller β is needed to shrink it. This lets V work properly whether the contrast is high or low. The constant γ is 50 (a good value that the authors obtained by training on 15 images). With formula (11) the n-link weights are now determined as well, so the graph we need is complete and we can run the segmentation.
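In the paper, formula (11) reads V(α, z) = γ Σ [αn ≠ αm] exp(−β ||zm − zn||²), summed over pairs of neighboring pixels, with β = 1 / (2 ⟨||zm − zn||²⟩), where ⟨·⟩ is the average over the image. The sketch below computes β and the resulting n-link weights with NumPy. It assumes a 4-neighborhood (right and lower neighbor only) to stay short; the function name `nlink_weights` and the toy image are illustrative.

```python
import numpy as np

def nlink_weights(image, gamma=50.0):
    """Compute beta and the n-link weights gamma * exp(-beta * ||z_m - z_n||^2)
    for horizontal and vertical neighbour pairs of an (H, W, 3) image."""
    z = np.asarray(image, dtype=float)
    # Squared RGB distance between each pixel and its right / lower neighbour
    d_right = np.sum((z[:, 1:] - z[:, :-1]) ** 2, axis=2)   # (H, W-1)
    d_down = np.sum((z[1:, :] - z[:-1, :]) ** 2, axis=2)    # (H-1, W)
    # beta = 1 / (2 * <||z_m - z_n||^2>), averaged over all neighbour pairs
    mean_sq = (d_right.sum() + d_down.sum()) / (d_right.size + d_down.size)
    beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    return beta, gamma * np.exp(-beta * d_right), gamma * np.exp(-beta * d_down)

# Toy 4x4 image: left half dark, right half bright, so the only edge is vertical
image = np.zeros((4, 4, 3))
image[:, 2:] = 200.0

beta, w_right, w_down = nlink_weights(image)
beta_low, _, _ = nlink_weights(image * 0.1)   # same picture at one tenth of the contrast

print(w_right[0])        # first row: the middle weight (across the edge) is the smallest
print(beta_low / beta)   # the low-contrast image gets the larger beta
```

Because β rescales the squared differences by their image-wide average, the n-link across the edge gets the same weight in both versions of the toy image, which is the contrast adaptation described above.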

2. Iterative Energy Minimization Segmentation algorithm

Graph Cut minimizes the energy once; GrabCut minimizes it iteratively. Each iteration improves the GMM parameters that model the target and the background, and with them the segmentation. The algorithm from the paper illustrates this directly:

2.1. Initialization
(1) The user obtains an initial trimap T by drawing a box around the target: all pixels outside the box are background pixels TB, and all pixels inside the box, TU, are "possibly target" pixels.

(2) For each pixel n in TB, initialize its label αn = 0 (background pixel); for each pixel n in TU, initialize its label αn = 1 ("possibly target" pixel).

(3) After these two steps we have a set of pixels labelled target (αn = 1) and the remaining pixels labelled background (αn = 0), and we can estimate the target GMM and the background GMM from them. The k-means algorithm clusters the target pixels and the background pixels into K classes each, one per Gaussian component of the GMM. Each Gaussian component then has its own set of pixel samples: its mean and covariance are estimated from their RGB values, and its weight is the ratio of the number of pixels in the component to the total number of pixels.
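The three initialization steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the OpenCV code: the image is synthetic, the box coordinates are made up, `kmeans` is a plain Lloyd iteration with random initial centers, and the small diagonal term added to each covariance is a choice of this sketch to keep it invertible.

```python
import numpy as np

def kmeans(points, k, n_iter=10, seed=0):
    """Plain Lloyd k-means; returns one cluster index per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # keep the old centre if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def fit_gmm(points, labels, k):
    """Estimate weight, mean and covariance of each component from its own pixels."""
    weights = np.zeros(k)
    means = np.zeros((k, 3))
    covs = np.zeros((k, 3, 3))
    for j in range(k):
        members = points[labels == j]
        if len(members) == 0:
            covs[j] = np.eye(3)              # placeholder for an empty component, weight 0
            continue
        weights[j] = len(members) / len(points)   # share of the pixels in this component
        means[j] = members.mean(axis=0)
        centered = members - means[j]
        # Small diagonal term keeps the covariance invertible (assumption of this sketch)
        covs[j] = centered.T @ centered / len(members) + 0.01 * np.eye(3)
    return weights, means, covs

# Synthetic 20x20 RGB image; the user's box covers rows/columns 5..14 (hypothetical)
rng = np.random.default_rng(1)
image = rng.normal(60.0, 10.0, size=(20, 20, 3))
image[5:15, 5:15] = rng.normal(190.0, 10.0, size=(10, 10, 3))

alpha = np.zeros((20, 20), dtype=int)   # T_B: outside the box, alpha = 0
alpha[5:15, 5:15] = 1                   # T_U: inside the box, alpha = 1

K = 5
models = {}
for label, name in ((1, "target"), (0, "background")):
    pts = image[alpha == label]         # (N, 3) pixels carrying this label
    comp = kmeans(pts, K)               # step (3): split into K clusters
    models[name] = fit_gmm(pts, comp, K)

print(models["target"][0].sum(), models["background"][0].sum())   # weights of each GMM
```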

2.2. Iterative Minimization

(1) Assign a GMM component to each pixel (for example, if pixel n is a target pixel, substitute its RGB value into every Gaussian component of the target GMM; the component with the highest probability is the one most likely to have generated n, and it becomes the component kn of pixel n): kn := arg min over kn of D(αn, kn, θ, zn).

(2) Learn the GMM parameters from the given image data z (step (1) assigned every pixel to a Gaussian component, so each component has its own set of pixel samples; its mean and covariance are estimated from the RGB values of these samples, and its weight is the ratio of the number of pixels in the component to the total number of pixels): θ := arg min over θ of U(α, k, θ, z).

(3) Estimate the segmentation (build a graph from the Gibbs energy terms analysed in section 1, compute the t-link and n-link weights, and segment with the max-flow/min-cut algorithm): minimize E(α, k, θ, z) over the labels αn of the pixels n in TU.

(4) Repeat steps (1) to (3) until convergence. After the segmentation in step (3), the assignment of pixels to the target GMM or the background GMM changes, so each pixel's kn changes and so do the GMMs; every iteration therefore jointly refines the GMMs and the segmentation. Moreover, each of steps (1) to (3) minimizes the energy with respect to one set of variables, so the energy never increases and the iteration is guaranteed to converge (at least to a local minimum).

(5) Apply border matting to smooth the segmentation boundary, along with any other post-processing.
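Step (3) can be tried out on a toy example. The sketch below labels a 1-D row of six pixels by min cut, using a small Edmonds-Karp max-flow written in plain Python. The region penalties and n-link weights are made-up numbers; a real implementation works on the 2-D pixel grid and uses a much faster max-flow solver.

```python
from collections import deque

def min_cut_labels(cost_target, cost_background, nlink):
    """Label a 1-D row of pixels by min cut. Returns 1 (target) or 0 (background) per pixel.

    cost_target[i] / cost_background[i]: region penalty for giving pixel i that label.
    nlink[i]: boundary penalty for separating pixel i from pixel i + 1.
    """
    n = len(cost_target)
    S, T = n, n + 1                       # source = target terminal, sink = background terminal
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]   # residual capacities
    for i in range(n):
        cap[S][i] = cost_background[i]    # this edge is cut when pixel i ends up background
        cap[i][T] = cost_target[i]        # this edge is cut when pixel i ends up target
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = nlink[i]

    def bfs_parents():
        """Breadth-first search from S over edges with remaining capacity."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: push flow along shortest augmenting paths until none is left
    while True:
        parent = bfs_parents()
        if T not in parent:
            break
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    reachable = bfs_parents()             # source side of the cut in the residual graph
    return [1 if i in reachable else 0 for i in range(n)]

# Pixel 2 is ambiguous: its region penalties slightly favour background (5 < 6),
# but it is strongly tied to pixel 1 (n-link 8) and only weakly to pixel 3 (n-link 1).
cost_target = [1, 1, 6, 9, 9, 9]
cost_background = [9, 9, 5, 1, 1, 1]
nlink = [8, 8, 1, 8, 8]

print(min_cut_labels(cost_target, cost_background, nlink))
print(min_cut_labels(cost_target, cost_background, [0] * 5))   # region term only
```

With the n-links in place, the cut falls on the weak link between pixels 2 and 3, so pixel 2 joins the target; with the region term alone it is labelled background.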

2.3. User edit (Interactive)

(1) Edit: manually fix some pixels as target or background pixels, then run step (3) of 2.2 once more;

(2) Refine: rerun the entire iterative algorithm. (Optional; in practice this serves as the undo function of a cutout program.)
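One common way to make user edits binding, sketched below, is to overwrite the region penalties of the fixed pixels before the next min cut: the forbidden label gets a very large cost and the required label a cost of zero. The trimap codes, the function name `constrained_costs` and the constant `hard` are assumptions of this sketch; `hard` only needs to exceed the total n-link weight around any pixel.

```python
import numpy as np

BACKGROUND, TARGET, UNKNOWN = 0, 1, 2   # hypothetical trimap codes for this sketch

def constrained_costs(cost_target, cost_background, trimap, hard=1e6):
    """Overwrite the region penalties of user-fixed pixels so the cut cannot relabel them."""
    cost_target = np.array(cost_target, dtype=float)          # copies, inputs stay untouched
    cost_background = np.array(cost_background, dtype=float)
    trimap = np.asarray(trimap)
    fixed_t = trimap == TARGET
    fixed_b = trimap == BACKGROUND
    cost_target[fixed_t], cost_background[fixed_t] = 0.0, hard    # must stay target
    cost_target[fixed_b], cost_background[fixed_b] = hard, 0.0    # must stay background
    return cost_target, cost_background

# The GMM penalties favour background for pixel 1, but the user marked it as target
cost_t, cost_b = constrained_costs(
    cost_target=[2.0, 7.0, 3.0],
    cost_background=[6.0, 1.0, 4.0],
    trimap=[UNKNOWN, TARGET, BACKGROUND],
)
print(cost_t, cost_b)
```

Pixels still marked as unknown keep their GMM-based penalties, so only they can change label in the next cut.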

In short, the key idea is that the probability density models of the target and background and the image segmentation are optimized alternately, each iteration refining the other. For more details, please refer to the original paper:

"GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts

http://research.microsoft.com/en-us/um/people/ablake/papers/ablake/siggraph04.pdf

OpenCV implements this algorithm (without the subsequent border matting step). In the next article we will read through its source code.

This article is an English version of an article originally published in Chinese on aliyun.com.
