Image Segmentation (3): From Graph Cut to GrabCut


zouxy09@qq.com

http://blog.csdn.net/zouxy09

 

In the previous article we learned about graph cut; the GrabCut discussed here is an iterative version of graph cut. The GrabCut algorithm in OpenCV is based on the paper "GrabCut - Interactive Foreground Extraction Using Iterated Graph Cuts". The algorithm uses both the texture (color) information and the boundary (contrast) information in the image, and obtains good segmentation results with only a small amount of user interaction. Let us look at the details of this paper; the OpenCV source code for GrabCut is covered in the next post. My time with this material has been limited, so if there are mistakes I hope readers will point them out. Thank you.

GrabCut is a project from Microsoft Research. Its main function is image segmentation, i.e. cutting a foreground object out of an image. In my understanding, its selling points are:

(1) You only need to draw a box around the target to frame it, and a good segmentation can already be obtained;

(2) If additional user interaction is added (the user marks some pixels as definitely target or background), the result becomes even better;


(3) Its border matting technique makes the boundary of the extracted target look more natural.

Of course, it is not perfect. First, no segmentation algorithm works in every situation, and GrabCut is no exception: if the background is complex, or the background and the target are similar, the segmentation is not very good. Second, it is somewhat slow, although many improvements have been proposed to speed it up.

OK. Having seen what it can do, let us think about how these results are achieved. What is the difference between GrabCut and graph cut?

(1) The target and background models in graph cut are grayscale histograms; GrabCut replaces them with Gaussian mixture models (GMMs) over the three RGB channels.

(2) In graph cut the energy minimization (the segmentation) is done in one shot; GrabCut replaces it with an iterative interactive process that alternates between estimating the segmentation and learning the model parameters.

(3) Graph cut requires the user to specify seed points for both the target and the background, whereas GrabCut only needs a set of pixels known to be background. That is, you only need to draw a box around the target; every pixel outside the box is treated as background, which is already enough to build the GMMs and obtain a good segmentation. In other words, GrabCut allows incomplete labelling.

 

1. Color Model

We work in the RGB color space and model both the target and the background with a full-covariance Gaussian mixture model (GMM) of K components (typically K = 5). This introduces an extra vector k = {k_1, ..., k_n, ..., k_N}, where k_n ∈ {1, ..., K} is the index of the Gaussian component assigned to the n-th pixel. Each pixel is assigned either to one component of the target GMM or to one component of the background GMM.

Therefore, the Gibbs energy of the whole image is (Equation (7) in the paper):
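For reference, from my reading of the paper, Equation (7) and the data term it contains have the following form:

\[
E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)
\]
\[
U(\alpha, k, \theta, z) = \sum_n D(\alpha_n, k_n, \theta, z_n), \qquad
D(\alpha_n, k_n, \theta, z_n) = -\log p(z_n \mid \alpha_n, k_n, \theta) - \log \pi(\alpha_n, k_n)
\]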

Here U is the region (data) term. As in the previous article, it measures the penalty for classifying a pixel as target or background, i.e. the negative logarithm of the probability that the pixel belongs to the target or to the background. We know that a Gaussian density model has the following form:

Therefore, after taking the negative logarithm, the data term takes the form of Equation (9). A GMM has three kinds of parameters θ: the weight π of each Gaussian component, the mean vector μ of each component (a 3-element vector, because there are three RGB channels), and the covariance matrix Σ of each component (a 3×3 matrix, again because of the three RGB channels); see Equation (10). In other words, the parameters of the GMM describing the target and of the GMM describing the background both have to be learned. Once these parameters are determined, we can take the RGB value of any pixel, substitute it into the target GMM and into the background GMM, and obtain the probability that the pixel belongs to the target or to the background. This determines the region term of the Gibbs energy, i.e. the t-link weights of the graph. How do we obtain the n-link weights, i.e. the boundary term V?
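To make this concrete, here is a minimal numpy sketch of the region term, assuming a GMM is represented by its weights pi, means mu and covariances sigma (the function name data_term and this representation are my own choices for illustration, not OpenCV's internal API):

```python
import numpy as np

def data_term(z, pi, mu, sigma):
    """Region term D for one RGB pixel z under a K-component full-covariance
    GMM (weights pi, means mu, covariances sigma): the negative log of the
    best weighted Gaussian density, dropping constant terms (cf. Eq. 9)."""
    K = len(pi)
    costs = np.empty(K)
    for k in range(K):
        d = z - mu[k]
        costs[k] = (-np.log(pi[k])
                    + 0.5 * np.log(np.linalg.det(sigma[k]))
                    + 0.5 * d @ np.linalg.inv(sigma[k]) @ d)
    # the component with the smallest cost is the most likely one for this pixel
    return costs.min()

# The two t-link weights of a pixel are then data_term(z, *background_gmm) and
# data_term(z, *target_gmm): the penalties for labelling it background or target.
```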

The boundary term is similar to that of graph cut: it reflects the discontinuity penalty between neighbouring pixels m and n. If two neighbouring pixels differ very little, they are very likely to belong to the same target or the same background; if they differ a lot, the two pixels probably lie on the boundary between target and background, and cutting between them is more plausible. So the larger the difference between two neighbouring pixels, the smaller the energy. In RGB space we use the Euclidean distance (2-norm) to measure the dissimilarity of two pixels. The parameter β is determined by the contrast of the image. Intuitively, in a low-contrast image, the difference |z_m − z_n| between pixels m and n that really are different (for example one on the target and one on the background) is still relatively small, so we multiply it by a relatively large β to amplify it; in a high-contrast image, the difference |z_m − z_n| between pixels m and n that belong to the same object may already be relatively large, so we multiply it by a relatively small β to attenuate it. In this way V works properly for both high- and low-contrast images. The constant γ is 50 (the best value the authors obtained by training on 15 images). With this, the n-link weights can be computed from Equation (11); we now have the whole graph and can run the cut.
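The following numpy sketch shows one way to estimate β from the image and compute the n-link weights of Equation (11) for horizontal neighbours (the function names are mine; a full implementation would also handle vertical and diagonal neighbours):

```python
import numpy as np

def estimate_beta(img):
    """beta = 1 / (2 * mean ||z_m - z_n||^2) over neighbouring pixel pairs,
    so the exponent in V adapts to the image contrast (only horizontal and
    vertical neighbours are used in this sketch)."""
    img = img.astype(np.float64)
    dx = img[:, 1:] - img[:, :-1]          # horizontal colour differences
    dy = img[1:, :] - img[:-1, :]          # vertical colour differences
    sq_sum = (dx ** 2).sum() + (dy ** 2).sum()
    n_pairs = dx.shape[0] * dx.shape[1] + dy.shape[0] * dy.shape[1]
    return n_pairs / (2.0 * sq_sum) if sq_sum > 0 else 0.0

def horizontal_nlink_weights(img, gamma=50.0):
    """Boundary weights V between each pixel and its right-hand neighbour:
    gamma * exp(-beta * ||z_m - z_n||^2), as in Eq. (11)."""
    img = img.astype(np.float64)
    beta = estimate_beta(img)
    diff2 = ((img[:, 1:] - img[:, :-1]) ** 2).sum(axis=2)
    return gamma * np.exp(-beta * diff2)
```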

 

2. Iterative Energy Minimization Segmentation Algorithm

Graph cut minimizes the energy in a single pass, while GrabCut minimizes it iteratively. Each iteration makes the GMM parameters that model the target and the background better, and therefore makes the segmentation better. We describe the procedure as an algorithm:

2.1 Initialization
(1) The user selects the target with a box, which gives an initial trimap T: all pixels outside the box form the background set T_B, and the pixels inside the box, T_U, are all marked "possibly target".

(2) For each pixel n in T_B, initialize its label α_n = 0, i.e. background. For each pixel n in T_U, initialize its label α_n = 1, i.e. "possibly target".

(3) After the previous two steps we have a set of pixels assumed to belong to the target (α_n = 1), while the remaining pixels belong to the background (α_n = 0). We can now use these pixels to estimate the GMMs of the target and of the background. Using the k-means algorithm we cluster the target pixels into K classes and the background pixels into K classes, corresponding to the K Gaussian components of each GMM. Each Gaussian component then has its own set of pixel samples: its mean and covariance are estimated from their RGB values, and the weight of the component is the ratio of the number of pixels assigned to it to the total number of pixels of that GMM (see the sketch below).
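As an illustration, here is a minimal sketch of this initialization using scikit-learn's KMeans (using scikit-learn and the name init_gmm are my own choices; OpenCV uses its own internal k-means here):

```python
import numpy as np
from sklearn.cluster import KMeans

def init_gmm(pixels, K=5):
    """Cluster an (N, 3) array of RGB samples into K groups with k-means and
    estimate one full-covariance Gaussian per cluster, as in step (3)."""
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(pixels)
    pi = np.zeros(K)                    # component weights
    mu = np.zeros((K, 3))               # component means
    sigma = np.zeros((K, 3, 3))         # component covariances
    for k in range(K):
        samples = pixels[labels == k]
        pi[k] = len(samples) / len(pixels)
        mu[k] = samples.mean(axis=0)
        sigma[k] = np.cov(samples.T) + 1e-8 * np.eye(3)   # small regularizer
    return pi, mu, sigma

# target_gmm     = init_gmm(img[in_box].reshape(-1, 3))      # pixels inside the box
# background_gmm = init_gmm(img[~in_box].reshape(-1, 3))     # pixels outside the box
```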

2.2 Iterative Minimization

(1) Assign a GMM component to each pixel. (For example, if pixel n is a target pixel, substitute the RGB value of pixel n into each Gaussian component of the target GMM; the component with the highest probability is the one most likely to have generated z_n, and becomes k_n for that pixel.)

(2) For the given image data z, learn and optimize the GMM parameters. (In step (1) we decided which Gaussian component each pixel belongs to, so each Gaussian component now has its own set of pixel samples; its mean and covariance are estimated from the RGB values of these samples, and the weight of the component is the ratio of the number of pixels assigned to it to the total number of pixels.)

(3) Segmentation estimation: build the graph from the energy terms analysed in Section 1, obtain the t-link and n-link weights, and run the max-flow/min-cut algorithm to segment.

(4) Repeat steps (1) to (3) until convergence. After the segmentation in (3), the assignment of each pixel to the target GMM or the background GMM may change, so each k_n changes, and therefore the GMMs change as well; in this way every iteration alternately improves the GMM models and the segmentation result. Moreover, because each of steps (1) to (3) decreases the total energy, the iteration is guaranteed to converge. (A code sketch of this loop is given after step (5) below.)

(5) Use border matting to smooth the segmentation boundary.
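Putting the steps together, the main loop looks roughly like the sketch below. Note that assign_components, learn_gmm_params and min_cut_segmentation are hypothetical placeholder functions standing for steps (1), (2) and (3); they are not real library calls, and a working implementation would fill them in with the pieces described above:

```python
def grabcut_iterations(img, alpha, target_gmm, background_gmm, n_iters=5):
    """Iterative minimization: alternate (1) component assignment, (2) GMM
    re-estimation and (3) min-cut segmentation until the labels stop changing.
    The three helper functions are placeholders for the steps described above."""
    for _ in range(n_iters):
        # (1) assign each pixel to its most likely Gaussian component
        k_t = assign_components(img, alpha == 1, target_gmm)
        k_b = assign_components(img, alpha == 0, background_gmm)
        # (2) re-learn the GMM parameters from the current assignment
        target_gmm = learn_gmm_params(img, alpha == 1, k_t)
        background_gmm = learn_gmm_params(img, alpha == 0, k_b)
        # (3) build the graph (t-links from the GMMs, n-links from Eq. 11)
        #     and run max-flow/min-cut to get a new labelling
        new_alpha = min_cut_segmentation(img, target_gmm, background_gmm)
        if (new_alpha == alpha).all():          # (4) converged: labels unchanged
            break
        alpha = new_alpha
    return alpha, target_gmm, background_gmm
```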

2.3 User Editing (Interaction)

(1) Edit: the user manually marks some pixels as definite target or background pixels, and then step (3) of Section 2.2 is run once more;

(2) Refit: repeat the whole iterative algorithm. (Optional; in practice this corresponds to the undo/redo facility of the program or image-editing software.)

 

In general, the key point is that the probability density models (GMMs) of the target and the background and the image segmentation are optimized alternately and iteratively. For more details, please refer to the original paper:

"Grabcut"-interactive foreground Extraction Using Iterated Graph Cuts"

Http://research.microsoft.com/en-us/um/people/ablake/papers/ablake/siggraph04.pdf

 

OpenCV implements this algorithm (without the border matting step). The next post will walk through its source code.
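Before looking at the source, here is a typical way to call the OpenCV implementation from Python (the file name 'input.jpg' and the rectangle coordinates are placeholders to replace with your own image and box):

```python
import cv2
import numpy as np

img = cv2.imread('input.jpg')                     # placeholder file name
mask = np.zeros(img.shape[:2], np.uint8)          # per-pixel labels (cv2.GC_* values)
bgdModel = np.zeros((1, 65), np.float64)          # internal background GMM state
fgdModel = np.zeros((1, 65), np.float64)          # internal foreground GMM state

rect = (50, 50, 300, 400)                         # user box (x, y, w, h), placeholder
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)

# Optional user editing (Section 2.3): mark extra pixels in `mask` as
# cv2.GC_FGD or cv2.GC_BGD, then continue the iterations with the mask:
# cv2.grabCut(img, mask, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)

# Keep the pixels labelled definite or probable foreground
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype('uint8')
result = img * fg[:, :, np.newaxis]
cv2.imwrite('grabcut_result.jpg', result)
```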
