**Bayesian Matting**

**Original address:** http://blog.csdn.net/hjimce/article/details/47667947

**Author:** HJIMCE

**I. Related theory**

Many people confuse image segmentation with image matting. In my view the two are completely different problems: matting is the more complex one, since it must solve for an alpha value at every pixel, and its accuracy is accordingly much higher than segmentation's — but it is also far slower, which makes it hard to use in engineering practice. The "cutout" features in commercial software are usually implemented with segmentation algorithms instead, for example improved variants of the GrabCut algorithm.

The English term for this problem is matting. An observed image is modeled as an alpha blend of a foreground color over a background color:

C = αF + (1 − α)B    (1)

where F is the foreground color, B is the background color, and C is the final composite. Matting therefore means: given C, solve for F, B, and α. Strictly speaking this is an ill-posed problem — there are more unknowns than equations — which is what makes matting hard, and why researchers have devised all sorts of ways to constrain and solve it. Without further ado:

The paper I want to discuss here is from CVPR 2001: "A Bayesian Approach to Digital Matting".

The paper's project page is: http://grail.cs.washington.edu/projects/digital-matting/image-matting/

First, a look at the paper's matting results:

The results look impressive — even individual strands of hair can be matted out. In practice, however, the algorithm's computational efficiency is very low; once you understand the algorithm below, you will see just how slow it is.

**II. Bayesian matting**

Bayesian matting is interactive: for part of the image C, the F, B, and α values are known — these are the regions the user marks with the mouse as definite foreground and definite background. For the remaining pixels, F, B, and α are unknown, and the matting problem reduces to: solve for F, B, and α at these unknown pixels.

I will not repeat Bayesian theory here; it suffices to know that the goal is to find, given the known C value, the F, B, and α values that maximize the posterior probability:

Taking the logarithm of the expression above turns the product into a sum, namely:

Also, because P(C) is a constant term, it is omitted. Anyone familiar with the derivation of the naïve Bayes algorithm in machine learning will recognize this step, so I will not explain it further.

So we are looking for the optimal parameters F, B, and α that maximize the probability in the equation above.
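The equations referenced in the last few paragraphs were embedded as images in the original post and are lost. Reconstructed from the cited paper, the MAP formulation and its log form are:

```latex
\arg\max_{F,B,\alpha} P(F,B,\alpha \mid C)
  = \arg\max_{F,B,\alpha} \frac{P(C \mid F,B,\alpha)\,P(F)\,P(B)\,P(\alpha)}{P(C)}
  = \arg\max_{F,B,\alpha} \; L(C \mid F,B,\alpha) + L(F) + L(B) + L(\alpha)
```

where L(·) denotes a log-likelihood and the constant P(C) has been dropped.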

Recall the GrabCut algorithm: it models the known foreground and the known background each with a Gaussian mixture model of 5 Gaussian components. In principle, Bayesian matting does something similar.

However, to preserve spatial coherence, each unknown pixel is handled using the N neighbouring pixels around it, which are clustered. For simplicity, the paper first assumes that within this neighbourhood the foreground forms only one cluster and the background forms only one cluster — remember, this applies only to the N-neighbourhood of a single pixel. For N the paper chooses 200: for each unknown pixel, the nearest 200 relevant pixels are used for the Gaussian modeling. That is, over these 200 neighbourhood points, the foreground cluster and the background cluster are each modeled as a single Gaussian, so that we can evaluate the probability that a color belongs to each model.

**1. Gaussian model**

The following solves equation (4) under the assumption that the foreground and background of the pixels in the N-neighbourhood each form a single Gaussian:

**(1) The first term:**

From equation (1), C = αF + (1 − α)B, the parameters F, B, and α we solve for need to satisfy the following:

In other words, the composite value produced by the estimated parameters, αF + (1 − α)B, deviates from the true observed value C according to a zero-mean normal distribution, where σ_C is its standard deviation. This gives the first term of equation (4):

L(C | F, B, α) = −‖C − αF − (1 − α)B‖² / σ_C²
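A minimal sketch of this data term, assuming RGB colors as 3-vectors; the function and parameter names here are illustrative, not taken from the paper's code:

```cpp
#include <cassert>
#include <cmath>

// Data term of equation (4): how well alpha*F + (1-alpha)*B explains the
// observed color C. Returns -||C - alpha*F - (1-alpha)*B||^2 / sigmaC^2,
// which is 0 (its maximum) when the composite matches C exactly.
double dataLogLikelihood(const double C[3], const double F[3],
                         const double B[3], double alpha, double sigmaC)
{
    double sq = 0.0;
    for (int k = 0; k < 3; ++k) {
        double r = C[k] - alpha * F[k] - (1.0 - alpha) * B[k];
        sq += r * r;                    // squared residual per channel
    }
    return -sq / (sigmaC * sigmaC);
}
```

A perfectly explained pixel scores 0; any mismatch between the composite and C drives the score negative, scaled by the camera-noise parameter σ_C.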

**(2) The second term, L(F)**

This term is the probability that the current pixel's foreground color is plausible. We can estimate the probability that a color value belongs to the foreground from the known pixels the user has marked.

Spatial continuity: within each unknown pixel's neighbourhood (N = 200 by default), each pixel i is given the weight

w_i = α_i² g_i

where i indexes the neighbouring pixels of the current pixel, α_i is the alpha value at point i, and g_i is a spatial Gaussian falloff that gives nearby pixels more influence.

For the foreground cluster, we can then compute its weighted mean color F̄ and the corresponding weighted covariance matrix Σ_F:

The Gaussian log-likelihood for the foreground is then:

L(F) = −(F − F̄)ᵀ Σ_F⁻¹ (F − F̄) / 2
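The weighted mean step can be sketched as follows; the struct and function names are illustrative, not taken from the paper's code:

```cpp
#include <cassert>
#include <vector>

// One neighbour of the unknown pixel: its color, its alpha, and its spatial
// Gaussian falloff g (larger for closer pixels).
struct Neighbour { double color[3]; double alpha; double g; };

// Weighted mean color of the foreground cluster around one unknown pixel,
// with weights w_i = alpha_i^2 * g_i as in the text above.
void weightedForegroundMean(const std::vector<Neighbour>& nbrs, double mean[3])
{
    double wsum = 0.0;
    mean[0] = mean[1] = mean[2] = 0.0;
    for (const Neighbour& p : nbrs) {
        double w = p.alpha * p.alpha * p.g;   // w_i = alpha_i^2 * g_i
        wsum += w;
        for (int k = 0; k < 3; ++k)
            mean[k] += w * p.color[k];
    }
    if (wsum > 0.0)
        for (int k = 0; k < 3; ++k)
            mean[k] /= wsum;                  // normalize by total weight
}
```

Note how the α² factor makes fully opaque neighbours dominate the foreground estimate, while pixels with α near 0 contribute almost nothing; the covariance Σ_F is accumulated with the same weights.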

**(3) The third term, L(B)**

The background is handled with exactly the same probability calculation; the only difference is the weight:

w_i = (1 − α_i)² g_i

**(4) The fourth term, L(α)**

For the transparency α, the paper first assumes it is a constant, so that the last term of equation (4) can be omitted. We can then take partial derivatives of the log-probability with respect to F and B, set them equal to 0, and obtain:

where I is the 3×3 identity matrix. Solving this 6×6 linear system gives the F and B values of each unknown pixel.

Then, treating the obtained F and B as constants, α follows in closed form:

α = (C − B) · (F − B) / ‖F − B‖²    (10)

So, to maximize the probability, we alternate: fix the α value and solve equation (9) to get F and B; then fix F and B and solve equation (10) to get α; and loop this iteration until convergence. To initialize α, we take the mean of the alpha values of the pixel's neighbourhood points.
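The α half of the alternation is simple enough to write out directly. This sketch implements the closed-form update of equation (10) as given above (geometrically, projecting C onto the line from B to F); the function name is illustrative:

```cpp
#include <cassert>
#include <cmath>

// Closed-form alpha update with F and B held fixed (equation (10)):
// alpha = (C - B) . (F - B) / ||F - B||^2, clamped to [0, 1].
double solveAlphaClosedForm(const double C[3], const double F[3],
                            const double B[3])
{
    double num = 0.0, den = 0.0;
    for (int k = 0; k < 3; ++k) {
        double fb = F[k] - B[k];
        num += (C[k] - B[k]) * fb;   // (C - B) . (F - B)
        den += fb * fb;              // ||F - B||^2
    }
    if (den <= 0.0) return 0.0;      // degenerate case: F == B
    double a = num / den;
    return a < 0.0 ? 0.0 : (a > 1.0 ? 1.0 : a);
}
```

The clamp keeps α in [0, 1] even when noise pushes C slightly outside the F–B segment; the degenerate F = B case is where the matting equation genuinely cannot determine α.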

**2. Multi-cluster model over the N-neighbourhood of pixel P**

So far we have assumed that, after clustering, the foreground and the background each form only one cluster — and, as you might expect, the results are very poor. GrabCut assumes the foreground and background each have five clusters and models them jointly with a Gaussian mixture. This paper, however, does not use a mixture model: it forms pairwise combinations between the clustered foreground classes and background classes, solves the problem for each pair, and then keeps the pair that maximizes the probability above. The paper's text is as follows:

Suppose the N neighbourhood points of pixel P are clustered into 5 foreground and 5 background classes; then there are 5 × 5 = 25 combinations, and from those 25 results we choose the one that maximizes the probability. Each combination requires a full solve, so the computational cost is very large.

Take a look at the overall process:

```cpp
// For each unknown pixel, try every (foreground cluster, background cluster)
// pair and keep the pair with the highest likelihood.
for (p = 0; p < toSolveList.size(); ++p)
{
    r = toSolveList[p].y;
    c = toSolveList[p].x;
    GetGMMModel(r, c, fg_weight, fg_mean, inv_fg_cov,
                bg_weight, bg_mean, inv_bg_cov);
    maxL = (float)-INT_MAX;
    for (i = 0; i < BAYESIAN_MAX_CLUS; i++)
        for (j = 0; j < BAYESIAN_MAX_CLUS; j++)
        {
            if (!iteration)
                InitializeAlpha(r, c, unsolvedMask);
            else
                InitializeAlpha(r, c, solveAgainMask);
            // alternate between solving (F, B) and alpha
            for (iter = 0; iter < 3; ++iter)
            {
                SolveBF(r, c, fg_mean[i], inv_fg_cov[i],
                        bg_mean[j], inv_bg_cov[j]);
                SolveAlpha(r, c);
            }
            L = ComputeLikelihood(r, c, fg_mean[i], inv_fg_cov[i],
                                  bg_mean[j], inv_bg_cov[j]);
            if (L > maxL)   // keep the best-scoring cluster pair
            {
                maxL = L;
                fgClus = i;
                bgClus = j;
            }
        }
}
```

There is plenty of source code for this algorithm available online, so I will not walk through an implementation here. The algorithm is very slow, so do not put too much faith in the few result images on the authors' project page.

Author: hjimce — Time: 2015.8.14 — Contact QQ: 1393852684 — Address: http://blog.csdn.net/hjimce — Please retain this notice when reprinting.

References:

1. Y.-Y. Chuang, B. Curless, D. H. Salesin, R. Szeliski, "A Bayesian Approach to Digital Matting", CVPR 2001.