The GrabCut algorithm was proposed by researchers at Microsoft Research. It needs very little human interaction to extract a foreground object, and the results are quite good.
In plain terms, the user draws a rectangle around the foreground region, and the algorithm then segments the image iteratively. Sometimes the result is not ideal and foreground and background get mixed up, in which case manual correction is needed.
The principle is roughly this: the user supplies a rectangle. Everything outside the rectangle is definitely background, while the contents of the rectangle are unknown. The computer makes an initial labelling of the image, marking foreground and background pixels. A Gaussian mixture model (GMM) is used to model the foreground and the background; from this data the GMM learns and creates a new pixel distribution, and the unknown pixels are classified according to their relationship to the already-classified pixels. A graph is then built from this pixel distribution, with the pixels as its nodes. Besides the pixel nodes there are two extra nodes, a source and a sink: every foreground pixel is connected to the source, and every background pixel is connected to the sink. The weight of the edge linking a pixel to the source/sink is determined by the probability that the pixel belongs to that class, and the weight between two pixels is determined by their similarity: if two pixels differ greatly in colour, the weight of the edge between them is very small. A min-cut algorithm (one I had not come across before) is then used to cut the graph: it splits the graph into a source side and a sink side at the lowest cost, where the cost is the sum of the weights of the edges that are cut. After the cut is complete, pixels connected to the source are taken as foreground, and pixels connected to the sink as background.
This process is repeated until the classification converges.
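The claim that similar colours give a large edge weight and very different colours give a small one can be sketched with a toy smoothness term. The exponential form below is in the spirit of graph-cut segmentation, but the exact beta/gamma constants are illustrative assumptions, not OpenCV's internals:

```python
import numpy as np

def edge_weight(c1, c2, beta=1.0 / (2 * 255.0 ** 2), gamma=50.0):
    # Weight of the edge between two neighbouring pixels: large when their
    # colours are similar, small when they differ (beta/gamma are made up).
    diff = np.asarray(c1, dtype=float) - np.asarray(c2, dtype=float)
    return gamma * np.exp(-beta * diff.dot(diff))

similar = edge_weight([200, 30, 30], [205, 28, 35])    # nearly the same red
different = edge_weight([200, 30, 30], [10, 240, 240]) # red vs. cyan
```

Cutting the graph prefers to sever cheap edges, so the min cut tends to run along colour boundaries, exactly where `edge_weight` is small.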
Here is a brief introduction to the main function used, cv2.grabCut.
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, iterCount[, mode])
img is the input image. mask is a mask image used to specify which areas are background and which are foreground; its values can be set to:
cv2.GC_BGD defines an obvious background pixel.
cv2.GC_FGD defines an obvious foreground (object) pixel.
cv2.GC_PR_BGD defines a possible background pixel.
cv2.GC_PR_FGD defines a possible foreground pixel. (You can try out the specific effects yourself.)
rect is a rectangle that contains the foreground object; it is only used when mode==cv2.GC_INIT_WITH_RECT. bgdModel and fgdModel are arrays used internally by the algorithm; you need to create two arrays of shape (1, 65) with data type np.float64. iterCount is the number of iterations for the algorithm. mode can be set to cv2.GC_INIT_WITH_RECT or cv2.GC_INIT_WITH_MASK, which determines whether we initialise with the rectangle or with the mask.
In rectangle mode, the algorithm modifies the mask image: in the new mask, every pixel falls into one of four classes, definite background, definite foreground, probable background or probable foreground, marked with the four labels above. Let's look at a concrete example below.
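As a small illustration of the four labels, the sketch below hard-codes the integer values OpenCV assigns to these flags (0 through 3) so it runs without cv2 installed, and shows how a labelled mask splits into foreground and background:

```python
import numpy as np

# The integer values OpenCV uses for the four mask labels.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# A tiny mask as grabCut might leave it after a rectangle-mode run.
mask = np.array([[0, 2, 2],
                 [2, 3, 1],
                 [0, 2, 3]], dtype=np.uint8)

# Definite and probable foreground pixels end up in the final cut-out.
foreground = (mask == GC_FGD) | (mask == GC_PR_FGD)
```

Here `foreground` is True for exactly the pixels labelled 1 or 3, which is the same selection the np.where trick in the example below performs.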
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('c:/users/dell/desktop/01.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (81, 189, 587, 1041)
# 5 iterations are run here
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
# The first argument of np.where is a condition; elements that satisfy it are
# assigned 0, the rest are assigned 1. (If only the first argument is given,
# np.where instead returns the coordinates of the matching elements.)
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
# mask2 is always built this way
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.title('original image')
plt.xticks([])
plt.yticks([])
plt.subplot(1, 2, 2)
# this img expression is also always the same
img = img * mask2[:, :, np.newaxis]
plt.imshow(img)
plt.title('target image')
plt.xticks([])
plt.yticks([])
plt.show()
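The two behaviours of np.where, with three arguments and with one, can be demonstrated on a small array:

```python
import numpy as np

a = np.array([2, 0, 3, 2, 1])

# Three-argument form: condition, value-if-true, value-if-false.
b = np.where((a == 2) | (a == 0), 0, 1)  # -> array([0, 0, 1, 0, 1])

# One-argument form: returns the indices of the elements where the
# condition holds (here, where a equals 2).
idx = np.where(a == 2)[0]                # -> array([0, 3])
```

This is exactly the three-argument form the example uses to turn the four-valued GrabCut mask (0 and 2 meaning background, 1 and 3 meaning foreground) into a binary 0/1 mask.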
The overall idea of the algorithm works well, but there is a small flaw: a hand model in the original image is also mistaken for foreground. The algorithm cannot be blamed entirely, since that object is close to the person's hand and similar in colour, so the misjudgement is understandable. The way to fix this error is to add extra labels: the procedure in the documentation is to paint the target area white with PS and the background area black, then feed the annotated image back in as a mask.
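That correction step can be sketched with numpy alone. The tiny arrays and the 255/0/128 annotation values here are assumptions for illustration; the final cv2.grabCut call is left as a comment since it needs the real image:

```python
import numpy as np

# Simulated GrabCut mask after the rectangle-mode run (labels 0..3).
mask = np.array([[2, 2, 3],
                 [2, 3, 3],
                 [2, 2, 2]], dtype=np.uint8)

# Simulated hand-annotated correction image: 255 where the user painted
# white (sure foreground), 0 where black (sure background), 128 untouched.
newmask = np.array([[128,   0, 255],
                    [128, 128, 128],
                    [  0, 128, 128]], dtype=np.uint8)

mask[newmask == 0] = 0    # black strokes -> definite background (GC_BGD)
mask[newmask == 255] = 1  # white strokes -> definite foreground (GC_FGD)

# A second run would then refine the segmentation from this corrected mask:
# cv2.grabCut(img, mask, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)
```

After the mask-mode run, the same np.where trick as before extracts the final foreground.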