Moving target detection (foreground/background separation) is usually the first topic in learning video analysis. Methods divide into pixel-based and texture-based approaches; texture-based methods mainly refer to the LBP and SILTP descriptors covered in the previous section. Here we focus on pixel-based methods, which are the most common and intuitive.
The pixel-based approach rests on background modeling, that is, building a model of each background pixel: pixels that fit the model are judged as background (and, as new input, further update the background model), while pixels that do not fit the model are judged as foreground (i.e., moving targets). The mainstream foreground detection methods include static difference, the codebook method, Gaussian background modeling, and the ViBe method.
• Static Difference
The first frame of the video, or a manually selected frame in which no moving object is present, serves as the reference frame; simply subtracting the reference frame from each video frame then yields the moving-target region. The advantage of this method is that it is easy to understand; the disadvantage is that it does not model updates to the scene, so when large changes occur (such as illumination changes, camera jitter, or major background changes) the results contain large errors.
The static difference can be expressed as:

sub_{i,j} = 1 if |P_{i,j} - Ref_{i,j}| > T, else 0

where P_{i,j} is the pixel value of the current video frame, Ref_{i,j} is the reference frame, sub_{i,j} is the difference result (1 for foreground, 0 for background), and T is the differential threshold.
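The rule above can be sketched in a few lines of NumPy (grayscale frames assumed; the threshold value 30 is an arbitrary illustrative choice):

```python
import numpy as np

def static_diff(frame, ref, T=30):
    """Foreground mask by differencing against a fixed reference frame.

    frame, ref: 2-D uint8 grayscale images of the same shape.
    T: differential threshold (illustrative default).
    Returns a boolean mask, True where the pixel is classified as moving.
    """
    # Cast to a signed type so the subtraction does not wrap around
    diff = np.abs(frame.astype(np.int16) - ref.astype(np.int16))
    return diff > T
```

Note the cast to `int16` before subtracting: differencing raw `uint8` arrays would silently wrap around for negative results.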
• Codebook Method
The algorithm builds a codebook for each pixel in the image, and each codebook can contain multiple codewords (each corresponding to a threshold range). In the detection phase, the current pixel is matched against its codebook: if the pixel value falls within a codeword's learning threshold, i.e., it deviates little from some previously observed situation, the pixel is considered to fit the background, and the learning and detection thresholds of the corresponding codeword are updated.
If the new pixel value matches no codeword, the cause may be a dynamic background, in which case a new codeword is created for it. By keeping multiple codewords per pixel, the model adapts to complex dynamic backgrounds.
In practice, K frames are sampled at intervals to build the codebook background model through the update algorithm, and codewords unused for longer than a set period are deleted.
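The match/extend/prune cycle for a single pixel can be sketched as below. This is a simplified grayscale version for illustration only: the original method also models color distortion and brightness bounds, and all parameter names and defaults here (`eps`, `max_idle`) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Codeword:
    low: float       # learned lower bound of matched intensities
    high: float      # learned upper bound
    last_seen: int   # index of the last frame that matched this codeword

def update_codebook(codebook, value, t, eps=10.0):
    """Match `value` against one pixel's codebook at frame index `t`.

    If a codeword's range (widened by the learning threshold eps) covers the
    value, extend that codeword and report background; otherwise create a new
    codeword for the (possibly dynamic-background) value and report foreground.
    """
    for cw in codebook:
        if cw.low - eps <= value <= cw.high + eps:
            cw.low = min(cw.low, value)
            cw.high = max(cw.high, value)
            cw.last_seen = t
            return True              # value explained by the background model
    codebook.append(Codeword(value, value, t))
    return False

def prune(codebook, t, max_idle=50):
    """Delete codewords that have gone unused for more than max_idle frames."""
    codebook[:] = [cw for cw in codebook if t - cw.last_seen <= max_idle]
```

A full implementation would keep one such codebook per pixel and run a separate learning phase before detection.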
• Gaussian background modeling (GMM)
The Gaussian background model is a classical adaptive background modeling method proposed by Stauffer et al. It assumes that each pixel follows a normal distribution over time: pixels within a certain threshold range of the distribution are judged as background, and pixels that do not fit the distribution are foreground.
The Gaussian background model adapts to scene changes by updating the model, achieving a background-learning effect.
Algorithm steps:
STEP 1: Initialize the background model: the initial mean μ, the initial standard deviation σ, and the initial differential threshold T (default value 20).
STEP 2: Classify each pixel I_{x,y} as foreground or background: it is background if |I_{x,y} - μ| < λσ, where λ is the threshold parameter; the basic criterion is being within a certain range of the mean.
STEP 3: Update the parameters so the background keeps learning: μ ← (1 - α)μ + αI_{x,y} and σ² ← (1 - α)σ² + α(I_{x,y} - μ)², where α is the learning-rate parameter; the larger its value, the faster the background updates.
STEP 4: Repeat steps 2 and 3 for each new frame until the algorithm stops.
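The steps above can be sketched per-image with NumPy as follows. This is an illustrative single-Gaussian model; the sigma multiplier `lam = 2.5` and the choice to update only background pixels are common conventions, not prescribed by the text.

```python
import numpy as np

class SingleGaussianBG:
    """Running single-Gaussian background model for grayscale frames."""

    def __init__(self, first_frame, init_sigma=20.0, alpha=0.05, lam=2.5):
        # STEP 1: initial mean, standard deviation, and threshold parameters
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, init_sigma ** 2)
        self.alpha = alpha   # learning rate: larger -> faster background update
        self.lam = lam       # threshold multiplier on sigma

    def apply(self, frame):
        f = frame.astype(np.float64)
        # STEP 2: background if within lam * sigma of the mean
        fg = np.abs(f - self.mean) > self.lam * np.sqrt(self.var)
        # STEP 3: update mean and variance only where the pixel is background
        bg = ~fg
        self.mean[bg] += self.alpha * (f - self.mean)[bg]
        self.var[bg] += self.alpha * ((f - self.mean) ** 2 - self.var)[bg]
        return fg   # STEP 4: the caller repeats this for each new frame
```

Calling `apply` on successive frames returns a boolean foreground mask while the model slowly absorbs gradual scene changes.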
Mixture-of-Gaussians modeling, GMM (Gaussian Mixture Model), extends single-Gaussian background modeling and is currently the most widely used method. GMM describes the background with multiple distributions, which lets it handle background switching (such as swaying leaves): a pixel that fits any one of the distributions (leaf or non-leaf) is a background pixel.
GMM adapts well to complex backgrounds and its performance is close to practical requirements; for a concrete implementation, refer to the OpenCV source code.
As one of the most commonly used background modeling methods, GMM has many improved versions, such as using texture complexity to update the difference threshold, or dynamically adjusting the learning rate according to the intensity of pixel changes; these are not expanded on further here.
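The per-pixel mixture idea can be sketched for a single pixel's time series as below. This is a heavily simplified grayscale illustration, not the OpenCV implementation: every default (`k`, `alpha`, `match_sigma`, `bg_ratio`, initial variance) is an assumed value, and the update rule omits the per-Gaussian weighting used in the Stauffer-Grimson formulation.

```python
import numpy as np

class PixelGMM:
    """Minimal Gaussian-mixture background model for one grayscale pixel."""

    def __init__(self, k=3, alpha=0.05, match_sigma=2.5, bg_ratio=0.7):
        self.alpha = alpha                  # learning rate
        self.match_sigma = match_sigma      # match if within this many sigmas
        self.bg_ratio = bg_ratio            # weight mass treated as background
        self.weights = np.full(k, 1.0 / k)
        self.means = np.linspace(0.0, 255.0, k)
        self.vars = np.full(k, 900.0)       # initial variance (sigma = 30)

    def update(self, x):
        """Feed one pixel value; return True if it is a background pixel."""
        matched = np.abs(x - self.means) < self.match_sigma * np.sqrt(self.vars)
        if matched.any():
            m = int(np.argmax(matched))     # first matching component
            self.means[m] += self.alpha * (x - self.means[m])
            self.vars[m] += self.alpha * ((x - self.means[m]) ** 2 - self.vars[m])
            self.weights = (1 - self.alpha) * self.weights
            self.weights[m] += self.alpha
        else:
            # No component fits: replace the least likely one, centred on x
            m = int(np.argmin(self.weights))
            self.means[m], self.vars[m] = x, 900.0
            self.weights[m] = 0.05
        self.weights /= self.weights.sum()
        # Background components = highest-weight ones covering bg_ratio of mass
        order = np.argsort(self.weights)[::-1]
        cum = np.cumsum(self.weights[order])
        bg = order[: int(np.searchsorted(cum, self.bg_ratio)) + 1]
        return bool(matched.any() and int(np.argmax(matched)) in bg)
```

A full detector keeps one such model per pixel (vectorized in practice); this sketch only shows why a sporadically appearing value, like a swaying leaf, can still be absorbed as background once its component gains enough weight.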
• ViBe Method
The main feature of the ViBe algorithm is its random background-update strategy, which sets it clearly apart from GMM.
Pixel changes carry uncertainty that is difficult to characterize with a fixed model; ViBe is built on the assumption that, when the model of pixel change is unknown, a stochastic model is better suited to simulating that uncertainty.
Initialization of the background model
The ViBe algorithm describes the background model of each pixel by a set of K samples (typically K = 20):

S_{i,j} = {s_1, s_2, ..., s_K}

The algorithm initializes the background model from a single frame: each sample is obtained by randomly sampling a pixel P_{r,c} from the 8-neighborhood around pixel (i, j), which is why it is also called a sampled background model. This is a significant advantage of the ViBe algorithm: it greatly shortens the time needed to set up the background, and it lets the model relearn quickly when the background changes substantially.
Foreground detection process
For each new pixel p_{i,j}, compute the distance (pixel-value difference) to every sample in the set S_{i,j}. A sample counts as a match when the distance is below a given threshold dist_{i,j} (typically set to 20). When the number of matching samples is at least #min (value range [2, K/2], generally set to 2), the pixel is considered background; otherwise it is judged as foreground.
Update strategy of the background model
Only pixels classified as background take part in the model update. The update strategy is simple: randomly select one sample from the corresponding set S_{i,j} and replace it with the value of the current pixel p_{i,j}.
Let the learning rate be LR (the inverse of the update probability, typically between 2 and 64; the smaller the value, the faster the update). The update policy can then be expressed as:
1) When a pixel p_{i,j} is judged as background, with probability 1/LR its corresponding sample set S_{i,j} is updated: when the condition is met, one of the sample values is randomly chosen and replaced;
2) At the same time, with probability 1/LR the model sample set of a neighboring point is updated: when the probability condition is met, randomly select one of the 8 neighborhood points P_{r,c} (r, c are the row and column indices), then randomly select one sample from that neighbor's sample set S_{r,c} and replace it with the value of pixel p_{i,j}.
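The initialization, detection, and update stages can be combined into one compact NumPy sketch. This is an illustrative reimplementation, not the authors' reference code: all parameter names are assumptions, frames are single-channel, and the neighbor-propagation rule (2) is omitted for brevity.

```python
import numpy as np

def vibe_demo(frames, K=20, radius=20, min_matches=2, lr=16, seed=0):
    """Minimal grayscale ViBe sketch over a list of 2-D uint8 frames.

    Returns one boolean mask per frame (True = foreground).
    """
    rng = np.random.default_rng(seed)
    h, w = frames[0].shape
    first = frames[0].astype(np.int32)

    # Initialization: fill each of the K samples from a random 8-neighborhood
    # pixel of the single first frame (edge-padded at the borders).
    samples = np.empty((K, h, w), dtype=np.int32)
    pad = np.pad(first, 1, mode="edge")
    for k in range(K):
        dy = rng.integers(-1, 2, size=(h, w))
        dx = rng.integers(-1, 2, size=(h, w))
        ys = np.arange(h)[:, None] + 1 + dy
        xs = np.arange(w)[None, :] + 1 + dx
        samples[k] = pad[ys, xs]

    masks = []
    for frame in frames:
        f = frame.astype(np.int32)
        # Detection: count samples within `radius` of the new pixel value;
        # background needs at least `min_matches` close samples.
        close = np.abs(samples - f[None]) < radius
        bg = close.sum(axis=0) >= min_matches
        masks.append(~bg)
        # Update: each background pixel replaces one randomly chosen sample
        # of its own set with probability 1/lr (rule 1 above).
        update = bg & (rng.random((h, w)) < 1.0 / lr)
        which = rng.integers(0, K, size=(h, w))
        ys, xs = np.nonzero(update)
        samples[which[ys, xs], ys, xs] = f[ys, xs]
    return masks
```

Because the model is seeded from a single frame, the very first mask is already meaningful; a real implementation would add the neighbor-update rule so that ghosts left by objects present at initialization fade away.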
The principle of the ViBe algorithm is relatively easy to understand and its results are quite good; relevant code can be found on the Internet for testing. However, since the original authors have applied for patents on it, other algorithms are recommended for commercial use. Once foreground/background detection is complete, we want to obtain the complete target region or contour; this step is done by extracting connected regions from the foreground mask, and the typical method is called MotionBlob.
There are many ways to further process foreground pixels, such as ghost removal, or combining detection with tracking to suppress false detections; these are not covered in this section, and interested readers can consult the literature on their own.