From: Moving target detection under a moving background
Introduction to the various target detection methods (lazy readers can skip straight to the main text)
Target detection is an old topic, and traces of it appear in many algorithms. Target detection has two jobs: determine whether there are any targets in the current image, and if there are, locate them. According to whether prior knowledge of the target is available and whether the background moves, target detection methods fall into two broad categories:
First, the prior knowledge of the target is known. In this case there are two families of detection methods. The first trains a set of weak classifiers from the target's prior knowledge and lets them vote together to detect the target; boosting and random forests follow this idea, and the familiar AdaBoost face detector works the same way (I will discuss this family in later articles). The second family uses the prior knowledge to find the best decision boundary between target and non-target, as SVM does. Each family has its own strengths, and both can perform well.
Second, no prior knowledge of the target is available. In this case we do not know what the target is beforehand, so "target" has to be defined differently. One approach is to detect salient objects in the scene, for example by expressing each pixel's saliency probability with some features and then extracting the salient object. Another approach is to detect the moving objects in the scene, which is the focus of the rest of this article.
When detecting a moving target against a stationary background the problem is easy, so I will skip it. When the background itself is moving, there are two ways to handle it. The first is background compensation: estimate the background motion as a translation, zoom, affine transformation, and so on, compensate the background, and then take the frame difference. This method has two problems: computing the affine transformation is expensive, and even after compensation the motion vectors of the distant background and the nearby foreground still differ because of parallax, leaving a relative error. So this method is rarely practical. The second method is the legendary optical flow, which brings us to the main text below.
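For concreteness, here is a minimal sketch of the background-compensation approach (the one this article argues against) in Python with OpenCV. The helper name, point count, and RANSAC threshold are my own illustrative assumptions, not from the original article:

```python
import cv2

def background_compensated_diff(prev_gray, curr_gray):
    """Estimate the global (background) motion as an affine transform,
    warp the previous frame onto the current one, and difference the
    aligned frames; what remains are candidate moving objects (plus
    the parallax residue discussed above)."""
    # Track sparse feature points between the two frames.
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=8)
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    ok = status.ravel() == 1
    # Robustly fit the affine model; RANSAC rejects foreground outliers.
    M, _ = cv2.estimateAffine2D(pts_prev[ok], pts_curr[ok],
                                method=cv2.RANSAC, ransacReprojThreshold=3.0)
    # Warp the previous frame into the current frame's coordinates
    # and take the difference.
    h, w = curr_gray.shape
    warped = cv2.warpAffine(prev_gray, M, (w, h))
    return cv2.absdiff(curr_gray, warped)
```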
Main text
The rough pipeline of the optical flow method is as follows (a code sketch follows the list):
1. Select a large number of optical flow points in one frame (the selection method can vary: FAST corners, random selection, etc.).
2. Compute the motion vector of every optical flow point (common methods include Lucas-Kanade (LK) and Horn-Schunck (HS)).
3. Detect moving targets based on these vectors and some other features.
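Here is a minimal sketch of steps 1 and 2 in Python with OpenCV, using FAST corners for point selection and pyramidal Lucas-Kanade for the motion vectors; the function name and parameter values are illustrative assumptions:

```python
import cv2
import numpy as np

def flow_vectors(prev_gray, curr_gray, max_points=1000):
    """Steps 1-2 of the pipeline: pick optical-flow points, then
    compute a (dx, dy) motion vector for each with pyramidal LK."""
    # Step 1: select points (FAST corners here; random selection
    # would also work, as noted above).
    fast = cv2.FastFeatureDetector_create(threshold=20)
    kps = fast.detect(prev_gray, None)[:max_points]
    pts = np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2)

    # Step 2: pyramidal Lucas-Kanade flow for every selected point.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    ok = status.ravel() == 1
    p0 = pts[ok].reshape(-1, 2)
    return p0, nxt[ok].reshape(-1, 2) - p0   # positions and (dx, dy)
```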
The following walks through a concrete example.
1. First select K points randomly and uniformly within one frame, and filter out the points whose neighborhood texture is too smooth, because such points are unreliable for computing optical flow (a sketch of this step follows the list).
2. Compute the optical flow vectors between these points and the previous frame, as shown in the figure on the right; the rough direction of the background motion is already visible.
3. What comes next varies from author to author.
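As one possible implementation of step 1, this sketch draws K random points and keeps only those with enough local gradient; the window size, threshold, and function name are my assumptions, not the article's:

```python
import cv2
import numpy as np

def sample_textured_points(gray, k=1000, win=7, min_grad=10.0):
    """Step 1: draw K uniformly random points, then discard points
    whose neighborhood is too smooth to give a reliable flow estimate."""
    h, w = gray.shape
    half = win // 2
    rng = np.random.default_rng()
    xs = rng.integers(half, w - half, size=k)
    ys = rng.integers(half, h - half, size=k)

    # Texture measure: mean gradient magnitude in the window.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)

    keep = [(x, y) for x, y in zip(xs, ys)
            if mag[y - half:y + half + 1, x - half:x + half + 1].mean() > min_grad]
    return np.float32(keep).reshape(-1, 1, 2)
```

The surviving points can be fed directly into cv2.calcOpticalFlowPyrLK for step 2, as in the pipeline sketch above.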
A 2007 CVPR paper, "Detection and Segmentation of Moving Objects in Highly Dynamic Scenes", takes seven features per optical flow point, (x, y, dx, dy, Y, U, V), and clusters them with mean shift so that the clusters form the contours of the moving targets.
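I have not reproduced that paper's implementation, but the clustering step could look roughly like this with scikit-learn's MeanShift; the YUV lookup and per-feature normalization are my own assumptions:

```python
import numpy as np
from sklearn.cluster import MeanShift

def cluster_flow_points(points, flows, yuv_img):
    """Cluster flow points on the 7-D feature (x, y, dx, dy, Y, U, V);
    points that move alike and are colored alike end up in the same
    cluster, outlining the moving objects."""
    # Look up each point's color; yuv_img can come from
    # cv2.cvtColor(frame, cv2.COLOR_BGR2YUV).
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    color = yuv_img[ys, xs].astype(np.float32)
    feats = np.hstack([points, flows, color])        # N x 7

    # Normalize so position, motion and color are on comparable scales.
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-6)
    return MeanShift().fit_predict(feats)
```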
My own method is much simpler and uses only the two features (dx, dy). As in the upper-left figure, all the optical flow vectors are first projected onto a Cartesian plane whose axes are (dx, dy); mean shift then finds the densest (dx, dy) location, i.e. the position where the background vectors are most concentrated (the brighter a point in the figure, the denser the vectors there), marked by the red circle. Vectors falling outside the red circle can be treated as moving targets, as shown in the image on the right.
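A minimal sketch of this (dx, dy) idea, assuming scikit-learn's MeanShift and a hand-picked radius for the "red circle":

```python
import numpy as np
from sklearn.cluster import MeanShift

def split_foreground(points, flows, bandwidth=1.0, radius=1.5):
    """Find the densest mode of the (dx, dy) distribution; vectors near
    that mode are background motion, everything outside the 'red
    circle' is a moving-object candidate."""
    ms = MeanShift(bandwidth=bandwidth).fit(flows)
    # The largest cluster's center is the dominant (background) motion.
    counts = np.bincount(ms.labels_)
    bg_center = ms.cluster_centers_[counts.argmax()]

    # Distance from the background mode decides foreground membership.
    dist = np.linalg.norm(flows - bg_center, axis=1)
    return points[dist > radius], points[dist <= radius]
```

The bandwidth and radius trade off how tight the background cluster is: too small and background parallax leaks into the foreground, too large and slowly moving targets get absorbed into the background.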
Addendum:
Recently quite a few people have asked me about this, so here are a few sets of experimental images for reference:
The first group is the result of detection with the two features (dx, dy):
The second group adds a saliency feature on top of (dx, dy):
The third group: same as above.