Moving Target Detection in Real-time Monitoring Based on DCT coefficients

Last Update:2018-12-04 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

The main content of this article comes from2009 advanced video and signal based SurveillanceA paper of the meeting"Real-time moving object detection for video surveillance ",For more information, see the link provided later. Statement 2: ① this article was translated Based on the mentioned paper, but not exactly the same as the original article; ② the code implementation section has a logic problem in the detect function section, do not speak out if the requirements are not met. This paper will be introduced below.

A preliminary review of this document is due to a blog post on the Internet, which has made great comments on this article and has gained some interest in this article, this may also be related to your own background and has been engaged in research and work in this area for a long time. The idea in this paper is very simple. It is roughly described as follows: divide the image into 4 × 4 patches that do not overlap with each other, and then perform DCT (discrete cosine transformation) on each patch) transform (see the relevant literature for the differences between DCT and ICA and PCA), and then extract the low-frequency components of DCT coefficients as features for background modeling. For new input image frames, the same processing is done, which is compared with the extracted background model features to determine whether the features are similar. The Spatial Neighborhood mechanism is used to control the noise to achieve accurate foreground extraction. The following describes each part in detail based on the framework of the thesis.

1) Background Modeling (background modelling)

The background model is composed of multiple DCT coefficient vectors. The background patches in Different spaces may have different coefficient vectors. Every patche is transformed according to the DCT formula, such as formula (1 ):

After the transformation, the DCT coefficient matrix can be obtained, as shown in 1:

According to the characteristics of DCT transformation, the extracted locations are (1, 2), (1, 3), (2, 1), (2, 2), (3, 1) Five coefficients constitute a coefficient matrix, as the background model. After processing each patch in sequence, the background model is created.

2) Background adaptation)

Considering the dynamic changes in the scenario and the impact of noise, it is difficult to adapt to the noise and dynamic scenarios based on the background model established above. To meet the needs of dynamic scenarios, it is necessary to conduct in-depth research on the adaptability of the background model. For
Newly coming patch, coefficient vector, is compared with the background model to determine whether it is similar. Similar judgment is based on whether the angle between the two vectors is greater than a certain threshold, find the most matched model and update the weight coefficient of the model as follows:

Tinc and tdec are constants, alphai is the weight corresponding to the model, and the initial weight of each model is tinc.If there is no matching, this is followed by the focusIn this case, you need to determine whether the patch has the most potential prospect in the neighborhood of the previous frame (almost
Foreground) patch. If no patch is available, it is named almost foreground patch and incorporated into the background model.

3) Foreground Detection)

The background adaptability of the foreground patch is similar to that of the foreground patch. I just mentioned that the article incorporates the motion targets that have been stuck in the scenario into the background model, which can improve the algorithm performance. Of course, there are special requirements (such as left object detection) objects that are stranded in the scene may need to be retained.

Finally, add some personal opinions. We normally model the background in the spatial domain, while the author divides the image into patches and uses DCT transformation for each patch to model each patch in the frequency domain, the idea is also a major innovation. In addition, the author did not retain all the DCT coefficients, but extracted the transformed coefficients that indicate low-frequency information (this can reduce the details and retain the structure information, modeling of the background to reduce the computing workload. Of course, this article also has shortcomings. Patch-based detection is not suitable for scenarios with high detection accuracy requirements, and is not applicable to some originally unrelated goals, after patch-based detection, it may be stuck together, especially for the next multi-object tracking or target recognition.

I have also conducted experiments on the original text algorithm, but the ability is limited. There are some problems in the algorithm implementation process. If you are interested, you can analyze it (the problem is mainly caused by detect () of the dctdetect class () function), of course, you can also download it directly through the link after the text.

The header file dctdetect. HPP is as follows:

# Pragma once # include <opencv2/CORE/core. HPP> # include <opencv2/highgui. HPP> # include <opencv2/imgproc. HPP> # include <cmath> # include <iostream> using namespace STD; using namespace CV; // bounding boxesstruct boundingbox: Public CV: rect {boundingbox () {} boundingbox (CV: rect R): CV: rect (r) {} public: int status; // status: 0 indicates the background, 1 indicates the foreground, 2 indicates that the foreground int count may be displayed; // the number of times that the foreground is marked as int prev_status; // The last status}; typede F struct _ ELEM {vector <float> m_data; float m_weight;} ELEM; // defines the new data structure typedef vector <ELEM> data; // define the data type class dctdetect {public: dctdetect (void); dctdetect (MAT & frame); void detect (MAT & frame); // detect mat & getforeground () {return m_foreground ;}~ Dctdetect (void); Private: void calcdct (MAT & frame, vector <float> & coff); float calcdist (vector <float> & coff1, vector <float> & coff2 ); void buildgrid (MAT & frame, rect & Box); float dotproduct (vector <float> & coff1, vector <float> & coff2); bool checkneighbor (int r, int C ); void chagegridstatus (); Private: int m_height; int m_width; rect m_rect; int m_frmnum; int m_gridrows; // number of model rows int m_gridcols; // number of model columns mat m_foreground; float m_threshold; // threshold float m_inc; float m_dec; vector <boundingbox> m_grid; vector <DATA> m_model ;};

The dctdetect. cpp file is implemented as follows:

# Include "dctdetect. H "dctdetect: dctdetect (void) {} dctdetect: dctdetect (MAT & frame) {m_frmnum = 0; m_gridcols = 0; m_gridrows = 0; m_inc = 1.0; m_dec = 0.1; // m_threshold = 0.50; m_threshold = sqrtf (3.0)/2.0; // cos (45 °) = sqrtf (2.0)/2m_height = frame. rows; m_width = frame. cols; m_rect.x = 0; m_rect.y = 0; m_rect.width = 4; m_rect.height = 4; m_foreground.create (m_height, m_width, cv_8uc1); buildgrid (frame, M_r ECT); vector <float> coff; ELEM _ ELEM; vector <ELEM> _ data; vector <DATA> v_data; For (INT I = 0; I <m_gridrows; ++ I) {v_data.clear (); For (Int J = 0; j <m_gridcols; ++ J) {_ data. clear (); calcdct (frame (m_grid [I] [J]), coff); _ ELEM. m_data = coff; _ ELEM. m_weight = m_inc; _ data. push_back (_ ELEM); v_data.push_back (_ data);} m_model.push_back (v_data) ;}} void dctdetect: buildgrid (MAT & frame, rect & Box) {int width = Box. WID Th; int Height = Box. height; boundingbox bBox; vector <boundingbox> Ingrid; For (INT y = 1; y <frame. rows-height; y + = height) {Ingrid. clear (); m_gridcols = 0; For (INT x = 1; x <frame. cols-width; x + = width) {bBox. X = x; bBox. y = y; bBox. width = width; bBox. height = height; bBox. status =-1; bBox. prev_status = 0; bBox. count = 0; Ingrid. push_back (bBox); m_gridcols ++;} m_grid.push_back (Ingrid); m_gridrows ++ ;}/// calculate the DCT coefficient void dctd Etect: calcdct (MAT & frame, vector <float> & coff) {If (frame. empty () return; MAT temp; if (1 = frame. channels () frame. copyto (temp); elsecvtcolor (frame, temp, cv_bgr2gray); MAT tempmat (frame. rows, frame. cols, cv_64fc1); MAT tempdct (frame. rows, frame. cols, cv_64fc1); temp. convertize (tempmat, tempmat. type (); DCT (tempmat, tempdct, cv_dxt_forward); // DCT transform coff. clear (); coff. push_back (float) tempdct. at <Double> (); // values: (), (), and () coff. push_back (float) tempdct. at <double> (0, 2); coff. push_back (float) tempdct. at <double> (1, 0); coff. push_back (float) tempdct. at <double> (1, 1); coff. push_back (float) tempdct. at <double> (2, 0); If (! Temp. Empty () temp. Release (); If (! Tempmat. Empty () tempmat. Release (); If (! Tempdct. empty () tempdct. release () ;}// calculate the float dctdetect: calcdist (vector <float> & coff1, vector <float> & coff2) {float d1 = norm (coff1 ); float D2 = norm (coff2); float D3 = dotproduct (coff1, coff2); If (D2 <0.0001) return 1.0; elsereturn D3/(D1 * D2 );} // Point product float dctdetect: dotproduct (vector <float> & coff1, vector <float> & coff2) {size_t I = 0, n = coff1.size (); Assert (coff1.size () = coff2.size (); float S = 0.0f; const float * ptr1 = & coff1 [0], * ptr2 = & coff2 [0]; for (; I <n; I ++) S + = (float) ptr1 [I] * ptr2 [I]; return s;} // check whether the adjacent area has prospects. If yes, return truebool dctdetect: checkneighbor (INT R, int c) {int COUNT = 0; If (r-1)> = 0 & m_grid [r-1] [C]. prev_status = 1) // The above patchcount ++; If (C + 1) <m_gridcols & m_grid [r] [C + 1]. prev_status = 1) // patchcount ++ on the right; if (R + 1) <m_gridrows & m_grid [R + 1] [C]. prev_status = 1) // The following patchcount ++; If (c-1)> = 0 & m_grid [r] [C-1]. prev_status = 1) // patchcount ++ on the left; If (count> 1) return true; elsereturn false;} void dctdetect: Detect (MAT & frame) {m_foreground = 0; float Dist = 0.0f; vector <float> coff; ELEM _ ELEM; // single data vector <ELEM> _ data; // model data for (INT I = 0; I <m_gridrows; ++ I) {for (Int J = 0; j <m_gridcols; ++ J) {calcdct (frame (m_grid [I] [J]), coff); _ DATA = m_model [I] [J]; Int mnum = _ data. size (); // number of models float Fmax = flt_min; int idx =-1; for (int K = 0; k <mnum; ++ K) {Dist = calcdist (coff, _ DATA [K]. m_data); If (Dist> fmax) {Fmax = DIST; idx = J ;}// if (fmax> m_threshold) // match {for (int K = 0; k <mnum; ++ K) {If (idx = J) // Add m_model [I] [J] [k] to the model weight for matching. m_weight + = m_inc; elsem_model [I] [J] [K]. m_weight-= m_dec;} else // if no match is found, check whether there is a prospect in the last neighbor {bool isneighbor = Checkneighbor (I, j); If (isneighbor) // if there is a foreground in the neighborhood, mark it as the foreground region {m_foreground (m_grid [I] [J]) = 255; m_grid [I] [J]. count + = 1;} else {m_grid [I] [J]. status = 1; _ DATA = m_model [I] [J]; // Add the background model _ ELEM. m_data = coff; _ ELEM. m_weight = m_inc; _ data. push_back (_ ELEM); m_model [I] [J] = _ DATA ;}// removes the model vector <ELEM> _ temp; _ DATA = m_model [I] [J]; mnum = _ data. size (); For (int K = 0; k <mnum; ++ K) {If (_ DATA [K]. m_weight <0) cont Inue; else {If (_ DATA [K]. m_weight> 20.0) _ DATA [K]. m_weight = 20.0; _ temp. push_back (_ DATA [k]);} _ data. clear (); _ data. insert (_ data. begin (), _ temp. begin (), _ temp. end (); m_model [I] [J] = _ data;} // end for j} // end for ichagegridstatus ();} void dctdetect: chagegridstatus () {for (INT I = 0; I <m_gridrows; ++ I) {for (Int J = 0; j <m_gridcols; ++ J) {m_grid [I] [J]. prev_status = m_grid [I] [J]. status; m_grid [I] [J]. s Tatus = 0 ;}} dctdetect ::~ Dctdetect (void ){}

Thesis: Real-time moving object detection for video surveillance

Program code: Based on DCT coefficient Background Modeling and Moving Target Detection Algorithm V1.0

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Moving Target Detection in Real-time Monitoring Based on DCT coefficients

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support