Smoke detection notes: analysis and implementation of "Video-Based Smoke Detection with Histogram Sequence of LBP and LBPV Pyramids"


HEP (histograms of equivalent patterns [1]) works well for texture classification, and LBP (local binary patterns [2]) is the most commonly used feature under the HEP framework, with good invariance to brightness changes, rotation, and so on. In block-based video smoke detection it is often used as the texture feature. However, a block sees only a local patch of the image. This paper proposes using an image pyramid so that the extracted smoke-block features also carry some global information: each candidate smoke block is turned into a 3-level pyramid, a different LBP pattern is extracted at each level, the histograms are concatenated in sequence into a feature vector, and a neural network does the final classification.

1. LBP and LBPV

Three types of LBP are used in this paper: the uniform pattern (u2), the rotation-invariant pattern (ri), and the uniform rotation-invariant pattern (riu2). Each pyramid level uses one of these patterns. For background on LBP see blog [3].
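
As a quick illustration of the three families (my own sketch for P = 8 neighbors, not code from the paper), a raw 8-bit LBP code can be classified like this:

#include <algorithm>
#include <bitset>
#include <cstdint>

// "Uniform" (u2) means at most two 0/1 transitions in the circular bit
// string: 58 uniform codes + 1 bin for the rest = 59 bins. Rotation-
// invariant (ri) has 36 bins; uniform rotation-invariant (riu2) has
// P + 2 = 10 bins.
int transitions(uint8_t code)            // circular 0/1 transitions
{
    int t = 0;
    for (int p = 0; p < 8; ++p)
        t += ((code >> p) & 1) != ((code >> ((p + 1) % 8)) & 1);
    return t;
}

uint8_t ri_label(uint8_t code)           // smallest value over all rotations
{
    uint8_t best = code;
    for (int r = 1; r < 8; ++r)
        best = std::min<uint8_t>(best, (uint8_t)((code >> r) | (code << (8 - r))));
    return best;
}

int riu2_bin(uint8_t code)               // 0..8 = number of 1-bits if uniform, 9 otherwise
{
    return transitions(code) <= 2 ? (int)std::bitset<8>(code).count() : 9;
}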

LBPV differs from LBP only in how the histogram is accumulated. In the LBP histogram, each pixel whose pattern LBP(i, j) equals k adds 1 to bin k; in LBPV it adds the local variance VAR(i, j) instead:

$$\mathrm{LBPV}(k)=\sum_i\sum_j w(i,j,k),\qquad w(i,j,k)=\begin{cases}\mathrm{VAR}(i,j) & \text{if } \mathrm{LBP}(i,j)=k\\ 0 & \text{otherwise}\end{cases}$$

where the variance is taken over the P neighborhood pixels:

$$\mathrm{VAR}_{P,R}=\frac{1}{P}\sum_{p=0}^{P-1}(g_p-u)^2,\qquad u=\frac{1}{P}\sum_{p=0}^{P-1}g_p$$

Here $g_p$ is the p-th neighborhood pixel value, and the resulting histogram is normalized.
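
A minimal sketch of that difference (illustrative helpers, not the author's code): given a label image of LBP pattern indices and a per-pixel neighborhood-variance image, the two histograms differ only in what is added to the bin:

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<float> accumulate_hist(const cv::Mat &lbl,   // CV_32S pattern indices
                                   const cv::Mat &var,   // CV_32F neighborhood variances
                                   int num_bins, bool use_lbpv)
{
    std::vector<float> hist(num_bins, 0.f);
    for (int i = 0; i < lbl.rows; ++i)
        for (int j = 0; j < lbl.cols; ++j) {
            int k = lbl.at<int>(i, j);
            hist[k] += use_lbpv ? var.at<float>(i, j) : 1.f;   // the only difference
        }
    // normalize the histogram
    float sum = 0.f;
    for (float v : hist) sum += v;
    if (sum > 0.f)
        for (float &v : hist) v /= sum;
    return hist;
}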

2. Pyramids

The pyramid has three levels I0, I1, I2, where I0 is the input image. I0 is passed through a Gaussian low-pass filter (LPF) and then downsampled by a factor of 2 to get I1, and I2 is obtained from I1 in the same way.
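
In OpenCV that is just a blur plus a half-size resize per level; a minimal sketch (the same steps appear in the extraction code below; cv::pyrDown would combine them, though it uses a 5x5 kernel):

#include <opencv2/opencv.hpp>

// Build the 3-level pyramid: Gaussian LPF, then downsample by 2 per level.
void build_pyramid(const cv::Mat &I0, cv::Mat &I1, cv::Mat &I2)
{
    cv::Mat blurred;
    cv::GaussianBlur(I0, blurred, cv::Size(3, 3), 0);
    cv::resize(blurred, I1, cv::Size(I0.cols / 2, I0.rows / 2));   // level 1
    cv::GaussianBlur(I1, blurred, cv::Size(3, 3), 0);
    cv::resize(blurred, I2, cv::Size(I1.cols / 2, I1.rows / 2));   // level 2
}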

Finally, from bottom to top, a different pattern is extracted at each level: uniform LBP on I0, rotation-invariant LBP on I1, and uniform rotation-invariant LBP on I2, plus the matching LBPV histograms, concatenated in the order [LBP(I0), LBPV(I0), LBP(I1), LBPV(I1), LBP(I2), LBPV(I2)]. With P = 8 neighbors the histogram sizes are 59, 59, 36, 36, 10, and 10 bins, giving a 210-dimensional feature vector.

For a 24x24 image block the corresponding I2 region is only 6x6, which yields sparse histograms that are bad for classification. So at levels 1 and 2 the search window is enlarged with neighboring pixels: the 12x12 level-1 region grows by 6 pixels on each side and the 6x6 level-2 region by 9 pixels on each side, so each level again supplies a 24x24 window (clipped at the image border).

3. Implementation

I used the dataset from the paper for training and testing, with samples cut to 96x96 grayscale so that after two downsamplings the image is exactly 24x24. The per-image feature extraction code:

feature_t get_pyramid_feature(cv::Mat &img)
{
    assert((img.dims == 2) && (img.rows == 96) && (img.cols == 96));
    feature_t result, lbp0, lbp1, lbp2, lbpv0, lbpv1, lbpv2;
    result.resize(210);   // 59 + 59 + 36 + 36 + 10 + 10 bins

    cv::Mat I0, I1, I2, L0, S0, L1;
    // level 0: the central 24x24 block of the 96x96 sample
    I0 = img(cv::Rect(36, 36, 24, 24)).clone();
    // level 1: blur, downsample to 48x48, take the central 24x24 window
    cv::GaussianBlur(img, L0, cv::Size(3, 3), 0);
    cv::resize(L0, S0, cv::Size(48, 48));
    I1 = S0(cv::Rect(12, 12, 24, 24)).clone();
    // level 2: blur and downsample again; the whole 24x24 image is the window
    cv::GaussianBlur(S0, L1, cv::Size(3, 3), 0);
    cv::resize(L1, I2, cv::Size(24, 24));

    lbp0 = get_u_lbp_gray(I0);      // uniform LBP, 59 bins
    lbp1 = get_ri_lbp_gray(I1);     // rotation-invariant LBP, 36 bins
    lbp2 = get_riu_lbp_gray(I2);    // uniform rotation-invariant LBP, 10 bins
    lbpv0 = get_u_lbpv_gray(I0);
    lbpv1 = get_ri_lbpv_gray(I1);
    lbpv2 = get_riu_lbpv_gray(I2);

    std::copy(lbp0.begin(), lbp0.end(), result.begin());
    std::copy(lbpv0.begin(), lbpv0.end(), result.begin() + 59);
    std::copy(lbp1.begin(), lbp1.end(), result.begin() + 118);
    std::copy(lbpv1.begin(), lbpv1.end(), result.begin() + 154);
    std::copy(lbp2.begin(), lbp2.end(), result.begin() + 190);
    std::copy(lbpv2.begin(), lbpv2.end(), result.begin() + 200);
    return result;
}

After feature extraction, a neural network is trained (using tiny_cnn, a lightweight neural-network library):

void mlp_train(std::vector<feature_t> &train_x, std::vector<int> &train_y,
               std::vector<feature_t> &test_x, std::vector<int> &test_y,
               const char *weights_file, int iter_num = 30)   // default garbled in source; 30 assumed
{
    const int num_input = train_x[0].size();
    const int num_hidden_units = 60;   // original value garbled in source; 60 assumed
    int num_units[] = { num_input, num_hidden_units, 2 };
    auto nn = make_mlp<mse, gradient_descent_levenberg_marquardt, tan_h>(num_units, num_units + 3);

    // train MLP
    nn.optimizer().alpha = 0.005;
    boost::progress_display disp(train_x.size());
    boost::timer t;

    // create callbacks
    auto on_enumerate_epoch = [&]() {
        std::cout << t.elapsed() << "s elapsed." << std::endl;
        tiny_cnn::result res = nn.test(test_x, test_y);
        std::cout << nn.optimizer().alpha << "," << res.num_success << "/" << res.num_total << std::endl;
        nn.optimizer().alpha *= 0.85;   // decay learning rate
        nn.optimizer().alpha = std::max(0.00001, nn.optimizer().alpha);
        disp.restart(train_x.size());
        t.restart();
    };
    auto on_enumerate_data = [&]() {
        ++disp;
    };

    nn.train(train_x, train_y, 1, iter_num, on_enumerate_data, on_enumerate_epoch);
    nn.test(test_x, test_y).print_detail(std::cout);
    nn.save_weights(weights_file);
}

The final test accuracy comes out above 96%.

Next comes video; here only per-frame (single-image) processing is handled. The frame is first divided into blocks, and each block's rectangle at each pyramid level is stored in a vector. Because the pyramid levels are accounted for when chunking, index i in each of the three vectors addresses the same block at the three levels. For edge blocks, pixels outside the image are simply ignored. The chunking code:

std::vector<std::vector<cv::Rect>> make_image_blocks(const cv::Mat &in, const int iwd)
{
    std::vector<std::vector<cv::Rect>> result;
    std::vector<cv::Rect> level0, level1, level2;
    int rows = in.rows;
    int cols = in.cols;
    int rows_level1 = rows / 2;
    int cols_level1 = cols / 2;
    int rows_level2 = rows / 4;
    int cols_level2 = cols / 4;
    int left, top, right, bottom;
    for (int i = 0; i <= rows - iwd; i += iwd)
    {
        for (int j = 0; j <= cols - iwd; j += iwd)
        {
            level0.push_back(cv::Rect(j, i, iwd, iwd));
            // level 1: the 12x12 region expanded by 6 pixels on each side
            left = std::max(j / 2 - 6, 0);
            top = std::max(i / 2 - 6, 0);
            right = std::min(j / 2 + 18, cols_level1);
            bottom = std::min(i / 2 + 18, rows_level1);
            level1.push_back(cv::Rect(left, top, right - left, bottom - top));
            // level 2: the 6x6 region expanded by 9 pixels on each side
            left = std::max(j / 4 - 9, 0);
            top = std::max(i / 4 - 9, 0);
            right = std::min(j / 4 + 15, cols_level2);
            bottom = std::min(i / 4 + 15, rows_level2);
            level2.push_back(cv::Rect(left, top, right - left, bottom - top));
        }
    }
    result.push_back(level0);
    result.push_back(level1);
    result.push_back(level2);
    return result;
}

Processing an image is then straightforward: build the 3-level pyramid, look up each block's windows through the indices above to get its feature, and predict:

template <typename NN>
std::vector<int> single_image_smoke_detect(const cv::Mat &img,
                                           const std::vector<std::vector<cv::Rect>> &locate_list,
                                           const std::vector<int> &smoke_block, NN &nn)
{
    std::vector<int> result;
    cv::Mat I1, I2, L0, L1;
    cv::GaussianBlur(img, L0, cv::Size(3, 3), 0);
    cv::resize(L0, I1, cv::Size(img.cols / 2, img.rows / 2));
    cv::GaussianBlur(I1, L1, cv::Size(3, 3), 0);
    cv::resize(L1, I2, cv::Size(I1.cols / 2, I1.rows / 2));
    for (auto i : smoke_block)
    {
        cv::Mat block = img(locate_list[0][i]);
        auto lbp0 = get_u_lbp_gray(block);
        auto lbpv0 = get_u_lbpv_gray(block);
        block = I1(locate_list[1][i]);
        auto lbp1 = get_ri_lbp_gray(block);
        auto lbpv1 = get_ri_lbpv_gray(block);
        block = I2(locate_list[2][i]);
        auto lbp2 = get_riu_lbp_gray(block);
        auto lbpv2 = get_riu_lbpv_gray(block);

        feature_t feat;
        feat.resize(210, 0);
        std::copy(lbp0.begin(), lbp0.end(), feat.begin());
        std::copy(lbpv0.begin(), lbpv0.end(), feat.begin() + 59);
        std::copy(lbp1.begin(), lbp1.end(), feat.begin() + 118);
        std::copy(lbpv1.begin(), lbpv1.end(), feat.begin() + 154);
        std::copy(lbp2.begin(), lbp2.end(), feat.begin() + 190);
        std::copy(lbpv2.begin(), lbpv2.end(), feat.begin() + 200);

        vec_t y;
        nn.predict(feat, &y);
        const int predicted = max_index(y);
        if (predicted == 1)
            result.push_back(i);
    }
    return result;
}

Motion detection runs before this step, so only moving blocks are passed in:

smoke_blocks = single_image_smoke_detect(gray_img, locate_list, smoke_blocks, nn);
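
The post does not show the motion detector itself. As one hedged possibility, a background subtractor could supply the candidate indices (MOG2 and the 1/8 foreground threshold below are my assumptions, not the post's):

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical candidate selection via background subtraction (OpenCV 3 API).
// A level-0 block counts as "moving" if enough foreground pixels fall in it.
cv::Ptr<cv::BackgroundSubtractorMOG2> bg = cv::createBackgroundSubtractorMOG2();

std::vector<int> motion_candidates(const cv::Mat &gray_img,
                                   const std::vector<std::vector<cv::Rect>> &locate_list)
{
    cv::Mat fg;
    bg->apply(gray_img, fg);                                 // foreground mask, CV_8U
    std::vector<int> moving;
    const std::vector<cv::Rect> &level0 = locate_list[0];
    for (int i = 0; i < (int)level0.size(); ++i)
        if (cv::countNonZero(fg(level0[i])) > level0[i].area() / 8)   // assumed threshold
            moving.push_back(i);
    return moving;
}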

The detected smoke blocks are then merged into larger rectangles and displayed:

for (auto rect : merged_blocks)
{
    cv::rectangle(img, rect, cv::Scalar(0, 0, 255));   // red boxes (BGR)
}

cv::imshow("smoke", img);
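
The statement that produces merged_blocks is garbled in the source; a plausible sketch that unions touching block rectangles into larger boxes:

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical merge step: repeatedly union any two rectangles that touch
// or overlap (each rect is expanded by 1 px to catch adjacent blocks).
std::vector<cv::Rect> merge_blocks(std::vector<cv::Rect> rects)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t a = 0; a < rects.size() && !changed; ++a)
            for (size_t b = a + 1; b < rects.size() && !changed; ++b) {
                cv::Rect ea(rects[a].x - 1, rects[a].y - 1,
                            rects[a].width + 2, rects[a].height + 2);
                if ((ea & rects[b]).area() > 0) {
                    rects[a] |= rects[b];                 // union of the two boxes
                    rects.erase(rects.begin() + b);
                    changed = true;
                }
            }
    }
    return rects;
}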

4. Analysis of experimental results

I also tried using only LBP or only LBPV. LBPV alone classifies worst, averaging a bit over 94% on the tests; LBP alone and LBP+LBPV perform about the same, both averaging above 96%. So dropping LBPV loses nothing here, and LBP alone is the better choice.

5. Summary

Extracting features from every pyramid level is a little better than extracting them from the single block alone. Still, for smoke detection, single-frame detection by itself is not reliable enough; this is best used as one of the per-frame cues in a larger system.

6. References

"1" Texture description through histograms of equivalent patterns

"2" multiresolution gray-scale and rotation invariant texture classification with local binary patterns

"3" http://blog.csdn.net/zouxy09/article/details/7929531
