OpenCV provides two programs for training cascade classifiers: opencv_haartraining and opencv_traincascade. opencv_traincascade is the newer program, written in C++ using the OpenCV 2.x API. The main difference between the two is that opencv_traincascade supports both Haar and LBP (Local Binary Patterns) features, and other feature types are easy to add. Unlike Haar features, LBP features are integer-valued, so training and detection with LBP can be several times faster than with Haar features. The detection accuracy of both LBP and Haar features depends on the quality of the training data and on the training parameters. It is possible to train an LBP-based classifier that is as accurate as a Haar-based one.
As with other classifier models, both training data and test data are needed, and the training data consists of positive samples (pos) and negative samples (neg). The training programs opencv_haartraining.exe and opencv_traincascade.exe require their input data in a specific format, so an auxiliary program is needed:
opencv_createsamples prepares the positive sample data and test data used for training. It generates positive sample data in a form that both opencv_haartraining and opencv_traincascade accept. Its output is a file with the *.vec extension, which stores the images in binary form.
Training and testing an OpenCV cascade classifier can therefore be divided into the following four steps:
Prepare the training data
Train the cascade classifier
Test the classifier's performance
Detect targets using the trained classifier
1. Prepare training data
Note: Classifier training is introduced here using pedestrian data as an example.
1.1 Prepare positive samples
Positive samples are generated by opencv_createsamples. They can be generated from a single picture that contains the object to be detected, or from a series of annotated images.
First, put all the positive sample images in one folder, as shown in Figure 1. The pos.dat file is a list of all the images, in the format shown in Figure 2: the first column is the image file name, the second column is the number of samples in the image, and the remaining columns give the position and size of each positive sample to be cropped from the image. To generate pos.dat, open a DOS window, change to the pos folder, and run dir /b > pos.dat. This produces only a list of file names; the number of positive samples and their positions and sizes must be added manually.
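Adding the sample count and coordinates by hand is tedious, so the step can be scripted. The following is a minimal sketch, not part of the original workflow. It assumes that every image is already cropped to a single 64x128 sample, so each line gets a count of 1 and a box covering the whole image, and that the fields are written in x y width height order. The file names and function names are hypothetical.

```python
def format_info_line(image_name, boxes):
    """Return one description-file line: name, object count, then x y w h per object."""
    fields = [image_name, str(len(boxes))]
    for x, y, w, h in boxes:
        fields.extend(str(v) for v in (x, y, w, h))
    return " ".join(fields)


def build_info_lines(image_names, sample_width=64, sample_height=128):
    """Assumption: each image is one pre-cropped sample that fills the whole picture."""
    whole_image = (0, 0, sample_width, sample_height)
    return [format_info_line(name, [whole_image]) for name in image_names]


if __name__ == "__main__":
    # Hypothetical file names; in practice they would come from the pos folder listing.
    for line in build_info_lines(["pos_0001.png", "pos_0002.png"]):
        print(line)
```

If an image contains several objects, format_info_line can be called directly with one (x, y, w, h) tuple per object.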
Figure 1. Positive Sample Data
Figure 2
1.2 Generate the positive sample file in .vec format with opencv_createsamples.exe
You can run opencv_createsamples.exe from a DOS command window, or put the command in a .bat batch file. The content of the .bat file is as follows.
"D:\Program files\opencv\build\x64\vc10\bin\opencv_createsamples.exe" -info "Pos\pos.dat" -vec pos.vec -num 500 -w 64 -h 128
Command line arguments for the opencv_createsamples.exe program:
-info <collection_file_name> Description file that lists the images and the position and size of the objects in them.
-vec <vec_file_name> Output file containing the positive samples for training.
-img <image_file_name> Input image file name (for example, a company logo).
-bg <background_file_name> Background description file containing a list of image file names; these images are randomly selected as backgrounds for the object.
-num <number_of_samples> Number of positive samples to generate.
-bgcolor <background_color> Background color (grayscale images are currently assumed); the background color denotes the transparent color. Because image compression can cause color deviation, a color tolerance can be specified with -bgthresh. All pixels between bgcolor-bgthresh and bgcolor+bgthresh are treated as transparent.
-bgthresh <background_color_threshold> Tolerance around the background color, as described under -bgcolor.
-inv If specified, the colors of the foreground image are inverted.
-randinv If specified, the colors are inverted randomly.
-maxidev <max_intensity_deviation> Maximum intensity deviation of pixels in the foreground samples.
-maxxangle <max_x_rotation_angle> Maximum rotation angle about the x axis, in radians.
-maxyangle <max_y_rotation_angle> Maximum rotation angle about the y axis, in radians.
-maxzangle <max_z_rotation_angle> Maximum rotation angle about the z axis, in radians.
-show A useful debugging option. If specified, each sample is displayed. If you press the Esc key, the program continues to create samples but no longer displays them.
-w <sample_width> Width of the output samples, in pixels.
-h <sample_height> Height of the output samples, in pixels.
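The interaction between -bgcolor and -bgthresh can be illustrated with a few lines of Python. This is only a sketch of the rule as described above, not the code used by opencv_createsamples. It assumes grayscale pixel values and that both ends of the range are inclusive; the function names are hypothetical.

```python
def is_transparent(pixel, bgcolor, bgthresh):
    """True if a grayscale value lies in [bgcolor - bgthresh, bgcolor + bgthresh].

    Assumption: the bounds are inclusive; the text only says "between".
    """
    return bgcolor - bgthresh <= pixel <= bgcolor + bgthresh


def transparency_mask(pixels, bgcolor, bgthresh):
    """Apply the rule to a sequence of grayscale pixel values."""
    return [is_transparent(p, bgcolor, bgthresh) for p in pixels]


if __name__ == "__main__":
    # A white background (255) with a tolerance of 8 for compression noise.
    print(transparency_mask([255, 250, 247, 246, 10], bgcolor=255, bgthresh=8))
```

With a tolerance of 0, only pixels that exactly equal the background color are treated as transparent.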
1.3 Prepare negative samples
A negative sample can be any image that does not contain the object to be detected. The names of the image files from which negative samples are extracted are listed in a neg.dat file. This file is generated in the same way as the positive sample list, but it contains only file names. It is a plain text file in which each line is one file name (including the relative directory). The images may have different sizes, but each must be larger than the training window, because negative samples are cropped from these images and scaled down to the training window size. See Figure 3.
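The negative list can also be produced with a short script instead of dir /b. The sketch below is one possible way to do it and makes a few assumptions: the set of image extensions is a guess, the folder and file names are hypothetical, and bare file names are written by default, which is what dir /b run inside the folder would give.

```python
import os

# Assumption: these are the image types present in the negative folder.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def list_negative_images(folder, relative_dir=""):
    """Return sorted image file names in folder, optionally prefixed with a relative directory."""
    names = sorted(n for n in os.listdir(folder)
                   if n.lower().endswith(IMAGE_EXTENSIONS))
    if relative_dir:
        return [relative_dir + "/" + n for n in names]
    return names


def write_list_file(lines, path):
    """Write one file name per line as plain text."""
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


if __name__ == "__main__":
    folder = "neg"  # hypothetical folder of negative images
    if os.path.isdir(folder):
        write_list_file(list_negative_images(folder), os.path.join(folder, "neg.dat"))
```

Filtering by extension also keeps neg.dat itself out of the list when the script is run a second time.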
Figure 3
2. Train the cascade classifier
OpenCV provides two programs for training cascade classifiers: opencv_haartraining and opencv_traincascade. opencv_haartraining is obsolete and will be removed; opencv_traincascade is the newer program.
The content of the .bat batch file for the opencv_traincascade program is as follows:
"D:\Program files\opencv\build\x64\vc10\bin\opencv_traincascade.exe" -data "E:\2013Mycode\Traincascadeclassification" -vec pos.vec -bg neg\neg.dat -numPos 29 -numNeg 29 -numStages 20 -mem 200 -featureType LBP -w 64 -h 128
pause
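Because the sample size passed to opencv_traincascade must match the one used with opencv_createsamples, it can help to build both commands from the same values. The following sketch assembles the two argument lists in Python without running anything. The helper names are hypothetical, only a subset of the options is included, the option names are written in the mixed case used by the OpenCV documentation, and the paths are the ones from the batch files in this article.

```python
def createsamples_args(exe, info_file, vec_file, num, width, height):
    """Argument list for opencv_createsamples (only options described in the text)."""
    return [exe, "-info", info_file, "-vec", vec_file,
            "-num", str(num), "-w", str(width), "-h", str(height)]


def traincascade_args(exe, data_dir, vec_file, bg_file, num_pos, num_neg,
                      num_stages, feature_type, width, height):
    """Argument list for opencv_traincascade (only options described in the text)."""
    return [exe, "-data", data_dir, "-vec", vec_file, "-bg", bg_file,
            "-numPos", str(num_pos), "-numNeg", str(num_neg),
            "-numStages", str(num_stages), "-featureType", feature_type,
            "-w", str(width), "-h", str(height)]


if __name__ == "__main__":
    BIN = r"D:\Program files\opencv\build\x64\vc10\bin"
    WIDTH, HEIGHT = 64, 128  # defined once so both programs get the same sample size

    create = createsamples_args(BIN + r"\opencv_createsamples.exe",
                                r"Pos\pos.dat", "pos.vec", 500, WIDTH, HEIGHT)
    train = traincascade_args(BIN + r"\opencv_traincascade.exe",
                              r"E:\2013Mycode\Traincascadeclassification",
                              "pos.vec", r"neg\neg.dat", 29, 29, 20, "LBP",
                              WIDTH, HEIGHT)
    print(create)
    print(train)
    # To execute a command, pass the list to subprocess.run(args, check=True).
```

Passing the arguments as a list avoids quoting problems with the space in "Program files".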
The command line arguments for opencv_traincascade are as follows:
1. General Parameters:
-data <cascade_dir_name> Directory in which the trained classifier is stored. If it does not exist, the training program creates it.
-vec <vec_file_name> File containing the positive samples (generated by the opencv_createsamples program).
-bg <background_file_name> Background description file, that is, the file listing the negative sample image file names.
-numPos <number_of_positive_samples> Number of positive samples used to train each classifier stage.
-numNeg <number_of_negative_samples> Number of negative samples used to train each classifier stage; it may be greater than the number of images listed by -bg.
-numStages <number_of_stages> Number of cascade stages to train.
-precalcValBufSize <precalculated_vals_buffer_size_in_Mb> Size of the buffer for precalculated feature values, in MB.
-precalcIdxBufSize <precalculated_idxs_buffer_size_in_Mb> Size of the buffer for precalculated feature indices, in MB. The more memory is available, the shorter the training time.
-baseFormatSave Valid only when Haar features are used. If specified, the cascade classifier is saved in the old format.
2. Cascade Parameters:
-stageType <BOOST(default)> Stage type. Only boosted classifiers are currently supported as a stage type.
-featureType <{HAAR(default), LBP}> Feature type: HAAR for Haar-like features, LBP for Local Binary Patterns.
-w <sampleWidth>
-h <sampleHeight> Size of the training samples (in pixels). Must match the size used when the training samples were created (with the opencv_createsamples program).
3. Classifier Parameters:
-bt <{DAB, RAB, LB, GAB(default)}> Type of boosted classifier: DAB for Discrete AdaBoost, RAB for Real AdaBoost, LB for LogitBoost, GAB for Gentle AdaBoost.
-minHitRate <min_hit_rate> Minimum desired hit rate for each stage of the classifier. The overall hit rate is approximately min_hit_rate^number_of_stages.
-maxFalseAlarmRate <max_false_alarm_rate> Maximum desired false alarm rate for each stage of the classifier. The overall false alarm rate is approximately max_false_alarm_rate^number_of_stages.
-weightTrimRate <weight_trim_rate> Specifies whether trimming should be used and its weight. A good value is 0.95.
-maxDepth <max_depth_of_weak_tree> Maximum depth of a weak tree. A good value is 1, which gives stumps.
-maxWeakCount <max_weak_tree_count> Maximum number of weak trees in each stage. The boosted classifier (stage) will have as many weak trees (<= maxWeakCount) as needed to achieve the given -maxFalseAlarmRate.
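The effect of the per-stage rates on the whole cascade follows directly from the two approximations above. The short sketch below evaluates them; the per-stage values 0.995 and 0.5 and the 20 stages are illustrative inputs only, and the function name is hypothetical.

```python
def overall_rate(per_stage_rate, num_stages):
    """Approximate rate of the whole cascade: per_stage_rate ** num_stages."""
    return per_stage_rate ** num_stages


if __name__ == "__main__":
    stages = 20
    print("overall hit rate:        ", overall_rate(0.995, stages))
    print("overall false alarm rate:", overall_rate(0.5, stages))
```

Since both rates are compounded over all stages, even a per-stage hit rate very close to 1 lowers the overall hit rate noticeably, while a per-stage false alarm rate of 0.5 already drives the overall false alarm rate close to zero.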
4. Haar-like feature parameters:
-mode <BASIC (default) | CORE | ALL> Selects the set of Haar features used during training. BASIC uses only upright features, while ALL uses the full set of upright and 45-degree rotated features.
5. LBP feature parameters:
The LBP feature has no parameters.
3. Test classifier performance
opencv_performance can be used to evaluate the quality of a classifier, but only a classifier produced by opencv_haartraining. It reads a set of labeled images, runs the classifier, and reports performance figures such as the number of objects detected, the number of objects missed, and the number of false detections. A test data set must also be prepared: generate an image list file in the same format as the positive sample list used for training, marking the number and location of the targets in each image.
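To make the reported figures concrete, the following sketch counts hits, misses, and false detections for one image by comparing labeled boxes with detected boxes. It is not the matching rule used by opencv_performance: the overlap criterion (intersection over union of at least 0.5) and all names are assumptions chosen for illustration. Boxes are (x, y, width, height) tuples.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def count_results(labeled, detected, min_iou=0.5):
    """Greedily match each labeled box to its best remaining detection."""
    unmatched = list(detected)
    hits = 0
    for truth in labeled:
        best = max(unmatched, key=lambda d: iou(truth, d), default=None)
        if best is not None and iou(truth, best) >= min_iou:
            hits += 1
            unmatched.remove(best)
    return {"hits": hits, "missed": len(labeled) - hits, "false": len(unmatched)}


if __name__ == "__main__":
    labeled = [(10, 10, 64, 128), (200, 50, 64, 128)]
    detected = [(12, 12, 64, 128), (400, 400, 64, 128)]
    print(count_results(labeled, detected))
```

Summing these counts over all test images gives totals comparable in spirit to the detected, missed, and false figures mentioned above.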
3.1 Train a classifier model with opencv_haartraining
To illustrate the usage of opencv_performance, a classifier model is first trained with the opencv_haartraining program; the procedure is similar to that of opencv_traincascade.
The content of the .bat batch file for the opencv_haartraining program is as follows:
"D:\Program files\opencv\build\x64\vc10\bin\opencv_haartraining.exe" -data "E:\2013Mycode\Traincascadeclassification\cascade" -vec pos.vec -bg neg\neg.dat -npos 29 -nneg 29 -mem 200 -mode BASIC -w 64 -h 128
pause
The command line arguments for opencv_haartraining are as follows:
-data <dir_name> Path where the trained classifier is stored.
-vec <vec_file_name> File containing the positive samples (created by the opencv_createsamples program or by another method).
-bg <background_file_name> Background description file.
-npos <number_of_positive_samples>, -nneg <number_of_negative_samples> Number of positive/negative samples used to train each classifier stage. Reasonable values are npos = 7000 and nneg = 3000.
-nstages <number_of_stages> Number of stages to train.
-nsplits <number_of_splits> Determines the weak classifier used in the stage classifiers. If it is 1, a simple stump classifier is used. If it is 2 or more, a CART classifier with number_of_splits internal nodes is used.
-mem <memory_in_MB> Memory available for precalculation, in megabytes (MB). The more memory is available, the faster the training.
-sym (default), -nonsym Specifies whether the object class being trained is vertically symmetric. Vertical symmetry speeds up training. For example, frontal faces are vertically symmetric.
-minhitrate <min_hit_rate> Minimum hit rate required of each stage classifier. The overall hit rate is approximately min_hit_rate^number_of_stages.