How to Use OpenCV to Train Your Classifier


The original article is available at the address below. Thanks to the original author for sharing it:

http://hi.baidu.com/andyzcj/blog/item/3b9575fc63c3201f09244d9a.html

 

I. Introduction
The target detection method was initially proposed by Paul Viola [Viola01] and later improved by Rainer Lienhart [Lienhart02]. The basic procedure is as follows: first, the Haar-like features of a set of samples (on the order of several hundred sample images) are used to train a boosted cascade classifier.
In this classifier, "cascade" means that the final classifier is composed of several simpler classifiers (stages) connected in series. During image detection, a candidate window is passed through the stages in turn; most candidate regions are rejected by the first few stages, and only regions that pass every stage are reported as target regions.
Once the classifier is trained, it can be applied to regions of interest in an input image. The classifier outputs 1 if the region is likely to contain the target and 0 otherwise. To search an entire image, the search window is moved across the image and each position is tested.
To find objects of different sizes, the classifier itself is designed to be resizable, which is more efficient than resizing the image to be examined. Therefore, to detect a target of unknown size, the scanning procedure usually runs over the image several times with search windows of different scales.
Currently, four types of boosting are supported: Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost, and LogitBoost.
"Boosted" means that each stage of the cascade is built by a boosting algorithm (weighted voting) from a set of basic (weak) classifiers trained on the samples.
Based on the above, target detection involves three steps:
1. Create the samples.
2. Train the classifier.
3. Use the trained classifier for target detection.

II. Sample Creation
Training samples are divided into positive samples and negative samples. Positive samples contain the target to be detected; negative samples are any other images that do not contain the target.
Negative samples

Negative samples can come from any images, as long as those images do not contain the target. Negative samples are listed in a background description file, which is a plain text file in which each line contains the file name of one negative sample image (as a path relative to the description file). The file can be created as follows:

Use a DOS command to generate the description file: open a command prompt and change to your image directory. For example, if the negative images are placed under D:/face/negdata, then:

Press Win + R to open the Windows Run dialog, enter cmd to open a DOS command window, type D: and press Enter, then type cd D:/face/negdata to change to the image directory, and run dir /b > negdata.dat. This creates a negdata.dat file in the image directory. Open the file and delete the last line, which is the entry for negdata.dat itself. This produces the negative sample description file.
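Put together, the commands look roughly like this (the directory D:/face/negdata follows the example above, and the image file names in the resulting file are only illustrative):

D:
cd D:/face/negdata
dir /b > negdata.dat

A negdata.dat produced this way simply lists one image per line, for example:

neg001.bmp
neg002.bmp
neg003.bmp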


Positive samples

For positive samples, the usual practice is to crop each sample to the target and normalize its size (that is, scale it to the specified width and height).


Because haartraining takes its positive samples from a .vec file, you need to use the createsamples program that ships with OpenCV to convert the prepared positive samples into a .vec file. (If createsamples.exe is not already in opencv/bin, compile it from the .dsw project under opencv/apps/haartraining/make, and be sure to build the Release configuration.) The conversion procedure is as follows:

1) Create a positive sample description file that gives, for each positive sample, the image file name (absolute or relative path), the number of targets it contains, and the position and size of each target in the image. A typical positive sample description file looks like this:

posdata/1(10).bmp 1 1 1 23 23
posdata/1(11).bmp 1 1 1 23 23
posdata/1(12).bmp 1 1 1 23 23

You can place the description file directly in your posdata directory (that is, the positive sample directory), in which case the leading relative path is not needed. The file can be generated in the same way as the negative sample description file and then edited with a text replacement tool to replace "bmp" with "bmp 1 1 1 23 23". (If you have many sample images, the replacement may hang a simple text editor; copying the content into Word and doing the replacement there works too.) The five numbers after each file name are the number of targets in the image, the x and y coordinates of the target's top-left corner, and its width and height. The result is a positive sample description file, posdata.dat.

2) Run the createsamples program. If you run it from the VC environment, set the arguments in the Program arguments field on the Project > Settings > Debug page. An example of the arguments:
-info D:/face/posdata.dat -vec D:/face/pos.vec -num 50 -w 20 -h 20
This means there are 50 samples, each 20 pixels wide and 20 pixels high; the positive sample description file is posdata.dat, and the result is written to pos.vec.

Alternatively, enter the following at the DOS prompt:

"D:/program files/opencv/bin/createsamples.exe" -info "posdata/posdata.dat" -vec data/pos.vec -num 50 -w 20 -h 20
After running, a .vec file is generated under D:/face/data. This file contains the number of positive samples, their width and height, and all of the sample image data.
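As a quick sanity check, the createsamples utility is documented (at least for the later opencv_createsamples) to display the samples stored in a .vec file when only -vec, -w, and -h are given; whether your particular build supports this is an assumption worth verifying:

"D:/program files/opencv/bin/createsamples.exe" -vec data/pos.vec -w 20 -h 20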

The command line parameters of the createsamples program are as follows:
-vec <vec_file_name>
Name of the output file containing the generated positive samples for training.
-img <image_file_name>
Source object image (for example, a company logo), used to synthesize distorted positive samples; see the example after this list.
-bg <background_file_name>
Background description file.
-num <number_of_samples>
Number of positive samples to generate.
-bgcolor <background_color>
Background color (grayscale images are currently assumed). The background color denotes the transparent color. Because of compression artifacts, a color tolerance can be specified with -bgthresh: all pixels between bgcolor - bgthresh and bgcolor + bgthresh are treated as transparent.
-bgthresh <background_color_threshold>
-inv
If specified, the colors will be inverted.
-randinv
If specified, the colors will be inverted at random.
-maxidev <max_intensity_deviation>
Maximum intensity deviation of the foreground sample pixels.
-maxxangle <max_x_rotation_angle>,
-maxyangle <max_y_rotation_angle>,
-maxzangle <max_z_rotation_angle>
Maximum rotation angles about the x, y, and z axes, in radians.
-show
If specified, each generated sample is displayed. Pressing Esc turns the display off while sample creation continues. This is a useful debugging option.
-w <sample_width>
Width of the output samples, in pixels.
-h <sample_height>
Height of the output samples, in pixels.
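For reference, here is a sketch (not from the original article) of how these parameters can be combined to synthesize distorted positive samples from a single source image; the file names logo.bmp and logo.vec are hypothetical:

"D:/program files/opencv/bin/createsamples.exe" -img logo.bmp -bg negdata/negdata.dat -vec data/logo.vec -num 100 -bgcolor 0 -bgthresh 8 -maxidev 40 -maxxangle 0.6 -maxyangle 0.6 -maxzangle 0.3 -w 20 -h 20

Each generated sample is the source image randomly rotated within the angle limits above, intensity-adjusted, and placed on a background taken from the background description file.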
This completes the first step, sample creation. Congratulations, you now know how to prepare the training samples. It took me a day to work this out on my own; it may take you only a few minutes.
III. Training the Classifier

After the samples are created, the classifier is trained. This is done with the haartraining program. Its source code is provided with OpenCV, and the executable is in the bin directory of the OpenCV installation.
The command line parameters of haartraining are as follows:
-data <dir_name>
Directory in which the trained classifier is stored.
-vec <vec_file_name>
Positive sample file name (created by the createsamples program or by other means).
-bg <background_file_name>
Background description file.
-npos <number_of_positive_samples>,
-nneg <number_of_negative_samples>
Number of positive/negative samples used to train each classifier stage. Reasonable values: npos = 7000, nneg = 3000.
-nstages <number_of_stages>
Number of stages to train.
-nsplits <number_of_splits>
Determines the weak classifiers used in the stage classifiers. If 1, simple stump classifiers are used; if 2 or more, CART classifiers with number_of_splits internal nodes are used.
-mem <memory_in_MB>
Available memory, in MB, for precomputed data. The more memory, the faster the training.
-sym (default),
-nonsym
Specifies whether the object class to be trained is vertically symmetric. Vertical symmetry speeds up training; for example, a frontal face is vertically symmetric.
-minhitrate <min_hit_rate>
Minimum hit rate required of each stage classifier. The overall hit rate is min_hit_rate raised to the power of number_of_stages (see the worked example after this list).
-maxfalsealarm <max_false_alarm_rate>
Maximum false alarm rate allowed for each stage classifier. The overall false alarm rate is max_false_alarm_rate raised to the power of number_of_stages.
-weighttrimming <weight_trimming>
Specifies whether and how much weight trimming should be used. A reasonable value is 0.90.
-eqw
-mode <BASIC (default) | CORE | ALL>
Selects the type of Haar feature set used for training. BASIC uses only upright features, while ALL uses the full set of upright and 45-degree rotated features.
-w <sample_width>,
-h <sample_height>
Size of the training samples, in pixels. This must be exactly the same size that was used when the training samples were created.
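As a worked example (the numbers here are the commonly cited haartraining defaults, not values from this article): with min_hit_rate = 0.995 and 20 stages, the overall hit rate is about 0.995^20 ≈ 0.90, while with max_false_alarm_rate = 0.5 the overall false alarm rate is about 0.5^20 ≈ 9.5 x 10^-7, which is why a per-stage false alarm rate as high as 0.5 still yields a very selective cascade.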
An example of training a classifier:
"D:/program files/opencv/bin/haartraining.exe" -data data/cascade -vec data/pos.vec -bg negdata/negdata.dat -npos 49 -nneg 49 -mem 200 -mode ALL -w 20 -h 20

After training, a number of subdirectories are generated under the data directory; together they constitute the trained classifier.

Congratulations, you have learned to train the classifier.

IV. Use the Trained Classifier for Target Detection

This step uses performance.exe. Its source code is provided with OpenCV, and the executable is in the bin directory of the OpenCV installation.

performance.exe -data data/cascade -info posdata/test.dat -w 20 -h 20 -rs 30

The performance command line parameters are as follows:

Usage: ./performance
-data <classifier_directory_name>
-info <collection_file_name>
[-maxSizeDiff <max_size_difference = 1.500000>]
[-maxPosDiff <max_position_difference = 0.300000>]
[-sf <scale_factor = 1.200000>]
[-ni]
[-nos <number_of_stages = -1>]
[-rs <roc_size = 40>]
[-w <sample_width = 24>]
[-h <sample_height = 24>]

You can also use the cvHaarDetectObjects function of OpenCV for detection:

CvSeq *faces = cvHaarDetectObjects( img, cascade, storage,
                                    1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
                                    cvSize( 40, 40 ) ); // detect faces
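To show this call in context, here is a minimal detection sketch using the OpenCV 1.x C API assumed by the snippet above; the cascade file name haarcascade.xml, the image name test.jpg, and the drawing/display code are illustrative assumptions, not part of the original article:

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void )
{
    /* file names below are placeholders; adjust them to your own paths */
    CvHaarClassifierCascade *cascade =
        (CvHaarClassifierCascade *) cvLoad( "haarcascade.xml", 0, 0, 0 );
    IplImage *img = cvLoadImage( "test.jpg", 1 );
    CvMemStorage *storage = cvCreateMemStorage( 0 );
    CvSeq *faces;
    int i;

    if( !cascade || !img )
    {
        printf( "failed to load the cascade or the image\n" );
        return -1;
    }

    /* scale step 1.1, at least 2 neighbors, Canny pruning, minimum window 40x40 */
    faces = cvHaarDetectObjects( img, cascade, storage,
                                 1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
                                 cvSize( 40, 40 ) );

    /* draw a rectangle around every detection */
    for( i = 0; i < (faces ? faces->total : 0); i++ )
    {
        CvRect *r = (CvRect *) cvGetSeqElem( faces, i );
        cvRectangle( img, cvPoint( r->x, r->y ),
                     cvPoint( r->x + r->width, r->y + r->height ),
                     CV_RGB( 255, 0, 0 ), 2, 8, 0 );
    }

    cvNamedWindow( "result", 1 );
    cvShowImage( "result", img );
    cvWaitKey( 0 );

    cvReleaseImage( &img );
    cvReleaseMemStorage( &storage );
    return 0;
}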
Note: some versions of OpenCV can convert the classifier stored in these directories directly into an XML file. In my own runs, however, the haartraining program never seemed to finish that step and no XML file was produced. I later found a haarconv program on the OpenCV Yahoo group that converts the trained classifier into an XML file. The exact cause remains to be investigated.
