Face Recognition
Starting with OpenCV 2.4, a new class FaceRecognizer is available that makes it easy to run face recognition experiments. Its source code can be found under opencv/modules/contrib/doc/facerec/src in the OpenCV tree.
The currently supported algorithms are:
- Eigenfaces: createEigenFaceRecognizer()
- Fisherfaces: createFisherFaceRecognizer()
- Local Binary Patterns Histograms: createLBPHFaceRecognizer()
Automatic face recognition comes down to extracting meaningful features from an image, putting them into a useful representation, and classifying them in some way.
The Eigenfaces method takes a holistic approach to face recognition: a face image is a point in a high-dimensional image space, and the goal is to find a representation of it in a lower-dimensional space where classification becomes simple. The low-dimensional subspace is found with Principal Component Analysis (PCA), which identifies the axes with maximum variance. While this transformation is optimal from a reconstruction point of view, it does not take class labels into account. Imagine a situation where the variance comes from an external source, such as illumination: the axes with maximum variance do not necessarily contain any discriminative information at all, so classification becomes impossible. Therefore, a class-specific projection using Linear Discriminant Analysis (LDA) was proposed for face recognition. The basic idea is to minimize the variance within each class while maximizing the variance between classes.
In recent years, various local feature extraction methods have appeared. To avoid the high dimensionality of the input images, only local features are used to describe an image; the extracted features are (hopefully) more robust against partial occlusion, illumination changes, small sample sizes, and so on. Methods used for local feature extraction include Gabor wavelets, the Discrete Cosine Transform (DCT), and Local Binary Patterns (LBP). How best to preserve spatial information when extracting local features is still an open research question, because spatial information is potentially useful.
Local Binary Patterns Histograms
Since both Eigenfaces and Fisherfaces need to be retrained whenever new face data is introduced, I will focus here on the LBP features.
Eigenfaces and Fisherfaces take a holistic approach to face recognition [GM: they use all pixels directly]. You treat your data as high-dimensional vectors in image space. We all know high dimensionality is bad, so a lower-dimensional subspace is identified where (hopefully) useful information is preserved. Eigenfaces maximizes the total scatter, which can cause problems when the variance is generated by external conditions, because the principal components with maximum variance then carry no discriminative information. So to capture discriminative information, some form of discriminant analysis is applied instead: the LDA-based optimization used by Fisherfaces. The Fisherfaces method works well, at least in the constrained scenarios our model assumes.
Real life is not perfect. You cannot guarantee perfect lighting conditions in your images, or 10 photos of each person. So what if there is only one picture per person? Our covariance estimates for the subspace may then be completely wrong, and so will the recognition.
Some research concentrates on extracting local features from images. The idea is not to treat the whole image as a high-dimensional vector, but to describe an object by its local features only. Features extracted this way implicitly have low dimensionality. A good idea! But you soon notice that this kind of image representation suffers not only from illumination changes. Think of scale changes, translation and rotation in images: a local representation has to be at least somewhat robust against these things. Like SIFT, the LBP methodology has its roots in 2D texture analysis. The basic idea of LBP is to summarize the local structure of an image by comparing each pixel with its surrounding pixels. Take a pixel as the center and threshold its neighbors against it: if the intensity of a neighbor is greater than or equal to that of the center pixel, mark it as 1, otherwise as 0. You end up with a binary number for each pixel, such as 11001111. With 8 surrounding pixels you get 2^8 = 256 possible combinations, called Local Binary Patterns, sometimes referred to as LBP codes. The first LBP operator described in the literature actually used a fixed 3x3 neighborhood.
Algorithm description
A more formal description of the LBP operator can be given as:

    LBP(x_c, y_c) = sum_{n=0}^{7} 2^n * s(i_n - i_c)

where (x_c, y_c) is the center pixel with intensity i_c, i_n is the intensity of the n-th neighboring pixel, and s is the sign function:

    s(x) = 1 if x >= 0, otherwise 0
This description enables you to capture very fine-grained details in images. In fact, researchers were able to compete with state-of-the-art results in texture classification using it. Soon after the operator was published, it was noted that a fixed neighborhood fails to encode details at varying scales. The operator was therefore extended to use a variable neighborhood: an arbitrary number of neighbors is aligned on a circle of variable radius, so that neighborhoods at different scales can be captured.
For a given point (x_c, y_c), the position of its neighbor (x_p, y_p), p = 0, ..., P-1, can be calculated by:

    x_p = x_c + R * cos(2 * pi * p / P)
    y_p = y_c - R * sin(2 * pi * p / P)

where R is the radius of the circle and P is the number of sample points.
This operator is an extension of the original LBP operator, so it is sometimes called Extended LBP (also referred to as Circular LBP). If a point on the circle does not fall on integer image coordinates, the value is interpolated. Computer science has a bunch of clever interpolation schemes; the OpenCV implementation performs bilinear interpolation.
The LBP operator is robust against monotonic grayscale transformations, which can be verified by looking at the LBP images of artificially modified images.
What remains is how to incorporate spatial information into the face recognition model. The proposal is to divide the LBP image into m local regions and extract a histogram from each. The spatially enhanced feature vector is then obtained by concatenating (not merging) the local histograms. These histograms are called Local Binary Patterns Histograms.
Source code analysis: the LBPH class declaration
```cpp
class LBPH : public FaceRecognizer
{
private:
    int _grid_x;
    int _grid_y;
    int _radius;
    int _neighbors;
    double _threshold;

    vector<Mat> _histograms;
    Mat _labels;

    // Computes a LBPH model with images in src and
    // corresponding labels in labels, possibly preserving
    // old model data.
    void train(InputArrayOfArrays src, InputArray labels, bool preserveData);

public:
    using FaceRecognizer::save;
    using FaceRecognizer::load;

    // Initializes this LBPH Model. The current implementation is rather fixed
    // as it uses the Extended Local Binary Patterns per default.
    //
    // radius, neighbors are used in the local binary patterns creation.
    // grid_x, grid_y control the grid size of the spatial histograms.
    LBPH(int radius_=1, int neighbors_=8,
            int gridx=8, int gridy=8,
            double threshold = DBL_MAX) :
        _grid_x(gridx),
        _grid_y(gridy),
        _radius(radius_),
        _neighbors(neighbors_),
        _threshold(threshold) {}

    // Initializes and computes this LBPH Model. The current implementation is
    // rather fixed as it uses the Extended Local Binary Patterns per default.
    //
    // (radius=1), (neighbors=8) are used in the local binary patterns creation.
    // (grid_x=8), (grid_y=8) controls the grid size of the spatial histograms.
    LBPH(InputArrayOfArrays src,
            InputArray labels,
            int radius_=1, int neighbors_=8,
            int gridx=8, int gridy=8,
            double threshold = DBL_MAX) :
        _grid_x(gridx),
        _grid_y(gridy),
        _radius(radius_),
        _neighbors(neighbors_),
        _threshold(threshold) {
        train(src, labels);
    }

    ~LBPH() { }

    // Computes a LBPH model with images in src and
    // corresponding labels in labels.
    void train(InputArrayOfArrays src, InputArray labels);

    // Updates this LBPH model with images in src and
    // corresponding labels in labels.
    void update(InputArrayOfArrays src, InputArray labels);

    // Predicts the label of a query image in src.
    int predict(InputArray src) const;

    // Predicts the label and confidence for a given sample.
    void predict(InputArray _src, int &label, double &dist) const;

    // See FaceRecognizer::load.
    void load(const FileStorage& fs);

    // See FaceRecognizer::save.
    void save(FileStorage& fs) const;

    // Getter functions.
    int neighbors() const { return _neighbors; }
    int radius() const { return _radius; }
    int grid_x() const { return _grid_x; }
    int grid_y() const { return _grid_y; }

    AlgorithmInfo* info() const;
};
```
Building LBPH Instances
```cpp
// Declaration
Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius=1, int neighbors=8,
        int grid_x=8, int grid_y=8, double threshold=DBL_MAX);

// Definition
Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius, int neighbors,
        int grid_x, int grid_y, double threshold)
{
    return new LBPH(radius, neighbors, grid_x, grid_y, threshold);
}
```
Parameter description:
* radius: the radius used to build the circular LBP operator.
* neighbors: the number of sample points used to build the circular LBP operator; 8 is typical. More sample points mean a higher computational cost.
* grid_x: the number of cells the image is divided into horizontally; 8 is common. More cells mean a higher-dimensional final feature vector.
* grid_y: the number of cells the image is divided into vertically; 8 is common.
* threshold: the threshold applied during prediction. If the distance of the nearest neighbor is greater than this threshold, the predict method returns -1.
The training process of LBPH
The source code of the LBPH training function train is given below, followed by an analysis.
```cpp
void LBPH::train(InputArrayOfArrays _in_src, InputArray _in_labels, bool preserveData) {
    if(_in_src.kind() != _InputArray::STD_VECTOR_MAT && _in_src.kind() != _InputArray::STD_VECTOR_VECTOR) {
        string error_message = "The images are expected as InputArray::STD_VECTOR_MAT (a std::vector<Mat>) or _InputArray::STD_VECTOR_VECTOR (a std::vector< vector<...> >).";
        CV_Error(CV_StsBadArg, error_message);
    }
    if(_in_src.total() == 0) {
        string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
        CV_Error(CV_StsUnsupportedFormat, error_message);
    } else if(_in_labels.getMat().type() != CV_32SC1) {
        string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _in_labels.type());
        CV_Error(CV_StsUnsupportedFormat, error_message);
    }
    // get the vector of matrices
    vector<Mat> src;
    _in_src.getMatVector(src);
    // get the label matrix
    Mat labels = _in_labels.getMat();
    // check if data is well-aligned
    if(labels.total() != src.size()) {
        string error_message = format("The number of samples (src) must equal the number of labels (labels). Was len(samples)=%d, len(labels)=%d.", src.size(), _labels.total());
        CV_Error(CV_StsBadArg, error_message);
    }
    // if this model should be trained without preserving old data, delete old model data
    if(!preserveData) {
        _labels.release();
        _histograms.clear();
    }
    // append labels to _labels matrix
    for(size_t labelIdx = 0; labelIdx < labels.total(); labelIdx++) {
        _labels.push_back(labels.at<int>((int)labelIdx));
    }
    // store the spatial histograms of the original data
    for(size_t sampleIdx = 0; sampleIdx < src.size(); sampleIdx++) {
        // calculate lbp image
        Mat lbp_image = elbp(src[sampleIdx], _radius, _neighbors);
        // get spatial histogram from this lbp image
        Mat p = spatial_histogram(
                lbp_image, /* lbp_image */
                static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
                _grid_x, /* grid size x */
                _grid_y, /* grid size y */
                true);
        // add to templates
        _histograms.push_back(p);
    }
}
```
The training process consists of the following steps:
- First, perform the necessary error checks, then obtain the vector of face images and the vector of labels.
- Compute the LBP image of each sample.
- Compute the spatial histogram of each LBP image.
- Append each spatial histogram matrix to the private vector _histograms.
The LBP spatial histogram is generated as follows:
* The elbp function generates the LBP image.
* The spatial_histogram function divides the LBP image into cells and computes a histogram for each cell.
The prediction process of LBPH
The source code of the LBPH prediction function predict is given below, followed by an analysis.
```cpp
void LBPH::predict(InputArray _src, int &minClass, double &minDist) const {
    if(_histograms.empty()) {
        // throw error if no data (or simply return -1?)
        string error_message = "This LBPH model is not computed yet. Did you call the train method?";
        CV_Error(CV_StsBadArg, error_message);
    }
    Mat src = _src.getMat();
    // get the spatial histogram from input image
    Mat lbp_image = elbp(src, _radius, _neighbors);
    Mat query = spatial_histogram(
            lbp_image, /* lbp_image */
            static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
            _grid_x, /* grid size x */
            _grid_y, /* grid size y */
            true /* normed histograms */);
    // find 1-nearest neighbor
    minDist = DBL_MAX;
    minClass = -1;
    for(size_t sampleIdx = 0; sampleIdx < _histograms.size(); sampleIdx++) {
        double dist = compareHist(_histograms[sampleIdx], query, CV_COMP_CHISQR);
        if((dist < minDist) && (dist < _threshold)) {
            minDist = dist;
            minClass = _labels.at<int>((int)sampleIdx);
        }
    }
}
```
The prediction process is straightforward: first compute the LBP image of the query and generate its spatial histogram, then run a brute-force linear scan computing the histogram distance to every training sample, and finally output the class of the sample with the smallest distance.
The compareHist function
The cv::compareHist function evaluates how similar (or how different) two histograms are and returns a distance measure.
Four similarity measures are currently supported:
– CV_COMP_CORREL: correlation coefficient; identical histograms score 1, and larger values mean more similar.
– CV_COMP_CHISQR: chi-square distance; identical histograms score 0, and the range is [0, +inf), so smaller values mean more similar.
– CV_COMP_INTERSECT: histogram intersection; the range is [0, +inf), and larger values mean more similar.
– CV_COMP_BHATTACHARYYA: Bhattacharyya distance; identical histograms score 0, and for normalized histograms the range is [0, 1], so smaller values mean more similar.
Save and load functions
OpenCV has its own set of classes for handling file storage; the stored parameters can be accessed as key-value pairs.
```cpp
void LBPH::load(const FileStorage& fs) {
    fs["radius"] >> _radius;
    fs["neighbors"] >> _neighbors;
    fs["grid_x"] >> _grid_x;
    fs["grid_y"] >> _grid_y;
    // read matrices
    readFileNodeList(fs["histograms"], _histograms);
    fs["labels"] >> _labels;
}

// See FaceRecognizer::save.
void LBPH::save(FileStorage& fs) const {
    fs << "radius" << _radius;
    fs << "neighbors" << _neighbors;
    fs << "grid_x" << _grid_x;
    fs << "grid_y" << _grid_y;
    // write matrices
    writeFileNodeList(fs, "histograms", _histograms);
    fs << "labels" << _labels;
}
```
When reposting, please credit the author Jason Ding and the source:
GitHub home page (http://jasonding1354.github.io/)
CSDN Blog (http://blog.csdn.net/jasonding1354)
Jane Book homepage (http://www.jianshu.com/users/2bd9b48f6ea8/latest_articles)
"Computer vision" OPENCV face recognition Facerec Source code Analysis 2--LBPH Overview