I have recently been reading papers on face alignment. Most of the currently popular algorithms are built on the cascaded pose regression (CPR) framework [1], which is popular because it is simple and efficient. CPR has two parts, training and testing; I will introduce the testing process first.
The goal of face alignment is to estimate the face shape S, a vector S = (x_1, y_1, ..., x_K, y_K), where K is the number of landmarks. Because each landmark has two coordinates, joining all the coordinates together gives a vector of length 2K, the face shape. The CPR testing process, shown in the figure above, has T stages in total. Each stage t first extracts features f_t. These are usually shape-indexed features, but hand-crafted features such as HOG or SIFT, or learned features (see the latest CVPR papers), can also be used. The trained regressor R_t then estimates the update vector, the increment ΔS, which is added to the shape from the previous stage to get the new shape: S_t = S_{t-1} + ΔS. Iterating in this way gives the final shape. It is pretty straightforward: the initial shape is moved step by step toward the ground-truth shape.
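The testing loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not code from [1]: the function name `cpr_predict`, the callable regressors, and the `extract_features` callback are all assumptions made for the example.

```python
import numpy as np


def cpr_predict(image, initial_shape, regressors, extract_features):
    """Run a trained cascade: S_t = S_{t-1} + R_t(f_t(I, S_{t-1}))."""
    shape = np.asarray(initial_shape, dtype=float).copy()   # 2K-length vector
    for regressor in regressors:                    # one regressor per stage, T in total
        features = extract_features(image, shape)   # features f_t, indexed by the current shape
        delta_s = regressor(features)               # predicted increment ΔS
        shape = shape + delta_s                     # the only update is a vector addition
    return shape


# Toy usage; every name below is hypothetical. The "image" simply stores the
# target landmarks, and the "features" are the remaining offset to them.
target = np.array([10.0, 20.0, 30.0, 40.0])         # K = 2 landmarks
halfway = lambda features: 0.5 * features           # stand-in stage regressor
final_shape = cpr_predict(
    image=target,
    initial_shape=np.zeros(4),
    regressors=[halfway] * 3,                       # T = 3 stages
    extract_features=lambda image, shape: image - shape,
)
```

In this toy setup each stage closes half of the remaining gap, so after three stages the shape has covered 87.5% of the distance to the target. A real cascade would use image-based features and regressors learned from data.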
Next is the training process.
The input comes first: N is the number of samples, I_i is the image, S_i is the ground-truth shape, and the remaining parameter is the initial shape. How is the initial shape chosen? For each sample, the ground-truth shapes of 20 other faces are randomly selected from the training data and used as its initial shapes, so the number of training samples becomes the original number × 20. This data augmentation is intended to enlarge the training data and improve generalization ability.
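Here is a rough sketch of that augmentation step. The function name, the `(sample_index, initial_shape)` output format, and sampling without replacement are my own assumptions; the text only says that 20 other faces are picked at random.

```python
import random


def augment_with_initial_shapes(gt_shapes, n_init=20, seed=0):
    """Pair each sample with n_init initial shapes taken from other samples.

    Returns a list of (sample_index, initial_shape) pairs whose length is
    len(gt_shapes) * n_init.
    """
    rng = random.Random(seed)
    n = len(gt_shapes)
    if n_init > n - 1:
        raise ValueError("need at least n_init + 1 training samples")
    augmented = []
    for i in range(n):
        others = [j for j in range(n) if j != i]    # never reuse the sample's own shape
        for j in rng.sample(others, n_init):        # assumption: distinct picks
            augmented.append((i, gt_shapes[j]))
    return augmented
```

With 30 original samples and `n_init=20`, the returned list has 600 entries, matching the "original sample number × 20" rule.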
Training then begins. In each stage, ΔS is computed for every sample by subtracting the current shape from the ground-truth shape; in the first stage, the current shape is the initial shape. The features f_t are then extracted, and a regressor is learned by minimizing the error under a loss function. For how the regressor is built from the features and ΔS, you need to read the individual papers, such as SDM, LBF, and ERT; I will cover them in later blog posts.
Finally, the shape of the current stage is obtained by adding the ΔS that the regressor predicts from the features f_t to the shape from the previous stage, and training moves on to the next stage.
After T stages of training, all the regressors are saved for use at test time.
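To make the training loop concrete, here is a minimal sketch. The regressor is an assumption on my part: I use a plain linear least-squares fit per stage purely for illustration, while the actual regressor depends on the paper (SDM, LBF, ERT, and so on). The names `train_cpr` and `extract_features` are hypothetical.

```python
import numpy as np


def train_cpr(images, gt_shapes, init_shapes, extract_features, n_stages=5):
    """Train one regressor per stage; each stage fits ΔS = S_gt - S_current."""
    gt = np.asarray(gt_shapes, dtype=float)                 # (N, 2K)
    current = np.asarray(init_shapes, dtype=float).copy()   # (N, 2K)
    regressors = []
    for _ in range(n_stages):
        delta_s = gt - current                              # regression targets ΔS
        feats = np.stack([extract_features(img, s)          # features f_t, shape (N, D)
                          for img, s in zip(images, current)])
        X = np.hstack([feats, np.ones((len(feats), 1))])    # append a bias column
        # Assumption: a linear regressor fitted by least squares.
        W, *_ = np.linalg.lstsq(X, delta_s, rcond=None)     # (D + 1, 2K)
        regressors.append(W)                                # saved for testing
        current = current + X @ W                           # shape for the next stage
    return regressors, current
```

The function returns the T weight matrices together with the final training shapes. At test time, each stage would apply the same feature extraction and add `X @ W` to the current shape.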
As the above shows, the main operation in CPR is vector addition, which is both effective and computationally cheap. That is why CPR has been so widely used for face alignment in recent years.
References
[1] Dollár, P., Welinder, P., Perona, P.: 'Cascaded pose regression'. Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010