Overview of facial feature point localization
Given a face image as input, the facial feature point localization task automatically locates key feature points such as the eyes, nose, mouth corners, eyebrows, and facial contour points, as shown in the figure below.
This technology has wide applications, such as automatic face recognition, facial expression recognition, and automatic synthesis of face animation. Because of variations in pose, expression, illumination, and occlusion, accurately locating every key feature point is difficult. A brief analysis shows that the task can actually be split into three sub-problems:
1. How to model the face appearance (the image, i.e., the input)
2. How to model the face shape (the output)
3. How to establish the association between the face appearance (model) and the face shape (model)
Past research has revolved around these three aspects. Typical methods for face shape modeling include deformable templates, point distribution models (e.g., the Active Shape Model), and graph models.
Face appearance modeling can be divided into global appearance modeling and local appearance modeling. Global appearance modeling, simply put, considers how to model the appearance information of the whole face; typical methods include Active Appearance Models (generative) and Boosted Appearance Models (discriminative). Local appearance modeling, in contrast, models the appearance information of local regions; it includes color models, projection models, and profile (edge) models.
Recently, cascaded shape regression models have achieved a significant breakthrough on the feature point localization task. They use regression models to directly learn the mapping function from face appearance to face shape (or to the parameters of a face shape model), thereby establishing the correspondence from appearance to shape. Such methods require no complex modeling of face shape or appearance; they are simple and efficient, and they achieve good localization results in both controlled scenarios (faces captured under laboratory conditions) and uncontrolled scenarios (web face images, etc.). In addition, facial feature point localization methods based on deep learning have obtained remarkable results. Combining deep learning with the shape regression framework can further improve the accuracy of the localization model and has become one of the mainstream approaches to feature point localization. Below I introduce the research progress of these two kinds of methods: cascaded shape regression and deep learning.

Cascaded linear regression model
The facial feature point localization problem can be viewed as learning a regression function $f$ that takes an image $I$ as input and outputs the feature point locations (the face shape) $\theta$:
$$\theta = f(I) = f_n\left(f_{n-1}\left(\dots f_1\left(\theta_0, I\right), I\right), I\right)$$

$$\theta_i = f_i\left(\theta_{i-1}, I\right), \quad i = 1, \dots, n$$
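As a concrete illustration of the composition above, here is a minimal sketch of the cascade at inference time. The names `cascade_predict`, `regressors`, and `theta0` are illustrative placeholders, not taken from any particular library.

```python
import numpy as np

def cascade_predict(image, regressors, theta0):
    """Apply the cascade: theta_i = f_i(theta_{i-1}, I) for i = 1..n."""
    theta = np.asarray(theta0, dtype=float).copy()  # initial shape theta_0 (e.g., a mean shape)
    for f_i in regressors:                          # each stage refines the previous estimate
        theta = f_i(theta, image)
    return theta                                    # final estimate theta_n
```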
The "cascade" means that the input of the current function $f_i$ depends on the output $\theta_{i-1}$ of the previous function $f_{i-1}$, and the learning goal of each $f_i$ is to approximate the true feature point positions $\hat{\theta}$, with $\theta_0$ as the initial shape. In general, $f_i$ does not regress the true position $\hat{\theta}$ directly, but instead regresses the difference between the current shape $\theta_{i-1}$ and the true position $\hat{\theta}$:
$$\Delta\theta_i = \hat{\theta} - \theta_{i-1}$$
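To make the increment-regression idea concrete, the following is a hedged sketch of training one cascade stage as a ridge-regularized linear regressor on shape-indexed features, in the spirit of cascaded linear regression. Here `extract_features` is a stand-in for whatever local descriptor a real system would use (raw patches, SIFT, etc.), and all names are illustrative rather than taken from the literature or any library.

```python
import numpy as np

def extract_features(image, theta):
    """Toy shape-indexed features: 5x5 grayscale patches around each landmark."""
    h, w = image.shape
    feats = []
    for x, y in np.asarray(theta).reshape(-1, 2):
        xi = int(np.clip(x, 2, w - 3))
        yi = int(np.clip(y, 2, h - 3))
        feats.append(image[yi - 2:yi + 3, xi - 2:xi + 3].ravel())
    return np.concatenate(feats)

def train_stage(images, current_shapes, true_shapes, ridge=1e-3):
    """Learn one f_i that regresses the increment hat(theta) - theta_{i-1}."""
    X = np.stack([np.append(extract_features(im, th), 1.0)      # features + bias term
                  for im, th in zip(images, current_shapes)])
    Y = np.stack([(np.asarray(t) - np.asarray(c)).ravel()       # target increments
                  for t, c in zip(true_shapes, current_shapes)])
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

    def f_i(theta, image):
        phi = np.append(extract_features(image, theta), 1.0)
        return theta + (phi @ W).reshape(np.asarray(theta).shape)  # theta_i = theta_{i-1} + delta
    return f_i
```

Training the full cascade then amounts to calling `train_stage` repeatedly, updating `current_shapes` with each learned stage before fitting the next one, and the resulting list of stage functions can be passed to `cascade_predict` above.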