Recognition Algorithm Overview:
SIFT/SURF work on grayscale images and proceed in three steps:
1. First, build an image pyramid to form a three-dimensional (x, y, scale) image space. Compute a detector response for each layer (SURF uses the determinant of the Hessian matrix, SIFT uses the difference of Gaussians), then apply non-maximum suppression (NMS) against the 26 points around each candidate extremum: 8 in the same layer and 9 in each of the two adjacent layers. This gives a rough feature point. Quadratic interpolation then refines the position and layer (scale) of the feature point, which is what makes the feature scale invariant.
2. Select a neighborhood around the feature point, sized according to its scale, and find the main direction. SIFT computes the gradient direction of all points in a square region, builds a histogram of those directions, and takes the highest peak as the main direction; any other peak that reaches at least 80% of the highest peak produces an additional feature point with that direction. SURF uses a circular region and slides a sector window around it to find the main direction of the feature point. Aligning the descriptor with the main direction gives rotation invariance.
3. A local coordinate frame is set up for each feature point, with its axes aligned to the main direction. SIFT takes a square region sized according to the scale of the feature point, divides it into 16 cells (4 x 4), and computes the weight of each of eight gradient directions in every cell. This gives a 128-dimensional feature vector (16 x 8) per feature point, and normalizing the vector makes it invariant to changes in image intensity. SURF also divides its region into 16 cells (4 x 4) and computes the sums of dx, dy, |dx|, and |dy| in each cell, which gives a 64-dimensional vector (16 x 4). After normalization, it is likewise invariant to contrast and intensity changes.
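The three steps above can be sketched in a few lines of NumPy. This is a simplified illustration rather than the reference SIFT or SURF implementation: the function names, the 36-bin orientation histogram, and the 16 x 16 patch size are assumptions made for the example, and the orientation step keeps every bin above the 80% ratio instead of searching for true local peaks.

```python
import numpy as np

def is_scale_space_extremum(stack, s, y, x):
    """Step 1: compare stack[s, y, x] with its 26 neighbors in (scale, y, x)."""
    cube = stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = stack[s, y, x]
    neighbors = np.delete(cube.ravel(), 13)  # index 13 is the center of the 3x3x3 cube
    return bool(center > neighbors.max() or center < neighbors.min())

def quantize(angle_deg, n_bins):
    """Map angles in degrees to integer histogram bins in [0, n_bins)."""
    return np.floor((angle_deg % 360.0) / (360.0 / n_bins)).astype(int) % n_bins

def dominant_orientations(magnitude, angle_deg, n_bins=36, peak_ratio=0.8):
    """Step 2 (simplified): magnitude-weighted direction histogram.

    Returns the bin centers whose weight is at least peak_ratio * the maximum.
    """
    bins = quantize(angle_deg, n_bins)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    keep = np.flatnonzero(hist >= peak_ratio * hist.max())
    return (keep + 0.5) * (360.0 / n_bins)

def sift_like_descriptor(magnitude, angle_deg, grid=4, n_bins=8):
    """Step 3: grid x grid cells, n_bins directions each, L2-normalized.

    With the defaults this yields 4 * 4 * 8 = 128 values.
    """
    height, width = magnitude.shape
    cell_h, cell_w = height // grid, width // grid
    bins = quantize(angle_deg, n_bins)
    parts = []
    for i in range(grid):
        for j in range(grid):
            rows = slice(i * cell_h, (i + 1) * cell_h)
            cols = slice(j * cell_w, (j + 1) * cell_w)
            parts.append(np.bincount(bins[rows, cols].ravel(),
                                     weights=magnitude[rows, cols].ravel(),
                                     minlength=n_bins))
    desc = np.concatenate(parts)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Because the descriptor is divided by its own norm, multiplying every gradient magnitude by a constant (a uniform contrast change) leaves the result unchanged, which is the invariance described in step 3.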
Haar features also work on grayscale images.
First, a classifier is trained with pattern-recognition methods on a large number of images of objects that show clear Haar (rectangle) features. The classifier is a cascade: each stage passes its surviving candidate objects on to the next stage, and every stage is trained to roughly the same recognition rate. Each stage's sub-classifier is made up of many Haar features, which are computed from the integral image and stored together with their positions. The features come in horizontal, vertical, and tilted forms, and each feature has a threshold and two branch values. Each sub-classifier also has an overall stage threshold.

When recognizing an object, we again compute the integral image to prepare for the Haar feature calculations. We then traverse the entire image with a window of the same size as the object window used in training, and gradually enlarge the window and traverse again to search for the object. Each time the window moves to a new position, the Haar features inside it are calculated. Each weighted feature value is compared with that feature's threshold in the classifier to select the left or right branch value. The branch values of one stage are accumulated and compared with that stage's threshold, and the window enters the next round of screening only when the sum exceeds the threshold. A window that passes every stage of the classifier contains the object with high probability.
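The detection pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the format of any particular library: the cascade is a hypothetical list of dictionaries with `features`, `threshold`, `left`, and `right` keys, each feature is a list of weighted rectangles, and the window is scanned at a single scale (the progressive enlargement of the window is left out).

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a leading zero row and column."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w x h rectangle at (x, y), using four lookups."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def haar_value(ii, rects, ox, oy):
    """Weighted sum of rectangle sums; rects are (x, y, w, h, weight)
    given relative to the window origin (ox, oy)."""
    return sum(wt * rect_sum(ii, ox + x, oy + y, w, h)
               for x, y, w, h, wt in rects)

def window_passes(ii, cascade, ox, oy):
    """Run one window through every stage; reject at the first failed stage."""
    for stage in cascade:
        total = 0.0
        for feat in stage["features"]:
            value = haar_value(ii, feat["rects"], ox, oy)
            # The feature threshold selects the left or right branch value.
            total += feat["left"] if value < feat["threshold"] else feat["right"]
        if total < stage["threshold"]:
            return False
    return True

def detect(gray, cascade, win):
    """Slide a win x win window over the image at a single scale."""
    ii = integral_image(gray)
    hits = []
    for oy in range(gray.shape[0] - win + 1):
        for ox in range(gray.shape[1] - win + 1):
            if window_passes(ii, cascade, ox, oy):
                hits.append((ox, oy))
    return hits
```

As a usage example, a single two-rectangle feature such as `[(0, 0, 2, 4, -1), (2, 0, 2, 4, +1)]` responds to a dark-left, bright-right pattern in a 4 x 4 window. Rejecting most windows in the early stages is what keeps the cascade fast, since only a few candidates ever reach the later stages.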
The generalized Hough transform also works on grayscale images.
It uses the contour as the feature, combines it with gradient information, and identifies the object by voting. I will not go into details here.
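For readers who want a concrete picture of the voting step, here is a minimal sketch. It assumes the edge points and their gradient angles have already been extracted, handles translation only (recovering rotation or scale would need extra accumulator dimensions), and uses hypothetical function names and a 36-bin angle quantization chosen for the example.

```python
import numpy as np
from collections import defaultdict

def angle_bin(angle_deg, n_bins):
    """Quantize a gradient angle in degrees to a bin index."""
    return int((angle_deg % 360.0) / (360.0 / n_bins)) % n_bins

def build_r_table(edge_points, gradient_angles, reference, n_bins=36):
    """Template side: for each gradient bin, store the offsets that lead
    from a contour point to the reference point of the shape."""
    table = defaultdict(list)
    for (x, y), ang in zip(edge_points, gradient_angles):
        table[angle_bin(ang, n_bins)].append((reference[0] - x, reference[1] - y))
    return table

def vote(edge_points, gradient_angles, table, shape, n_bins=36):
    """Scene side: every edge point votes for the reference positions that
    are consistent with its gradient direction. shape is (height, width)."""
    acc = np.zeros(shape, dtype=np.int32)
    for (x, y), ang in zip(edge_points, gradient_angles):
        for dx, dy in table.get(angle_bin(ang, n_bins), ()):
            cx, cy = x + dx, y + dy
            if 0 <= cx < shape[1] and 0 <= cy < shape[0]:
                acc[cy, cx] += 1
    return acc
```

The accumulator cell with the most votes is the most likely location of the reference point, and the vote count itself can be thresholded to decide whether the object is present at all.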
Comparison of the features, their similarities and differences, and their applicability:
All three algorithms are feature methods based on intensity (grayscale) information. SIFT/SURF features are strongly invariant to orientation and brightness, which makes them suitable for objects under rigid transformations. The Haar feature method has a touch of artificial intelligence about it, since its classifier is learned from examples. It is best suited to objects with a clear and stable structure, such as human faces; as long as the structure stays relatively fixed, the object can still be identified under non-linear deformation such as distortion. The generalized Hough transform performs exact matching, and it can recover parameter information such as the location and direction of the object.

The first two methods both obtain local features first and then match them one by one, but they calculate the local features differently. SIFT/SURF features are complex and stable. The Haar method is relatively simple and leans on statistics to form its features, which also gives it a certain fuzzy elasticity. The generalized Hough transform uses a global feature, the contour gradient. It can also be viewed the other way around: the position and gradient of every point on the contour is a feature, and every point contributes to recognition. An intuitive vote then decides, based on the number of votes, whether the object has been identified.