Robotics-Robot Vision (features)


Continuing from the last post: the core task of robot vision is estimation, and its theoretical framework is projective geometry. The power of the homography transform was demonstrated in previous assignments. However, the whole estimation pipeline is premised on known pixel coordinates, in particular the pixel coordinates of corresponding points across multiple images.

Single-image processing methods are commonplace, so I will not repeat them. This post is about invariant point detection and invariant feature description. Since the robot is constantly moving, it may shoot the same object from different directions, from near or far, and at tilted angles. Because of the nature of the projective transformation itself, there is no way to guarantee that the object looks the same in the two images. So we need a feature extraction method (feature point detection) whose detections are invariant to rotation and scaling, plus a feature description method that is likewise invariant to rotation and scaling.

1. SIFT feature extraction

SIFT feature extraction can be divided into the following steps: 1. multi-scale convolution; 2. pyramid construction; 3. 3D non-maximum suppression.

The function of multi-scale convolution is to construct a series of images from far to near (a scale space). The pyramid is built by downsampling. See the previous blog for this part.
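To make this concrete, here is a minimal Python/OpenCV sketch of the idea: blur with increasing sigma within each octave, then downsample for the next octave. The octave/level counts and the base sigma are illustrative choices of mine, not SIFT's exact parameters.

```python
import cv2
import numpy as np

def build_pyramid(img, n_octaves=4, n_levels=5, sigma0=1.6):
    """Gaussian scale space: blur each octave with growing sigma, then halve."""
    k = 2 ** (1.0 / (n_levels - 1))               # sigma ratio between levels
    octaves = []
    base = img.astype(np.float32)
    for _ in range(n_octaves):
        levels = [cv2.GaussianBlur(base, (0, 0), sigma0 * k ** i)
                  for i in range(n_levels)]
        octaves.append(np.stack(levels))          # shape: (n_levels, H, W)
        base = cv2.pyrDown(base)                  # halve resolution for next octave
    return octaves
```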


For one pixel viewed across images of different scales, we can track how its "grayscale" changes: convolve with templates of different sigma and record the response (the post-convolution grayscale). The sigma that gives the maximum response can serve as the characteristic scale of that point. It is a bit like driving a mechanical structure with excitations of different frequencies: at some frequency it resonates, and recording that frequency represents the structure to some extent (a simple pendulum's frequency is determined by its length alone, so knowing f reproduces the system).

So if we find a suitable template (excitation mode) and then find the maximum response, we can obtain the intrinsic scale of each point in the picture. The same object, shot at different distances, will respond consistently at its intrinsic scale. This solves the scale-invariance problem.
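Here is a small sketch of that search, assuming sigma-normalized Laplacian-of-Gaussian responses as the "excitation"; the sigma range and the sigma-squared normalization are my assumptions for illustration, not the exact SIFT recipe.

```python
import cv2
import numpy as np

def intrinsic_scale(img, y, x, sigmas=np.linspace(1.0, 8.0, 15)):
    """Return the sigma whose normalized LoG response at (y, x) is strongest."""
    img = img.astype(np.float32)
    responses = []
    for s in sigmas:
        blurred = cv2.GaussianBlur(img, (0, 0), s)
        log = cv2.Laplacian(blurred, cv2.CV_32F)
        responses.append((s ** 2) * abs(log[y, x]))   # sigma^2 normalization
    return sigmas[int(np.argmax(responses))]          # scale of maximum response
```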

3D non-maximum suppression means that within the 3×3×3 neighborhood of a point (in space and scale), only the extremal response is kept as a feature point. Because the point is the strongest response in a spatial neighborhood, it is also robust to rotation: viewed from any direction, that point still responds strongest.
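A sketch of the 3×3×3 check, assuming the responses are stacked into a difference-of-Gaussians volume of shape (n_scales, H, W); the contrast threshold and border handling are simplified assumptions.

```python
import numpy as np

def is_keypoint(dog, s, y, x, threshold=0.03):
    """True if (s, y, x) is the unique extremum of its 3x3x3 neighborhood."""
    if not (0 < s < dog.shape[0] - 1 and 0 < y < dog.shape[1] - 1
            and 0 < x < dog.shape[2] - 1):
        return False                                  # skip volume borders
    v = dog[s, y, x]
    if abs(v) < threshold:
        return False                                  # reject weak responses
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    is_extremum = v == cube.max() or v == cube.min()
    return is_extremum and (cube == v).sum() == 1     # strict vs all 26 neighbors
```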

2. SIFT feature description

Feature extraction and feature description are actually different things; extraction ended with the previous section. Given two pictures of the same scene, the same feature points will certainly be found in both. The role of feature description is to prepare for matching, i.e., to decide which feature point in one image corresponds to which in the other, based on the local area information around each feature point. A feature descriptor is essentially a high-dimensional vector, and it too must be scale- and rotation-invariant.

A HOG-style feature (histogram of oriented gradients) is used here. Feature description can be divided into two steps: 1. determining the local dominant orientation; 2. computing the gradient histogram.

Using sigma to set the description window is a reasonable idea, because sigma describes the scale: feature point location + scale = the local information of the feature point. Within this neighborhood, the gradient directions of all pixels are collected, and the orientation histogram forms the feature vector. It is important to align the dominant orientation of the patch with the x-axis before collecting the statistics; this is what makes the descriptor rotation-invariant. As follows:

[Figure: feature points drawn with scale circles, dominant-orientation pointers, and histogram grids]

The yellow clock-like circle is a feature point plus its scale, and the pointer represents the dominant orientation (obtained by PCA) of the small patch. The green grid marks the bins of the histogram used to compute the feature vector.
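A rough sketch of the two steps on a single patch; the bin counts and the single flat histogram are simplifications of mine (real SIFT uses a 4×4 grid of 8-bin histograms, giving 128 dimensions).

```python
import cv2
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Dominant-orientation alignment + gradient orientation histogram."""
    patch = patch.astype(np.float32)
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                          # radians in [-pi, pi]
    # Step 1: dominant orientation = most heavily weighted coarse angle bin.
    hist, edges = np.histogram(ang, bins=36, range=(-np.pi, np.pi), weights=mag)
    main = edges[np.argmax(hist)]
    # Step 2: rotate angles so the dominant orientation maps to the x-axis,
    # then build the final histogram; this gives rotation invariance.
    ang = (ang - main + np.pi) % (2 * np.pi) - np.pi
    desc, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi), weights=mag)
    return desc / (np.linalg.norm(desc) + 1e-8)       # normalized feature vector
```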

Finally, by matching feature vectors we obtain the corresponding point pairs between image 1 and image 2, and the two images can be stitched together by computing the homography matrix. If the calibration information is known, 3D reconstruction can be performed.
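Putting it all together with OpenCV's built-in SIFT (available in opencv-python >= 4.4): detect and describe, match with a ratio test, estimate the homography with RANSAC, and warp one image onto the other. The file names and the 0.75 ratio are placeholder choices.

```python
import cv2
import numpy as np

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep matches clearly better than the runner-up.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp image 1 into image 2's frame and paste image 2 on top.
stitched = cv2.warpPerspective(img1, H, (img1.shape[1] + img2.shape[1],
                                         img2.shape[0]))
stitched[:img2.shape[0], :img2.shape[1]] = img2
```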
