Robotics-Robot Vision (features)


Continuing from the last post: the core task of robot vision is estimation, and its theoretical framework is projective geometry. The power of the homography transform was demonstrated in previous assignments. However, the whole estimation pipeline is premised on known pixel coordinates, in particular the pixel coordinates of corresponding points across multiple images.

Single-image processing methods are commonplace, so I will not repeat them. This post is about invariant point detection and invariant feature description. Since the robot is constantly moving, it may shoot the same object from different directions, from near or far, and at tilted angles. Because of the nature of the projective transformation itself, there is no way to guarantee that the object looks the same in the two images. So we need a feature extraction method (feature point detection) whose detections are invariant to rotation and scaling, plus a feature description method that is likewise invariant to rotation and scaling.

1. SIFT feature extraction

SIFT feature extraction can be divided into the following steps: 1. multi-scale convolution; 2. pyramid construction; 3. 3D non-maximum suppression.

The function of multi-scale convolution is to construct a series of images from far to near (a scale space). The pyramid is built by downsampling. See the previous blog for this part.
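To make this concrete, here is a minimal Python/OpenCV sketch of the idea: blur with increasing sigma within each octave, then downsample for the next octave. The octave/level counts and the base sigma are illustrative choices of mine, not SIFT's exact parameters.

```python
import cv2
import numpy as np

def build_pyramid(img, n_octaves=4, n_levels=5, sigma0=1.6):
    """Gaussian scale space: blur each octave with growing sigma, then halve."""
    k = 2 ** (1.0 / (n_levels - 1))               # sigma ratio between levels
    octaves = []
    base = img.astype(np.float32)
    for _ in range(n_octaves):
        levels = [cv2.GaussianBlur(base, (0, 0), sigma0 * k ** i)
                  for i in range(n_levels)]
        octaves.append(np.stack(levels))          # shape: (n_levels, H, W)
        base = cv2.pyrDown(base)                  # halve resolution for next octave
    return octaves
```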


For one pixel viewed across images of different scales, we can track how its "grayscale" changes: convolve with templates of different sigma and record the response (the post-convolution grayscale). The sigma that gives the maximum response can serve as the characteristic scale of that point. It is a bit like driving a mechanical structure with excitations of different frequencies: at some frequency it resonates, and recording that frequency represents the structure to some extent (a simple pendulum's frequency is determined by its length alone, so knowing f reproduces the system).

So if we find a suitable template (excitation mode) and then find the maximum response, we can obtain the intrinsic scale of each point in the picture. The same object, shot at different distances, will respond consistently at its intrinsic scale. This solves the scale-invariance problem.
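Here is a small sketch of that search, assuming sigma-normalized Laplacian-of-Gaussian responses as the "excitation"; the sigma range and the sigma-squared normalization are my assumptions for illustration, not the exact SIFT recipe.

```python
import cv2
import numpy as np

def intrinsic_scale(img, y, x, sigmas=np.linspace(1.0, 8.0, 15)):
    """Return the sigma whose normalized LoG response at (y, x) is strongest."""
    img = img.astype(np.float32)
    responses = []
    for s in sigmas:
        blurred = cv2.GaussianBlur(img, (0, 0), s)
        log = cv2.Laplacian(blurred, cv2.CV_32F)
        responses.append((s ** 2) * abs(log[y, x]))   # sigma^2 normalization
    return sigmas[int(np.argmax(responses))]          # scale of maximum response
```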

3D non-maximum suppression means that within the 3×3×3 neighborhood of a point (in space and scale), only the extremal response is kept as a feature point. Because the point is the strongest response in a spatial neighborhood, it is also robust to rotation: viewed from any direction, that point still responds strongest.
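A sketch of the 3×3×3 check, assuming the responses are stacked into a difference-of-Gaussians volume of shape (n_scales, H, W); the contrast threshold and border handling are simplified assumptions.

```python
import numpy as np

def is_keypoint(dog, s, y, x, threshold=0.03):
    """True if (s, y, x) is the unique extremum of its 3x3x3 neighborhood."""
    if not (0 < s < dog.shape[0] - 1 and 0 < y < dog.shape[1] - 1
            and 0 < x < dog.shape[2] - 1):
        return False                                  # skip volume borders
    v = dog[s, y, x]
    if abs(v) < threshold:
        return False                                  # reject weak responses
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    is_extremum = v == cube.max() or v == cube.min()
    return is_extremum and (cube == v).sum() == 1     # strict vs all 26 neighbors
```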

2. SIFT feature description

Feature extraction and feature description are actually different things; extraction ended with the previous section. Given two pictures of the same scene, the same feature points will certainly be found in both. The role of feature description is to prepare for matching, i.e., to decide which feature point in one image corresponds to which in the other, based on the local area information around each feature point. A feature descriptor is essentially a high-dimensional vector, and it too must be scale- and rotation-invariant.

A HOG-style feature (histogram of oriented gradients) is used here. Feature description can be divided into two steps: 1. determining the local dominant orientation; 2. computing the gradient histogram.

Using sigma to set the description window is a reasonable idea, because sigma describes the scale: feature point location + scale = the local information of the feature point. Within this neighborhood, the gradient directions of all pixels are collected, and the orientation histogram forms the feature vector. It is important to align the dominant orientation of the patch with the x-axis before collecting the statistics; this is what makes the descriptor rotation-invariant. As follows:

[Figure: feature points drawn with scale circles, dominant-orientation pointers, and histogram grids]

The yellow clock-like circle is a feature point plus its scale, and the pointer represents the dominant orientation (obtained by PCA) of the small patch. The green grid marks the bins of the histogram used to compute the feature vector.
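A rough sketch of the two steps on a single patch; the bin counts and the single flat histogram are simplifications of mine (real SIFT uses a 4×4 grid of 8-bin histograms, giving 128 dimensions).

```python
import cv2
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Dominant-orientation alignment + gradient orientation histogram."""
    patch = patch.astype(np.float32)
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                          # radians in [-pi, pi]
    # Step 1: dominant orientation = most heavily weighted coarse angle bin.
    hist, edges = np.histogram(ang, bins=36, range=(-np.pi, np.pi), weights=mag)
    main = edges[np.argmax(hist)]
    # Step 2: rotate angles so the dominant orientation maps to the x-axis,
    # then build the final histogram; this gives rotation invariance.
    ang = (ang - main + np.pi) % (2 * np.pi) - np.pi
    desc, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi), weights=mag)
    return desc / (np.linalg.norm(desc) + 1e-8)       # normalized feature vector
```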

Finally, by matching feature vectors we obtain the corresponding point pairs between image 1 and image 2, and the two images can be stitched together by computing the homography matrix. If the calibration information is known, 3D reconstruction can be performed.
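Putting it all together with OpenCV's built-in SIFT (available in opencv-python >= 4.4): detect and describe, match with a ratio test, estimate the homography with RANSAC, and warp one image onto the other. The file names and the 0.75 ratio are placeholder choices.

```python
import cv2
import numpy as np

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep matches clearly better than the runner-up.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp image 1 into image 2's frame and paste image 2 on top.
stitched = cv2.warpPerspective(img1, H, (img1.shape[1] + img2.shape[1],
                                         img2.shape[0]))
stitched[:img2.shape[0], :img2.shape[1]] = img2
```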
