Robot vision is a research approach that lets a robot solve problems by looking at its surroundings. After a long period of development, it has produced a variety of methods for positioning, identification, detection and other tasks. It uses an ordinary camera as its tool and images as its processing medium to obtain information about the environment.
1. Camera model
The camera is the main tool of robot vision and the link between the vision system and the environment. The mathematical model of the camera is the pinhole model, the core of which is solving similar triangles. There are three points worth noting:
1.1 1/f = 1/a + 1/b
The reciprocal of the focal length f equals the sum of the reciprocals of the object distance a and the image distance b. This is the imaging (thin lens) equation; it must be satisfied for the image to be sharp.
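As a quick numeric check of the imaging equation, the sketch below solves it for the image distance. The function name and the 50 mm / 2 m values are illustrative assumptions, not taken from the text.

```python
def image_distance(f, a):
    """Solve 1/f = 1/a + 1/b for the image distance b (same units as f and a)."""
    if a == f:
        raise ValueError("object at the focal plane: the image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / a)

# Hypothetical example: 50 mm lens, object 2000 mm away.
b = image_distance(f=50.0, a=2000.0)
print(b)  # slightly more than 50 mm, so the sensor sits just behind the focal plane
```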
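1.2 x = f * X / Z
Here X is the size of the object in the scene, Z is its distance from the camera, and x is its size in the image. If you change the focal length f continuously and move the camera at the same time to change Z, you can keep the number of pixels x that the object occupies in the image unchanged. This is the dolly zoom principle. If a second object stands behind the first one (greater Z), its image size does change, so this principle can be used to adjust the relative proportions of the two objects in the photo.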
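A minimal sketch of the dolly zoom, assuming the pinhole projection with x the image size, X the object size and Z the distance. All numbers and function names are hypothetical and chosen only to show the effect.

```python
def projected_size(f, X, Z):
    """Pinhole projection: image size of an object of size X at depth Z."""
    return f * X / Z

def dolly_zoom_focal(f0, Z0, Z1):
    """Focal length that keeps a subject the same image size after moving from Z0 to Z1."""
    return f0 * Z1 / Z0

f0, Z0, Z1 = 35.0, 2.0, 4.0   # assumed: start at 35 mm, back away from 2 m to 4 m
gap = 8.0                     # assumed: background object 8 m behind the subject
size = 0.5                    # assumed: both objects are 0.5 m tall

f1 = dolly_zoom_focal(f0, Z0, Z1)

subject_before = projected_size(f0, size, Z0)
subject_after = projected_size(f1, size, Z1)
background_before = projected_size(f0, size, Z0 + gap)
background_after = projected_size(f1, size, Z1 + gap)

print(subject_before, subject_after)        # subject size is unchanged
print(background_before, background_after)  # background grows relative to the subject
```

The subject keeps its size because f / Z is held constant for it, while the background, at a different depth, sees a different ratio.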
1.3 The longer the focal length, the narrower the field of view and the larger distant objects appear in the image, so they can be photographed clearly. At the same aperture and subject distance, a longer focal length also gives a shallower depth of field.
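The relation between focal length and field of view can be sketched with the pinhole model. The formula below assumes a sensor of known width centered on the optical axis; the 36 mm default width is an assumption for illustration.

```python
import math

def horizontal_fov_deg(f_mm, sensor_width_mm=36.0):
    """Horizontal field of view of a pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * f_mm)))

for f in (18.0, 50.0, 200.0):
    print(f, horizontal_fov_deg(f))  # the angle shrinks as f grows
```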
2. Vanishing Point
The vanishing point is a special point in a photo. It is not a physical feature visible in the photo, and it does not exist as a finite point in reality. Because of the projective transformation, lines that are parallel in reality tend to intersect in the photo. If a set of parallel lines meets at a point in the image, that point corresponds to a point infinitely far away in reality. Its image coordinates are [x1 y1 1]. This point is called the vanishing point. The line from the camera's optical center to the vanishing point gives the direction of those parallel lines in the camera's coordinate system.
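A minimal sketch of finding a vanishing point, assuming two image segments that are known to be parallel in the scene. The pixel coordinates and the intrinsic matrix K are hypothetical; the direction is obtained by back-projecting the vanishing point through K.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two homogeneous image points."""
    return np.cross(p, q)

# Two segments (hypothetical pixels) that are parallel in the scene.
l1 = line_through(np.array([100.0, 400.0, 1.0]), np.array([300.0, 300.0, 1.0]))
l2 = line_through(np.array([100.0, 100.0, 1.0]), np.array([300.0, 200.0, 1.0]))

# Their intersection is the vanishing point. If the last component were zero,
# the lines would also be parallel in the image and v would lie at infinity.
v = np.cross(l1, l2)
v = v / v[2]

# Assumed intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Direction of the parallel lines in the camera frame (unit vector).
d = np.linalg.inv(K) @ v
d = d / np.linalg.norm(d)
print(v, d)
```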
In addition, the vanishing points of all directions lying in the same plane form a straight line in the image, called the horizon. This principle can be used to measure the height of a person standing on the ground. Note that the height of the horizon equals the camera height only when the camera is level.
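The height measurement can be sketched as follows, assuming a level camera with no roll (so the horizon is a horizontal image line), image y growing downward, and a known camera height. The vanishing points, pixel rows and the 1.5 m camera height are hypothetical.

```python
import numpy as np

# Two vanishing points of ground-plane directions (hypothetical pixels).
v1 = np.array([400.0, 250.0, 1.0])
v2 = np.array([-600.0, 250.0, 1.0])

# The horizon is the line joining them: a*x + b*y + c = 0.
horizon = np.cross(v1, v2)
y_horizon = -horizon[2] / horizon[1]  # valid because a == 0 for a level camera

def height_from_horizon(y_foot, y_head, y_horizon, camera_height):
    """Height of a person on the ground plane, by similar triangles.

    The horizon cuts every ground-standing object at the camera's height,
    so the pixel ratio equals the ratio of real heights.
    """
    return camera_height * (y_foot - y_head) / (y_foot - y_horizon)

person = height_from_horizon(y_foot=450.0, y_head=210.0,
                             y_horizon=y_horizon, camera_height=1.5)
print(y_horizon, person)
```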
2.1 Position Estimation
If we can find two vanishing points in a picture, and the directions they represent are perpendicular to each other (for example, the two directions of a grid), then we can estimate the attitude of the camera relative to the target (target pose estimation). Once the camera's rotation relative to the target is known, and if the camera's intrinsic matrix K and the projective transformation matrix H are both known, the distance from the camera to the target can be calculated, and from it the position of the robot. The key quantity is K^-1 * H, where H is the projective matrix.
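A minimal sketch of this step, assuming the standard planar relation H ~ K [r1 r2 t] (equal up to scale), where r1 and r2 are the first two columns of the rotation and t is the translation. The function name is hypothetical. With noisy data the recovered R is only approximately orthonormal and would need to be re-orthogonalized.

```python
import numpy as np

def pose_from_homography(K, H):
    """Recover rotation R and translation t of a planar target from K^-1 H."""
    M = np.linalg.inv(K) @ H
    scale = 1.0 / np.linalg.norm(M[:, 0])  # r1 must have unit length
    if M[2, 2] < 0:                         # keep the target in front of the camera
        scale = -scale
    r1 = scale * M[:, 0]
    r2 = scale * M[:, 1]
    r3 = np.cross(r1, r2)                   # third axis completes the rotation
    t = scale * M[:, 2]
    return np.column_stack([r1, r2, r3]), t

# Synthetic check with assumed intrinsics and an assumed pose.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(np.radians(30.0)), np.sin(np.radians(30.0))
R_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, c, -s],
                   [0.0, s, c]])
t_true = np.array([0.1, -0.2, 2.0])
H = 3.7 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])  # arbitrary scale

R, t = pose_from_homography(K, H)
print(np.linalg.norm(t))  # distance from the camera to the target origin
```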
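2.2 Point-Line Duality
In homogeneous coordinates, points and lines are dual. The cross product of two points gives the line through them, and the cross product of two lines gives their point of intersection:
p1 x p2 = l12
l12 x l23 = p2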
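The two identities can be checked numerically. The three points below are arbitrary illustrative values.

```python
import numpy as np

p1 = np.array([0.0, 0.0, 1.0])
p2 = np.array([2.0, 2.0, 1.0])
p3 = np.array([4.0, 0.0, 1.0])

l12 = np.cross(p1, p2)  # line through p1 and p2
l23 = np.cross(p2, p3)  # line through p2 and p3

# The two lines share p2, so their cross product recovers it (up to scale).
p = np.cross(l12, l23)
p = p / p[2]

# A point lies on a line exactly when their dot product is zero.
print(p, l12 @ p1, l12 @ p2)
```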
3. Projective transformation
A projective transformation maps one plane in space to another plane. It is expressed with homogeneous coordinates and an arbitrary invertible 3x3 matrix H. In short, it can be written as a = H * b, where a and b are homogeneous coordinates of the form [x y 1] and the equality holds up to a constant factor. One of the major uses of a projective transformation is to warp one shape into another. Examples include inserting a billboard into a photo, overlaying a billboard on a sports broadcast, or showing a flag graphic that pops up when a swimmer finishes. Projective transformation is also the foundation of augmented reality technology.
The core of a projective transformation is finding H. For the common solution method, see a machine vision textbook.
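A minimal sketch of applying a projective transformation to 2D points: lift them to homogeneous coordinates, multiply by H, then divide by the last component. The matrix H below is a made-up example, and the function name is hypothetical.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through the 3x3 matrix H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])  # [x y] -> [x y 1]
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]               # divide out the scale

# Hypothetical H and the corners of a unit square.
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 20.0],
              [0.001, 0.0, 1.0]])
square = [[0, 0], [0, 1], [1, 1], [1, 0]]

warped = apply_homography(H, square)
print(warped)
```

Because of the final division, multiplying H by any nonzero constant gives the same result, which is why H has only 8 degrees of freedom.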
Suppose that the four corners of a flat picture are A (0,0,1), B (0,1,1), C (1,1,1), D (1,0,1). These four points need to be projected onto an image region whose four corner pixel coordinates a, b, c, d are known. In addition, from the pixel locations we can calculate two interesting points, v1 (x1, y1, z1) and v2 (x2, y2, z2). Both are image points: the vanishing points of the two edge directions of the picture. Their corresponding actual coordinates are (0,1,0) and (1,0,0).
Together with A, this gives three very convenient actual points: (1,0,0), (0,1,0), (0,0,1). Stacked together they happen to form the identity matrix. Applying the projective transformation to these three points gives their pixel coordinates, which are also known. Therefore the first column of H must equal beta * v2, the second column must equal alpha * v1, and the third column must equal gamma * a, where a is the pixel coordinate of A. Alpha, beta and gamma are constants, because the result of a projective transformation equals the homogeneous coordinate only up to a constant factor.
If alpha, beta and gamma can be solved, we have the projective transformation matrix. Substituting the pixel coordinates c of point C = (1,1,1) gives lambda * c = beta * v2 + alpha * v1 + gamma * a. That is 3 equations in 4 unknowns, since a scale factor lambda has been introduced. But lambda has no effect on the result: dividing through by it, we only need to treat alpha/lambda, beta/lambda and gamma/lambda as the unknowns to recover the projective matrix.
Therefore, the first two columns of the projective transformation matrix represent the two vanishing points (v2 and v1 in the notation above), and the cross product of the first and second columns gives the equation of the horizon line (by point-line duality).
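The construction above can be sketched in a few lines, assuming a, b, c, d are the homogeneous pixel coordinates of A, B, C, D and setting lambda = 1. The function name and the corner pixels are hypothetical. The vanishing points are obtained as intersections of opposite edges of the imaged quadrilateral.

```python
import numpy as np

def homography_from_square(a, b, c, d):
    """H mapping A(0,0), B(0,1), C(1,1), D(1,0) to pixels a, b, c, d."""
    v1 = np.cross(np.cross(a, b), np.cross(d, c))  # image of direction (0,1,0)
    v2 = np.cross(np.cross(a, d), np.cross(b, c))  # image of direction (1,0,0)
    basis = np.column_stack([v2, v1, a])
    # Solve c = beta*v2 + alpha*v1 + gamma*a (lambda absorbed into the unknowns).
    coeffs = np.linalg.solve(basis, c)
    return basis * coeffs                          # scale each column by its constant

# Hypothetical corner pixels of the target region.
a = np.array([100.0, 100.0, 1.0])
b = np.array([120.0, 300.0, 1.0])
c = np.array([330.0, 280.0, 1.0])
d = np.array([300.0, 90.0, 1.0])

H = homography_from_square(a, b, c, d)

# Check: every corner of the unit square should land on its pixel.
for P, p in [((0, 0, 1), a), ((0, 1, 1), b), ((1, 1, 1), c), ((1, 0, 1), d)]:
    q = H @ np.array(P, dtype=float)
    print(q / q[2], p)
```

The sketch assumes the four pixels are in general position (no three on a line). Otherwise the basis matrix is singular and np.linalg.solve raises an error.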
Robotics-Robot Vision (Basic)