Pose (computer vision) And Pose estimation

From Wikipedia, the free encyclopedia

In computer vision and robotics, a typical task is to identify specific objects in an image and to determine each object's position and orientation relative to some coordinate system. This information can then be used, for example, to allow a robot to manipulate an object or to avoid moving into it. The combination of position and orientation is referred to as the pose of an object, although the term is sometimes used to describe the orientation alone. Exterior orientation and translation are also used as synonyms for pose.

The image data from which the pose of an object is determined can be a single image, a stereo image pair, or an image sequence in which, typically, the camera moves at a known speed. The objects considered can be rather general, including living beings or body parts such as a head or hands. The methods used to determine the pose of an object, however, are usually specific to a class of objects and cannot generally be expected to work well for other types of objects.

The pose can be described by a rotation and a translation that bring the object from a reference pose to the observed pose. The rotation can be represented in different ways, e.g., as a rotation matrix or a quaternion.
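
As a minimal sketch of this description, the Python snippet below converts a quaternion to a rotation matrix and applies the resulting pose to a point. The function names quat_to_matrix and apply_pose, the (w, x, y, z) component order, and the example values are assumptions chosen for illustration; they do not come from any particular library.

```python
import math

def quat_to_matrix(w, x, y, z):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    # Normalize first so that slightly non-unit input still gives a rotation.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def apply_pose(R, t, p):
    """Map point p from the reference pose to the observed pose: R p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# Example pose (illustrative values): a 90 degree rotation about the z axis
# followed by a translation of (1, 2, 3).
half_angle = math.pi / 4
R = quat_to_matrix(math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
t = [1.0, 2.0, 3.0]

observed = apply_pose(R, t, [1.0, 0.0, 0.0])
print(observed)  # approximately [1.0, 3.0, 3.0]
```

The rotation matrix and the quaternion encode the same rotation; the quaternion needs four numbers instead of nine, which is one reason both representations are in common use.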

Pose estimation
The specific task of determining the pose of an object in an image (or in stereo images or an image sequence) is referred to as pose estimation. The pose estimation problem can be solved in different ways depending on the image sensor configuration and the choice of methodology. Three classes of methodologies can be distinguished:

  • Analytic or geometric methods: Given that the image sensor (camera) is calibrated, the mapping from 3D points in the scene to 2D points in the image is known. If the geometry of the object is also known, the projected image of the object on the camera image is a well-known function of the object's pose. Once a set of control points on the object, typically corners or other feature points, has been identified, the pose transformation can be solved from a set of equations that relate the 3D coordinates of the points to their 2D image coordinates.
  • Genetic algorithm methods: If the pose of an object does not have to be computed in real time, a genetic algorithm may be used. This approach is robust, especially when the images are not perfectly calibrated. In this case, the pose is the genetic representation, and the error between the projection of the object's control points and the image is the fitness function.
  • Learning-based methods: These methods use an artificial learning-based system that learns the mapping from 2D image features to the pose transformation. In short, a sufficiently large set of images of the object, in different poses, must be presented to the system during a learning phase. Once the learning phase is completed, the system should be able to estimate the object's pose given an image of the object.
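
To make the geometric and genetic ideas more concrete, the sketch below projects known 3D control points through a pinhole camera model and uses the reprojection error as the fitness function of a very small evolutionary search. It is an illustration under strong assumptions, not a reference implementation: the pose is reduced to four parameters (one rotation angle about the camera's y axis plus a translation), and the intrinsics f, cx, and cy, the control points, and all function names are hypothetical.

```python
import math
import random

def project(pose, point, f=800.0, cx=320.0, cy=240.0):
    """Pinhole projection of a 3D object point under pose (yaw, tx, ty, tz).

    f, cx, cy are assumed intrinsics of a calibrated camera (made-up values).
    Returns None when the point is not in front of the camera.
    """
    yaw, tx, ty, tz = pose
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = point
    # Rotate about the y axis, then translate into the camera frame.
    xc = c * x + s * z + tx
    yc = y + ty
    zc = -s * x + c * z + tz
    if zc <= 1e-6:
        return None
    return (f * xc / zc + cx, f * yc / zc + cy)

def reprojection_error(pose, object_points, image_points):
    """Fitness: sum of squared pixel distances between projected and observed points."""
    total = 0.0
    for point, (u, v) in zip(object_points, image_points):
        projected = project(pose, point)
        if projected is None:
            return math.inf
        total += (projected[0] - u) ** 2 + (projected[1] - v) ** 2
    return total

def evolve(object_points, image_points, generations=200, pop_size=40, seed=0):
    """Tiny elitist evolutionary search over the four pose parameters."""
    rng = random.Random(seed)
    fitness = lambda pose: reprojection_error(pose, object_points, image_points)
    population = [
        (rng.uniform(-math.pi, math.pi), rng.uniform(-1, 1),
         rng.uniform(-1, 1), rng.uniform(2, 10))
        for _ in range(pop_size)
    ]
    sigma = 0.2  # mutation strength, reduced a little every generation
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 4]  # selection: keep the best quarter
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover picks each gene from one parent; mutation adds Gaussian noise.
            children.append(tuple(rng.choice(pair) + rng.gauss(0.0, sigma)
                                  for pair in zip(a, b)))
        population = parents + children
        sigma *= 0.98
    return min(population, key=fitness)

# Hypothetical control points on the object (object coordinates) and a
# made-up ground-truth pose used only to synthesize the observed image points.
object_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.3, 0.0),
                 (0.0, 0.0, 0.4), (0.5, 0.3, 0.1), (-0.2, 0.1, 0.3)]
true_pose = (0.3, 0.1, -0.2, 5.0)
image_points = [project(true_pose, p) for p in object_points]

best = evolve(object_points, image_points)
print("estimated pose:", best)
print("reprojection error:", reprojection_error(best, object_points, image_points))
```

The same reprojection error is what a geometric solver drives to zero when it solves the 3D-to-2D equations directly; the evolutionary search only needs to evaluate it, which is why it tolerates imperfect calibration but is slower. A full pose would use three rotation parameters instead of the single angle assumed here.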
