Opencv2: geometric transformation of images, translation, mirroring, scaling, and rotation (1)

Last Update:2014-10-21 Source: Internet

Author: User


A geometric transformation of an image rearranges the spatial positions of its pixels without changing the image content. The main geometric transformations are translation, mirroring, scaling, and rotation. This article first introduces some basic concepts of geometric transformation, then implements translation, mirroring, scaling, and rotation with OpenCV 2, and finally introduces combined transformations (translation + scaling + rotation).

1. Basic concepts of geometric transformation

1.1 Coordinate mapping

A geometric transformation changes the spatial position of pixels and establishes a mapping between the pixels of the original image and the pixels of the transformed image. This mapping supports two calculations:

Given any pixel of the original image, calculate its coordinate in the transformed image.
Given any pixel of the transformed image, calculate its coordinate in the original image.

For the first calculation, given the coordinates of any pixel in the original image, the mapping yields the corresponding coordinates in the transformed image. This process of mapping input image coordinates to the output is called "forward mapping". Conversely, calculating the original image coordinates from known coordinates in the transformed image maps the output image to the input and is called "backward mapping". Forward mapping has shortcomings when used for geometric transformations. Two problems usually arise: incomplete mapping and overlapping mapping.

Incomplete mapping
The total number of pixels in the input image is smaller than in the output image, so some pixels in the output image have no corresponding pixel in the original image.

In the example, only four coordinates of the output image are mapped from the original image; the remaining 12 coordinates have no valid values.
Overlapping mapping
Under the mapping, several pixels of the input image are mapped to the same pixel of the output image.

In the example, the four pixels in the upper left corner of the input image are all mapped to the same pixel of the output image. Which value should that output pixel take?

Both problems can be solved with "backward mapping": the coordinates in the original image are calculated from the coordinates of the output image. Every pixel of the output image then finds exactly one corresponding pixel in the original image, so neither incomplete nor overlapping mapping occurs. For this reason, backward mapping is generally used for geometric transformations of images. As the examples show, the problems with forward mapping arise mainly because the total number of pixels changes, that is, the image size changes. For transformations that keep the image size, forward mapping works well.
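The difference between the two directions can be checked with a small sketch. It assumes a plain enlargement by an integer factor and uses standard C++ containers rather than cv::Mat so that it runs without OpenCV; the helper names countForwardHoles and countBackwardHoles are hypothetical and do not come from the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: enlarge a w x h image by an integer factor with forward
// mapping and return how many output pixels never receive a value.
int countForwardHoles(int w, int h, int factor)
{
    assert(w > 0 && h > 0 && factor > 0);
    const int outW = w * factor;
    const int outH = h * factor;
    std::vector<bool> written(static_cast<std::size_t>(outW) * outH, false);
    // Forward mapping: input pixel (x0, y0) writes to (x0 * factor, y0 * factor).
    for (int y0 = 0; y0 < h; ++y0)
        for (int x0 = 0; x0 < w; ++x0)
            written[static_cast<std::size_t>(y0 * factor) * outW + x0 * factor] = true;
    int holes = 0;
    for (bool b : written)
        if (!b) ++holes;
    return holes;
}

// Hypothetical helper: the same enlargement with backward mapping. Every
// output pixel looks up its source, so the count of holes stays at zero.
int countBackwardHoles(int w, int h, int factor)
{
    assert(w > 0 && h > 0 && factor > 0);
    const int outW = w * factor;
    const int outH = h * factor;
    int holes = 0;
    // Backward mapping: output pixel (x, y) reads from (x / factor, y / factor).
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x) {
            const int x0 = x / factor;
            const int y0 = y / factor;
            if (x0 >= w || y0 >= h) ++holes; // source outside the input image
        }
    return holes;
}
```

For a 2 x 2 input enlarged by a factor of 2, forward mapping fills only 4 of the 16 output pixels and leaves 12 empty, while backward mapping leaves none empty.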

1.2 Interpolation algorithms

In a digital image, pixel coordinates are discrete non-negative integers, but a transformation may produce floating-point coordinates. For example, shrinking an image by a factor of 2 maps the original coordinate (9, 9) to (4.5, 4.5), which is not a valid pixel coordinate. Interpolation algorithms handle these floating-point coordinates. Common choices include nearest-neighbor interpolation, bilinear interpolation, and higher-order methods such as bicubic interpolation. This article covers nearest-neighbor and bilinear interpolation and leaves the higher-order algorithms aside.

Nearest-neighbor interpolation
Also called zero-order interpolation, this is the simplest interpolation algorithm and gives the roughest results. The idea is simply rounding: the pixel value at a floating-point coordinate is taken from the input pixel closest to that point.

Rounding each component of (x, y) to the nearest integer gives the nearest-neighbor coordinate (u, v).
Nearest-neighbor interpolation involves almost no extra computation and is very fast. However, taking the value of the nearest pixel is crude and can produce mosaic (blocky) and jagged artifacts in the image.
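A minimal sketch of the rounding step, assuming a single-channel image stored row by row in a std::vector. The GrayImage struct and the sampleNearest function are hypothetical names for this illustration, and clamping to the image border is an added assumption so that out-of-range coordinates stay valid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical single-channel image stored row by row.
struct GrayImage {
    int rows;
    int cols;
    std::vector<unsigned char> data;
};

// Return the value of the input pixel closest to the floating-point (x, y).
unsigned char sampleNearest(const GrayImage& img, double x, double y)
{
    assert(img.rows > 0 && img.cols > 0);
    // Round each component to the nearest integer coordinate (u, v).
    int u = static_cast<int>(std::lround(x));
    int v = static_cast<int>(std::lround(y));
    // Assumption: clamp to the border instead of rejecting the coordinate.
    u = std::min(std::max(u, 0), img.cols - 1);
    v = std::min(std::max(v, 0), img.rows - 1);
    return img.data[v * img.cols + u];
}
```

Here x is the column coordinate and y the row coordinate, matching the (x, y) convention used in the rest of the article.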
Bilinear interpolation
Bilinear interpolation gives much better results than nearest-neighbor interpolation, at the cost of more computation. Its main idea is to approximate the pixel value at a floating-point coordinate. A floating-point coordinate is always surrounded by four integer coordinates, and blending the pixel values at those four coordinates in a certain proportion gives the pixel value at the floating-point coordinate. The blending proportion depends on the distance to the floating-point coordinate: the closer an integer coordinate is, the larger its weight.
Suppose we want the pixel value P at the coordinate (2.4, 3). The point lies between (2, 3) and (3, 3).
Let u and v be the proportions contributed by these two nearest integer-coordinate pixels.
P(2.4, 3) = u * P(2, 3) + v * P(3, 3). The point is 0.4 away from (2, 3) and 0.6 away from (3, 3), and each weight is one minus the distance, so u = 0.6 and v = 0.4.
The above interpolates along a single straight line and is called linear interpolation. Bilinear interpolation performs linear interpolation along the X and Y axes in turn.
The following example uses three linear interpolations to perform one bilinear interpolation at (2.4, 3.5). Let T1, T2, T3, and T4 be the pixel values at (2, 3), (3, 3), (2, 4), and (3, 4), with m = 0.4 and n = 0.5.

(2.4, 3) pixel value F1 = (1 - m) * T1 + m * T2
(2.4, 4) pixel value F2 = (1 - m) * T3 + m * T4
(2.4, 3.5) pixel value F = (1 - n) * F1 + n * F2
This gives the pixel value at the floating-point coordinate (2.4, 3.5).
In general, to compute the pixel value F at a floating-point coordinate, let the four surrounding pixel values be T1 (upper left), T2 (upper right), T3 (lower left), and T4 (lower right), let m be the horizontal offset of the floating-point coordinate from the upper-left pixel, and let n be the vertical offset. Then
F1 = (1 - m) * T1 + m * T2
F2 = (1 - m) * T3 + m * T4
F = (1 - n) * F1 + n * F2
This is the basic formula of bilinear interpolation. Computing each pixel value requires six floating-point multiplications (two per linear interpolation). In addition, because four neighboring pixels are blended, the result is blurred wherever those four values differ sharply, that is, along clear color boundaries.
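The three linear interpolations can be sketched as a small function. This is an illustration under stated assumptions rather than the author's implementation: GrayImage and sampleBilinear are hypothetical names, the image is single-channel and stored row by row, and coordinates are clamped to the image border. T1 to T4 are the upper-left, upper-right, lower-left, and lower-right neighbors, m and n are the horizontal and vertical offsets from the upper-left neighbor, and the nearer neighbor always receives the larger weight.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical single-channel image stored row by row.
struct GrayImage {
    int rows;
    int cols;
    std::vector<unsigned char> data;
};

// Approximate the pixel value at the floating-point coordinate (x, y).
double sampleBilinear(const GrayImage& img, double x, double y)
{
    assert(img.rows > 0 && img.cols > 0);
    // Assumption: clamp the coordinate so all four neighbors exist.
    x = std::min(std::max(x, 0.0), static_cast<double>(img.cols - 1));
    y = std::min(std::max(y, 0.0), static_cast<double>(img.rows - 1));
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const int x1 = std::min(x0 + 1, img.cols - 1);
    const int y1 = std::min(y0 + 1, img.rows - 1);
    const double m = x - x0; // horizontal offset from the upper-left neighbor
    const double n = y - y0; // vertical offset from the upper-left neighbor
    const double T1 = img.data[y0 * img.cols + x0]; // upper left
    const double T2 = img.data[y0 * img.cols + x1]; // upper right
    const double T3 = img.data[y1 * img.cols + x0]; // lower left
    const double T4 = img.data[y1 * img.cols + x1]; // lower right
    const double F1 = (1.0 - m) * T1 + m * T2; // interpolate along the top row
    const double F2 = (1.0 - m) * T3 + m * T4; // interpolate along the bottom row
    return (1.0 - n) * F1 + n * F2;            // interpolate between the rows
}
```

With the four pixel values 10, 20, 30, 40 and offsets m = 0.4, n = 0.5, the two row interpolations give 14 and 34, and the final value is 24.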

2. Image Translation

The image translation transformation adds a specified horizontal offset and vertical offset to the coordinates of every pixel in the image. There are two variants, depending on whether the image size changes.

In the first variant the output image is enlarged, so the translated image keeps all of the original image information. In the second variant the image size is unchanged, so the part shifted past the image border (the lower right corner for positive offsets) is truncated.

2.1 Principle of Translation

Let dx be the horizontal offset and dy the vertical offset, (x0, y0) a coordinate in the original image, and (x, y) the corresponding coordinate in the transformed image. The coordinate mapping of the translation is

x = x0 + dx
y = y0 + dy

This is a forward mapping, which maps the coordinates of the original image to the transformed image.
Its inverse transformation is

x0 = x - dx
y0 = y - dy

which maps the transformed image coordinates back to the original image. In geometric transformations of images, backward mapping is generally used.

2.2 OpenCV-based implementation

Implementing image translation is straightforward, so I will not go into much detail here.

The size of the translated image remains unchanged.

void GeometricTrans::translateTransform(cv::Mat const& src, cv::Mat& dst, int dx, int dy)
{
    // The loop below reads Vec3b pixels, so require an 8-bit three-channel image
    CV_Assert(src.type() == CV_8UC3);
    const int rows = src.rows;
    const int cols = src.cols;
    dst.create(rows, cols, src.type());
    // create() does not initialize memory; clear it so unmapped pixels are black
    dst.setTo(cv::Scalar::all(0));
    cv::Vec3b *p;
    for (int i = 0; i < rows; i++)
    {
        p = dst.ptr<cv::Vec3b>(i);
        for (int j = 0; j < cols; j++)
        {
            // Backward mapping: coordinate in the original image for output pixel (j, i)
            int x = j - dx;
            int y = i - dy;
            // Copy only when the mapped coordinate lies inside the original image
            if (x >= 0 && y >= 0 && x < cols && y < rows)
                p[j] = src.ptr<cv::Vec3b>(y)[x];
        }
    }
}

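The image size changes after translation.

void GeometricTrans::translateTransformSize(cv::Mat const& src, cv::Mat& dst, int dx, int dy)
{
    // The loop below reads Vec3b pixels, so require an 8-bit three-channel image
    CV_Assert(src.type() == CV_8UC3);
    const int rows = src.rows + abs(dy); // size of the output image
    const int cols = src.cols + abs(dx);
    dst.create(rows, cols, src.type());
    // create() does not initialize memory; clear it so unmapped pixels are black
    dst.setTo(cv::Scalar::all(0));
    // The canvas grows by |dx| x |dy|. For a negative offset the image stays at
    // the origin and the empty strip appears on the opposite side, so only
    // positive offsets shift the lookup. This keeps the whole image in view.
    const int offX = dx > 0 ? dx : 0;
    const int offY = dy > 0 ? dy : 0;
    cv::Vec3b *p;
    for (int i = 0; i < rows; i++)
    {
        p = dst.ptr<cv::Vec3b>(i);
        for (int j = 0; j < cols; j++)
        {
            int x = j - offX;
            int y = i - offY;
            if (x >= 0 && y >= 0 && x < src.cols && y < src.rows)
                p[j] = src.ptr<cv::Vec3b>(y)[x];
        }
    }
}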

PS: The code here takes a three-channel image as its example. The code for a single-channel image is similar but is not included.
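To illustrate that the same loop works for any pixel type, here is a hedged sketch that is independent of OpenCV. The function name translateImage, the flat row-by-row std::vector storage, and the fill parameter are assumptions made for this example. With cv::Mat, a single-channel version would read and write uchar pixels in the same way the three-channel code uses Vec3b.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical generic translation with unchanged image size. Pixel can be a
// single byte for grayscale images or a small struct for color images.
template <typename Pixel>
std::vector<Pixel> translateImage(const std::vector<Pixel>& src, int rows, int cols,
                                  int dx, int dy, Pixel fill = Pixel())
{
    assert(rows >= 0 && cols >= 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    // Start from the fill value so pixels without a source are well defined.
    std::vector<Pixel> dst(src.size(), fill);
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            // Backward mapping: where does output pixel (j, i) come from?
            const int x = j - dx;
            const int y = i - dy;
            if (x >= 0 && y >= 0 && x < cols && y < rows)
                dst[i * cols + j] = src[y * cols + x];
        }
    }
    return dst;
}
```

For example, shifting the 2 x 2 image {1, 2, 3, 4} by dx = 1 moves the left column to the right and fills the vacated column with the default value.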

3. Image Mirroring

There are two types of image mirroring: horizontal and vertical. Horizontal mirroring swaps the pixels about the vertical midline of the image, that is, it exchanges the left half and the right half. Vertical mirroring swaps the pixels about the horizontal midline, exchanging the upper half and the lower half.

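3.1 Principle of mirroring

Let the width of the image be width and its height be height. (x, y) is the transformed coordinate and (x0, y0) is the coordinate in the original image.

Horizontal mirroring
Forward mapping:
x = width - 1 - x0
y = y0
Its inverse transformation (backward mapping) is
x0 = width - 1 - x
y0 = y
Vertical mirroring
Forward mapping:
x = x0
y = height - 1 - y0
Its inverse transformation (backward mapping) is
x0 = x
y0 = height - 1 - y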

3.2 OpenCV-based implementation

Implementation of horizontal mirroring

void GeometricTrans::hMirrorTrans(const Mat &src, Mat &dst)
{
    CV_Assert(src.depth() == CV_8U);
    dst.create(src.rows, src.cols, src.type());
    int rows = src.rows;
    int cols = src.cols;
    switch (src.channels())
    {
    case 1:
    {
        // Single-channel image: copy each row in reverse column order
        const uchar *origal;
        uchar *p;
        for (int i = 0; i < rows; i++) {
            origal = src.ptr<uchar>(i);
            p = dst.ptr<uchar>(i);
            for (int j = 0; j < cols; j++) {
                p[j] = origal[cols - 1 - j];
            }
        }
        break;
    }
    case 3:
    {
        // Three-channel image: same mapping, one Vec3b per pixel
        const Vec3b *origal3;
        Vec3b *p3;
        for (int i = 0; i < rows; i++) {
            origal3 = src.ptr<Vec3b>(i);
            p3 = dst.ptr<Vec3b>(i);
            for (int j = 0; j < cols; j++) {
                p3[j] = origal3[cols - 1 - j];
            }
        }
        break;
    }
    default:
        break;
    }
}

Here three-channel and single-channel images are processed separately. Because the code for the two cases is so similar, the later functions handle only three-channel images.

The horizontal mirroring above traverses the entire image and processes each pixel according to the mapping. In fact, horizontal mirroring simply moves the left columns of the image to the right and the right columns to the left, so it can be performed one column at a time. The same holds for vertical mirroring, which can be performed one row at a time.
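The row-wise and column-wise view can be sketched with standard algorithms on a flat, row-by-row pixel buffer. The names mirrorHorizontal and mirrorVertical and the std::vector storage are assumptions for this illustration. In row-by-row storage, exchanging column j with column cols - 1 - j in every row amounts to reversing each row, while vertical mirroring swaps whole rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-place horizontal mirroring: reverse the pixels of each row,
// which exchanges column j with column cols - 1 - j.
void mirrorHorizontal(std::vector<unsigned char>& img, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows; ++i)
        std::reverse(img.begin() + i * cols, img.begin() + (i + 1) * cols);
}

// Hypothetical in-place vertical mirroring: swap row i with row rows - 1 - i.
void mirrorVertical(std::vector<unsigned char>& img, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows / 2; ++i)
        std::swap_ranges(img.begin() + i * cols,
                         img.begin() + (i + 1) * cols,
                         img.begin() + (rows - 1 - i) * cols);
}
```

OpenCV itself also provides cv::flip, where a flip code of 1 mirrors horizontally and a flip code of 0 mirrors vertically, which may be preferable to hand-written loops in production code.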

Vertical mirroring

void GeometricTrans::vMirrorTrans(const Mat &src, Mat &dst)
{
    CV_Assert(src.depth() == CV_8U);
    dst.create(src.rows, src.cols, src.type());
    int rows = src.rows;
    // Copy the rows of the source into the destination in reverse order
    for (int i = 0; i < rows; i++)
        src.row(rows - i - 1).copyTo(dst.row(i));
}

src.row(rows - i - 1).copyTo(dst.row(i));

The line above is the core of the transformation: row rows - i - 1 of the original image is copied to row i of the target image.

Writing up the theory was painful. The remaining geometric transformations will continue tomorrow: transpose, scaling, rotation, and combined transformations.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
