Linear Interpolation Algorithm for Image Scaling

Source: Internet
Author: User
What is bilinear interpolation?

For example:
Original numerical sequence: 0, 10, 20, 30, 40
Linear interpolation: 0, 5, 10, 15, 20, 25, 30, 35, 40
That is, the change (increase or decrease) is linear: a straight line can be drawn through the values on a coordinate chart.
In digital camera technology, these values can represent the color or gray level of different pixels in a photo.

For ease of understanding, first consider linear interpolation in one dimension.
For a sequence C, assume that the values change linearly between C[a] and C[a+1] (where a is an integer). Then, for a floating-point x with a <= x < a+1:

C(x) = C[a+1] * (x - a) + C[a] * (1 + a - x)

Easy to understand, isn't it?
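
In code, the one-dimensional case is just a weighted average of the two neighboring samples. A minimal C++ sketch (the function name and the assumption that a+1 stays within the array are mine, for illustration):

    #include <cmath>

    // Linear interpolation of a sampled sequence c[] at fractional index x,
    // where a = floor(x) and a <= x < a + 1 (illustrative sketch).
    double lerp1d(const double c[], double x)
    {
        int a = (int)std::floor(x);
        return c[a + 1] * (x - a) + c[a] * (1 + a - x);
    }

For the sequence 0, 10, 20, 30, 40 above, lerp1d(c, 0.5) gives 5, matching the interpolated sequence.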

This interpolation method extends to two dimensions.
For a two-dimensional array C, assume that for any integer index i, the values change linearly from C(a, i) to C(a+1, i), and likewise from C(i, b) to C(i, b+1) (where a and b are integers).
Then, for floating-point coordinates (x, y) satisfying a <= x < a+1 and b <= y < b+1, we can first obtain C(x, b) and C(x, b+1):

C(x, b) = C[a+1][b] * (x - a) + C[a][b] * (1 + a - x)
C(x, b+1) = C[a+1][b+1] * (x - a) + C[a][b+1] * (1 + a - x)

Now we know C(x, b) and C(x, b+1), and the values from C(x, b) to C(x, b+1) also change linearly, so:

C(x, y) = C(x, b+1) * (y - b) + C(x, b) * (1 + b - y)

This is bilinear interpolation.
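
Putting the two steps together, a direct C++ sketch of the formulas above (the grid size and the assumption that the neighbors stay inside the grid are illustrative, not from the original):

    #include <cmath>

    // Bilinear interpolation of a 2D grid c[x][y] at fractional (x, y):
    // interpolate along x at rows b and b+1, then along y between the
    // two results. Assumes a <= x < a+1 and b <= y < b+1 stay in bounds.
    double bilinear(const double c[4][4], double x, double y)
    {
        int a = (int)std::floor(x);
        int b = (int)std::floor(y);
        double cxb  = c[a + 1][b]     * (x - a) + c[a][b]     * (1 + a - x); // C(x, b)
        double cxb1 = c[a + 1][b + 1] * (x - a) + c[a][b + 1] * (1 + a - x); // C(x, b+1)
        return cxb1 * (y - b) + cxb * (1 + b - y);                           // C(x, y)
    }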

====== ~~ split line ~~ ======

Principles

In the process of spatial transformation, the typical problem is that the image becomes distorted when it is enlarged, because some pixel positions in the transformed image have no counterpart in the source. To illustrate, assume there is a 64x64 grayscale image A that is enlarged to a 256x256 image B, as shown in Figure 1. By the simple geometric relationship, the pixel value at (x, y) in image B corresponds to the pixel value at (x/4, y/4) in image A, that is:
B(x, y) = A(x/4, y/4)    (Equation 1)
For positions such as (4, 4), (4, 8), ..., (256, 256) in B, the corresponding positions in A given by Equation 1 are integers, and the gray value can be read directly. For all other coordinate points, however, the coordinates computed from Equation 1 are no longer integers in A; for example, the point (1, 1) in B maps to (0.25, 0.25) in A. For a digital image, fractional coordinates are meaningless, so we must use some method to obtain the gray level for the pixel in B from the corresponding position in A.
The method for handling this problem is called gray-level interpolation. There are three common interpolation methods: nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation. In theory, nearest-neighbor interpolation gives the worst result and bicubic interpolation the best, with bilinear interpolation in between. However, for image interpolation without strict quality requirements, bilinear interpolation is usually good enough.
This article uses MATLAB to implement a bilinear interpolation program.
The principle of bilinear interpolation is shown in Figure 2. There are two ways to map coordinates between images: mapping from the source image to the target image is called forward mapping, and the reverse is called backward mapping. Clearly, bilinear interpolation uses backward mapping.
The following explains Figure 2 in detail. First, from the coordinates (x, y) in image B, the corresponding coordinates (x/4, y/4) in image A are obtained from the geometric relationship. However, the mapped coordinates (x/4, y/4) are generally not integer coordinates in image A; they fall inside the rectangle formed by the four pixel coordinates (a, b), (a+1, b), (a, b+1), (a+1, b+1), where a and b are integer coordinates of image A. The question now is how to compute the gray level A(x/4, y/4) from the four points A(a, b), A(a+1, b), A(a, b+1), A(a+1, b+1). The method used by bilinear interpolation is this: first, assume that the gray level of image A changes linearly in the vertical direction; then, from the linear equation (or a simple geometric proportion), the gray levels A(a, y/4) and A(a+1, y/4) at the y/4 coordinate can be obtained. Next, assume that the gray level also changes linearly along the line determined by the two points (a, y/4) and (a+1, y/4); from that linear equation, the gray level A(x/4, y/4) at (x/4, y/4) can be obtained. This is the basic idea of bilinear interpolation. It rests on two assumptions: first, that the gray level changes linearly in the vertical direction, and second, that it also changes linearly in the horizontal direction.
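
A small worked example (values invented for illustration): suppose A(1,1) = 10, A(2,1) = 20, A(1,2) = 30, A(2,2) = 40, and we want the gray level at (1.25, 1.5). Interpolating vertically first gives A(1, 1.5) = 10 + 0.5 * (30 - 10) = 20 and A(2, 1.5) = 20 + 0.5 * (40 - 20) = 30; interpolating horizontally between these two gives A(1.25, 1.5) = 20 + 0.25 * (30 - 20) = 22.5, which rounds to 23.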

====== ~~ split line ~~ ======

Anyone who has written image programs on Windows should know that the Windows GDI has an API function, StretchBlt, which corresponds to the StretchDraw method of the TCanvas class in the VCL. It makes scaling an image easy, but the problem is that it uses the fastest, simplest, and worst-quality "nearest neighbor" method. Although that is sufficient in most cases, it is not suitable when higher quality is required.

Not long ago I wrote a little gadget (see my Personal Information Assistant album) to manage the pile of photos I took with my digital camera. One of its plug-ins provides a zoom function; the current version uses StretchDraw, and the result is sometimes unsatisfactory. I had always wanted to add two better methods: linear interpolation and cubic splines. After some research I found that the cubic spline method involves too much computation to be practical, so I decided to implement only the linear interpolation version.

From the basic theory of digital image processing, we know that image deformation is a coordinate transformation from the source image to the target image. The simple idea is to transform the coordinates of each point of the source image into new coordinates of the corresponding point of the target image, but this leads to a problem: the target coordinates are usually not integers, and, for an enlargement, some target pixels are never mapped to at all. These are the drawbacks of the so-called "forward mapping" method, so the "backward mapping" method is generally used instead.

However, backward mapping also produces source coordinates that are not integers. Here we need a "resampling filter". The term looks very professional, but only because it borrows the conventional terminology of electronic signal processing (in most cases its function is analogous to a band-pass filter); it is not hard to understand: it is simply the rule for determining the color of a point at a non-integer coordinate. The three methods mentioned above (nearest neighbor, linear interpolation, and cubic spline) are all "resampling filters" in this sense.

The so-called "nearest neighbor" method rounds the non-integer coordinate and takes the color of the point at the nearest integer coordinate. The linear interpolation method instead performs linear interpolation based on the colors of the nearest surrounding points (four points for a two-dimensional image, i.e. two-dimensional linear interpolation) to estimate the color of the point. In most cases its accuracy is higher than the nearest-neighbor method, and the result looks much better; most obviously, when enlarging, the jagged edges are far less visible than with the nearest-neighbor method. It also brings one problem, however: the image tends to look soft. In filter terminology: the stopband rejection is good, but there is passband loss, and the rectangular coefficient of the passband curve is not high. As for the cubic spline method, I will not discuss it here; it is somewhat more complicated, and you can consult professional books on digital image processing, such as the reference at the end of this article.

Let's discuss the coordinate transformation algorithm. A simple spatial transformation can be expressed with a transformation matrix:

[x', y', w'] = [u, v, w] * T

Here, x' and y' are the coordinates in the target image, and u and v are the coordinates in the source image; w and w' are the homogeneous coordinates, usually set to 1, and T is a 3x3 transformation matrix.
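
For illustration, applying such a matrix to one point looks like this in C++ (a sketch; the function name is mine):

    // Row vector times matrix: [x', y', w'] = [u, v, w] * T (illustrative sketch).
    void transform(const double T[3][3], double u, double v, double w,
                   double &xp, double &yp, double &wp)
    {
        xp = u * T[0][0] + v * T[1][0] + w * T[2][0];
        yp = u * T[0][1] + v * T[1][1] + w * T[2][1];
        wp = u * T[0][2] + v * T[1][2] + w * T[2][2];
    }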

Although this representation is rather mathematical, it conveniently expresses different transformations such as translation, rotation, and scaling. For scaling, it is equivalent to:

                        [ Su   0   0 ]
[x, y, 1] = [u, v, 1] * [  0  Sv   0 ]
                        [  0   0   1 ]

Su and Sv are the scaling factors along the x and y axes respectively: a value greater than 1 enlarges the image, a value between 0 and 1 shrinks it, and a value less than 0 mirrors (flips) it.

Does the matrix make you dizzy? In fact, the formula above expands by matrix multiplication to:

x = u * Su
y = v * Sv

That simple. ^_^

With the above three pieces of preparation, we can start writing code. The idea is simple: traverse every pixel coordinate of the target image with two nested loops, and use the transformation formula above (note: since backward mapping is used, the corresponding transformation is u = x / Su and v = y / Sv) to obtain the source coordinates. Because the source coordinates are not integer coordinates, two-dimensional linear interpolation is needed:

P = n * b * PA + n * (1 - b) * PB + (1 - n) * b * PC + (1 - n) * (1 - b) * PD

where n is the distance from v (the y coordinate of the point mapped into the source image, which is generally not an integer) to the next row below it, i.e. n = 1 - (v - floor(v)); b is the analogous quantity for u on the x axis, i.e. b = 1 - (u - floor(u)). PA through PD are the colors of the four source pixels nearest to (u, v): upper left, upper right, lower left, and lower right respectively (obtained, for example, through the Pixels property of TCanvas). P is the interpolated color at (u, v), i.e. the approximate color for (x, y).
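
For reference, the straightforward floating-point version of this loop might look as follows. This is a sketch of the idea only (it uses a plain row-major 8-bit grayscale buffer instead of TCanvas, and simplistic edge clamping), not the article's code:

    #include <algorithm>

    // Naive backward-mapping bilinear resize of an 8-bit grayscale image:
    // src is sw x sh, dst is dw x dh, both row-major (illustrative sketch).
    void stretchNaive(const unsigned char *src, int sw, int sh,
                      unsigned char *dst, int dw, int dh)
    {
        double Su = (double)dw / sw, Sv = (double)dh / sh;
        for (int y = 0; y < dh; ++y)
            for (int x = 0; x < dw; ++x)
            {
                double u = x / Su, v = y / Sv;           // backward mapping
                int a = (int)u, c = (int)v;              // top-left neighbor (a, c)
                int a1 = std::min(a + 1, sw - 1);        // clamp at the right edge
                int c1 = std::min(c + 1, sh - 1);        // clamp at the bottom edge
                double b = 1 - (u - a), n = 1 - (v - c); // weights as defined above
                double PA = src[c  * sw + a ], PB = src[c  * sw + a1];
                double PC = src[c1 * sw + a ], PD = src[c1 * sw + a1];
                double P = n * b * PA + n * (1 - b) * PB
                         + (1 - n) * b * PC + (1 - n) * (1 - b) * PD;
                dst[y * dw + x] = (unsigned char)(P + 0.5); // round to nearest
            }
    }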

I would not use that naive code in practice because it is far too inefficient: it performs all of that complex floating-point arithmetic on the RGB of every pixel of the target image. So optimization is needed. For VCL applications, a simple optimization is to use the ScanLine property of TBitmap for row-based processing, avoiding pixel-level access through Pixels; this alone greatly improves performance and is basic knowledge for image processing with the VCL. However, this method does not always work; image rotation, for example, requires more technique.

In any case, floating-point operations are much more expensive than integer ones, so they must be optimized away. As seen above, floating point enters through the transformation, whose parameters Su and Sv are usually floating-point numbers, so we start there. In general, Su and Sv can be expressed as fractions:

Su = (double)DW / SW;  Sv = (double)DH / SH

where DW and DH are the width and height of the target image, and SW and SH are the width and height of the source image (since they are all integers, a type conversion is needed to obtain the floating-point result).

Substituting these expressions for Su and Sv into the transformation and interpolation formulas above yields a new interpolation formula:

Because:

b = 1 - (x * SW % DW) / (double)DW;  n = 1 - (y * SH % DH) / (double)DH

setting:

B = DW - x * SW % DW;  N = DH - y * SH % DH

then:

b = B / (double)DW;  n = N / (double)DH

Replacing the floating-point b and n with the integers B and N, the interpolation formula becomes:

P = (B * N * (PA - PB - PC + PD) + DW * N * PB + DH * B * PC + (DW * DH - DH * B - DW * N) * PD) / (double)(DW * DH)

Here the final result P is still a floating-point number, which would have to be rounded to get the result. To remove floating point completely, the rounding can be folded in like this:

P = (B * N * (PA - PB - PC + PD) + ... + (DW * DH - DH * B - DW * N) * PD + DW * DH / 2) / (DW * DH)

Now P is the rounded integer value, and all calculations are integer operations.
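
A quick sanity check that the integer formula agrees with the floating-point one, using made-up values (PA..PD, B, N, DW, DH are arbitrary):

    #include <cassert>

    int main()
    {
        int PA = 10, PB = 20, PC = 30, PD = 40;
        int DW = 255, DH = 191, B = 100, N = 50;
        double b = B / (double)DW, n = N / (double)DH;
        double Pf = n * b * PA + n * (1 - b) * PB
                  + (1 - n) * b * PC + (1 - n) * (1 - b) * PD;
        int Pi = (B * N * (PA - PB - PC + PD) + DW * N * PB
                  + DH * B * PC + (DW * DH - DH * B - DW * N) * PD
                  + DW * DH / 2) / (DW * DH);
        assert(Pi == (int)(Pf + 0.5)); // both round to 31 for these values
        return 0;
    }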

The code after simple optimization is as follows:

int __fastcall TResizeDlg::stretch_linear(Graphics::TBitmap *aDest, Graphics::TBitmap *aSrc)
{
    int SW = aSrc->Width - 1, SH = aSrc->Height - 1;
    int DW = aDest->Width - 1, DH = aDest->Height - 1;
    int B, N, x, y;
    int nPixelSize = GetPixelSize(aDest->PixelFormat);
    Byte *pLinePrev, *pLineNext;
    Byte *pDest;
    Byte *pA, *pB, *pC, *pD;

    for (int i = 0; i <= DH; ++i)
    {
        pDest = (Byte *)aDest->ScanLine[i];
        y = i * SH / DH;                  // source row (integer part)
        N = DH - i * SH % DH;             // integer weight n, scaled by DH
        pLinePrev = (Byte *)aSrc->ScanLine[y++];
        pLineNext = (N == DH) ? pLinePrev : (Byte *)aSrc->ScanLine[y];
        for (int j = 0; j <= DW; ++j)
        {
            x = j * SW / DW * nPixelSize; // source column offset in bytes
            B = DW - j * SW % DW;         // integer weight b, scaled by DW
            pA = pLinePrev + x;           // upper left
            pB = pA + nPixelSize;         // upper right
            pC = pLineNext + x;           // lower left
            pD = pC + nPixelSize;         // lower right
            if (B == DW)                  // exactly on a source column:
            {                             // avoid reading past the edge
                pB = pA;
                pD = pC;
            }
            for (int k = 0; k < nPixelSize; ++k)
                *pDest++ = (Byte)(int)(
                    (B * N * (*pA++ - *pB - *pC + *pD) + DW * N * *pB++
                     + DH * B * *pC++ + (DW * DH - DH * B - DW * N) * *pD++
                     + DW * DH / 2) / (DW * DH)
                );
        }
    }
    return 0;
}

It should be said that this is fairly concise. Because the widths and heights are counted from 0, 1 is subtracted from each. GetPixelSize determines the number of bytes per pixel from the PixelFormat property; this code only supports 24-bit and 32-bit color. (15-bit and 16-bit colors would have to be unpacked field by field, because computing on the packed bytes directly produces unexpected carries between the bit fields and scrambles the colors, which is hard to handle; 8-bit and lower indexed colors require looking up the palette and then re-indexing the result, which is also troublesome, so they are not supported either; 8-bit grayscale images do work, however.) In addition, some code is included to prevent out-of-bounds access at the edges of the image.
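
The article does not show GetPixelSize; here is a minimal reconstruction of what it presumably does, assuming the VCL TPixelFormat enumeration (my sketch, not the author's code):

    // Presumed helper: bytes per pixel for the formats the routine supports.
    // (Reconstruction; the original GetPixelSize is not shown in the article.)
    static int GetPixelSize(Graphics::TPixelFormat fmt)
    {
        switch (fmt)
        {
        case pf8bit:  return 1; // 8-bit grayscale
        case pf24bit: return 3; // 24-bit RGB
        case pf32bit: return 4; // 32-bit RGBA
        default:      return 0; // 15/16-bit and indexed formats: unsupported
        }
    }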

By comparison, on a PIII-733 machine, with target images up to about 1024x768 there is basically no perceptible slowdown relative to StretchDraw (the slowdown was much more noticeable with the floating-point version). The result is also quite satisfying: whether shrinking or enlarging, the image quality is clearly better than with the StretchDraw method.

However, because of the integer arithmetic, there is one issue that must be watched for: overflow. The denominator in the formula is DW * DH, and the result must be a byte, i.e. an 8-bit binary number; since a signed integer can hold at most 31 bits, DW * DH must not exceed 23 bits (2^23 = 8,388,608). That is, depending on the aspect ratio, the target image resolution cannot exceed roughly 4096x2048. This limit could be extended by using unsigned numbers (gaining one bit) or by reducing the computational accuracy; if you are interested, try it yourself.

Of course, this code is still far from optimized to the extreme, and there are many issues I have not studied in depth, such as anti-aliasing. If you are interested, consult the relevant books and investigate on your own; if you come up with results, you are welcome to write plug-ins for my program.

[Mental Studio] birds of prey

2004-3-28

References:

Cui Yi, Digital Image Processing Technology and Applications, Electronic Industry Press, 1997


// The C++ code above only handles 24-bit bitmaps. Below is a Delphi version, modified to support multiple bitmap formats; tested with both 32-bit and 24-bit color.

procedure StretchBitmap(Dest, Src: TBitmap);
var
  SW, SH, DW, DH, B, N, x, y, i, j, k, nPixelSize: DWORD;
  pLinePrev, pLineNext, pDest, pA, pB, pC, pD: PByte;
begin
  SW := Src.Width - 1;
  SH := Src.Height - 1;
  DW := Dest.Width - 1;
  DH := Dest.Height - 1;
  // Obtain the pixel format
  nPixelSize := Integer(Src.PixelFormat);
  if nPixelSize < 4 then
    nPixelSize := 4
  else if nPixelSize = 4 then
    Inc(nPixelSize)
  else if nPixelSize > 7 then
    nPixelSize := 7;
  Dest.PixelFormat := TPixelFormat(nPixelSize);
  nPixelSize := nPixelSize - 3;
  for i := 0 to DH do
  begin
    pDest := Dest.ScanLine[i];
    y := i * SH div DH;
    N := DH - i * SH mod DH;
    pLinePrev := Src.ScanLine[y];
    Inc(y);
    if N = DH then
      pLineNext := pLinePrev
    else
      pLineNext := Src.ScanLine[y];
    for j := 0 to DW do
    begin
      x := j * SW div DW * nPixelSize;
      B := DW - j * SW mod DW;
      pA := pLinePrev;
      Inc(pA, x);
      pB := pA;
      Inc(pB, nPixelSize);
      pC := pLineNext;
      Inc(pC, x);
      pD := pC;
      Inc(pD, nPixelSize);
      if B = DW then
      begin
        pB := pA;
        pD := pC;
      end;
      for k := 0 to nPixelSize - 1 do
      begin
        pDest^ := Byte(DWORD(B * N * DWORD(pA^ - pB^ - pC^ + pD^) + DW * N * pB^
          + DH * B * pC^ + (DW * DH - DH * B - DW * N) * pD^
          + DW * DH div 2) div (DW * DH));
        Inc(pDest);
        Inc(pA);
        Inc(pB);
        Inc(pC);
        Inc(pD);
      end;
    end;
  end;
end;
