Image scaling with linear interpolation algorithm

Reference: http://www.cnblogs.com/okaimee/archive/2010/08/18/1802573.html

What is bilinear interpolation

Simple analogy
Original sequence of values: 0, 10, 20, 30, 40
After one round of linear interpolation: 0, 5, 10, 15, 20, 25, 30, 35, 40
That is, the change (increase or decrease) is linear: you could draw the values as a straight line on a chart.
In digital camera technology, such values might represent the color, chroma, or other attributes of the pixels that make up a photo.

For ease of understanding, first consider linear interpolation in the one-dimensional case.
For a sequence C, assume that C varies linearly between C[a] and C[a+1].
Then for a floating-point x (a <= x < a+1): C(x) = C[a+1]*(x-a) + C[a]*(1+a-x).
This is easy to see.
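
As a minimal sketch (my illustration, not from the original article), the one-dimensional case translates directly into code:

#include <cmath>

// 1-D linear interpolation: for a <= x < a+1,
// returns C(x) = c[a+1]*(x-a) + c[a]*(1+a-x).
double Lerp1D(const double *c, double x)
{
    int a = (int)std::floor(x);   // the integer index just below x
    double t = x - a;             // fractional part, 0 <= t < 1
    return c[a + 1] * t + c[a] * (1.0 - t);
}

Applied to the sequence 0, 10, 20, 30, 40 above, Lerp1D at x = 0.5 returns 5, reproducing the interpolated sequence.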

Now extend this interpolation to the two-dimensional case.
For a two-dimensional array C, assume that for integer coordinates a and b, C changes linearly from C(a, y) to C(a+1, y) along one axis, and likewise from C(x, b) to C(x, b+1) along the other.
Then for a floating-point coordinate (x, y) with a <= x < a+1 and b <= y < b+1, we can first compute C(x,b) and C(x,b+1) separately:
C(x,b)   = C[a+1][b]*(x-a)   + C[a][b]*(1+a-x)
C(x,b+1) = C[a+1][b+1]*(x-a) + C[a][b+1]*(1+a-x)
Now C(x,b) and C(x,b+1) are known, and by the assumption that C also changes linearly from C(x,b) to C(x,b+1):
C(x,y) = C(x,b+1)*(y-b) + C(x,b)*(1+b-y)

This is bilinear interpolation.
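
A minimal sketch of these three equations in C++ (my illustration; it assumes the array C is stored row-major with w values per row):

#include <cmath>

// Bilinear interpolation over a 2-D array stored row-major, w values per row.
// (x, y) is a fractional coordinate with a <= x < a+1, b <= y < b+1.
double Bilinear(const double *c, int w, double x, double y)
{
    int a = (int)std::floor(x), b = (int)std::floor(y);
    double tx = x - a, ty = y - b;
    // interpolate along x on rows b and b+1, then along y between the results
    double cxb  = c[b * w + a + 1]       * tx + c[b * w + a]       * (1.0 - tx); // C(x, b)
    double cxb1 = c[(b + 1) * w + a + 1] * tx + c[(b + 1) * w + a] * (1.0 - tx); // C(x, b+1)
    return cxb1 * ty + cxb * (1.0 - ty);                                         // C(x, y)
}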

The principle in brief

In the spatial transformation of an image, the typical problem is distortion when the image is enlarged, because the enlarged image contains pixel positions that have no counterpart in the original. To illustrate, suppose there is a 64x64 grayscale image A that is to be enlarged to a 256x256 image B, as shown in Figure 1. By the simple geometric relationship, the pixel value at (x, y) in B should correspond to the pixel value at (x/4, y/4) in A, i.e.

B(x, y) = A(x/4, y/4)    (Formula 1)

For positions in B such as (4,4), (4,8), (4,16) ... (256,256), Formula 1 yields integer positions in A, and the gray value can be read directly. But for points such as B(1,3) and so on, the coordinates in A computed by Formula 1 are no longer integers; a point in B may map, for example, to (0.25, 0.25) in A. For a digital image, fractional coordinates are meaningless, so a method is needed to obtain, for each pixel of B, a gray level from the corresponding position in A.

The way this problem is handled is called image gray-level interpolation. Three interpolation methods are in common use: nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation. In theory, nearest neighbor gives the worst results and bicubic the best, with bilinear in between. For image interpolation without strict quality requirements, however, bilinear is often good enough. In this paper, a bilinear interpolation program is implemented in MATLAB.

The principle of bilinear interpolation is shown in Figure 2. Coordinates can be mapped between the two images in either direction: mapping from the coordinates of the original image to those of the target image is called forward mapping, and the reverse is called backward mapping. Clearly, bilinear interpolation uses backward mapping.

The specific meaning of Figure 2 is as follows. First, by Formula 1, the coordinate (x, y) in image B is traced back to the coordinate (x/4, y/4) in image A. In general, (x/4, y/4) does not fall exactly on integer coordinates of A; it falls inside the rectangle spanned by the four pixels (a, b), (a+1, b), (a, b+1), (a+1, b+1), where a and b are integer coordinates of A. The question is then how to find the gray level at A(x/4, y/4) from the gray levels at the four points A(a, b), A(a+1, b), A(a, b+1), A(a+1, b+1). Bilinear interpolation first assumes that the gray level of A changes linearly in the vertical direction, so the gray levels A(a, y/4) and A(a+1, y/4) at the coordinates (a, y/4) and (a+1, y/4) can be obtained from the linear equation, or by geometric proportion. It then assumes that the gray level also changes linearly along the line determined by the two points ((a, y/4), A(a, y/4)) and ((a+1, y/4), A(a+1, y/4)); from that linear equation, the gray level A(x/4, y/4) at (x/4, y/4) is calculated. This is the basic idea of bilinear interpolation, resting on two assumptions: first that the gray level changes linearly in the vertical direction, and then that it changes linearly in the horizontal direction as well.
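
A worked example with made-up gray levels may make this concrete. Suppose A(a, b) = 10, A(a+1, b) = 20, A(a, b+1) = 30, A(a+1, b+1) = 40, and the mapped point is (a+0.25, b+0.25). Interpolating vertically first gives A(a, b+0.25) = 10 + 0.25*(30-10) = 15 and A(a+1, b+0.25) = 20 + 0.25*(40-20) = 25; interpolating horizontally between those gives A(a+0.25, b+0.25) = 15 + 0.25*(25-15) = 17.5, which would round to a gray level of 18.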

Anyone who has written image programs on Windows probably knows that the Windows GDI has an API function for this: StretchBlt, which corresponds to the StretchDraw method of the TCanvas class in the VCL. It makes scaling an image easy, but the problem is that it uses the fastest, simplest, and lowest-quality "nearest neighbor" method. That is good enough in most cases, but not when more is demanded.

Not long ago I wrote a gadget (see "My album of People's Information assistants") to manage the pile of photos I had taken with my digital camera. It has a plug-in that provides a zoom function; the current version uses StretchDraw, and the results are sometimes unsatisfactory. I had long wanted to add two better methods: linear interpolation and cubic spline. After some study I found the cubic spline method computationally too heavy to be practical, so I decided to implement only the linear interpolation version.

From the basic theory of digital image processing, we know that transforming an image means transforming coordinates from the source image to the target image. The naive approach is to transform each source pixel's coordinates into the coordinates of the corresponding target pixel, but this raises two problems: the resulting target coordinates are usually not integers, and under enlargement some target pixels are never hit by any mapped source point. These are the drawbacks of the so-called forward mapping method, so the "inverse mapping" method is generally used instead.

However, inverse mapping has the same issue: the coordinates mapped back into the source image are generally not integers. Here a "resampling filter" is needed. The term sounds professional, but only because it borrows the idiom of electronic signal processing (in most cases its role resembles that of a band-pass filter there); it is nothing more complicated than deciding what color the point at a non-integer coordinate should have. The three methods mentioned earlier (nearest neighbor, linear interpolation, and the cubic spline) are all such "resampling filters".

The nearest-neighbor method simply rounds the non-integer coordinate and takes the color of the point at the nearest integer coordinate. "Linear interpolation" instead estimates the color by linear interpolation over the nearest surrounding points (four of them for a flat image, making it a two-dimensional linear interpolation). In most cases this is more accurate than nearest neighbor, and the result looks much better; most obviously, when enlarging, jagged edges are far less pronounced. It has a drawback of its own, though: the image comes out softer. In signal-processing jargon (hehe, showing off my expertise ^_^), this filter has good stop-band attenuation but some pass-band loss, and the shape factor of its response curve is not high. As for the cubic spline method, it is a bit more involved and I will not cover it here; see a digital image processing text, such as the reference at the end.
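
For contrast with the interpolation code developed below, here is what the nearest-neighbor filter amounts to: a rounding step. This is my own minimal sketch; GetPixel and SetPixel are hypothetical accessors, not VCL calls:

// Assumed to be provided elsewhere: read/write one pixel's color.
extern unsigned GetPixel(int x, int y);
extern void SetPixel(int x, int y, unsigned color);

// Nearest-neighbor scaling by inverse mapping: for every target pixel,
// round the mapped source coordinate and copy that source pixel's color.
void StretchNearest(int sw, int sh, int dw, int dh)
{
    for (int y = 0; y < dh; ++y)
        for (int x = 0; x < dw; ++x)
        {
            int u = (int)(x * (double)sw / dw + 0.5);  // round u = x/Su
            int v = (int)(y * (double)sh / dh + 0.5);  // round v = y/Sv
            if (u >= sw) u = sw - 1;  // clamp at the right edge
            if (v >= sh) v = sh - 1;  // clamp at the bottom edge
            SetPixel(x, y, GetPixel(u, v));
        }
}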

Now let's discuss the coordinate transformation algorithm. A simple spatial transformation can be represented by a transformation matrix:

[x', y', w'] = [u, v, w] * T

where x', y' are the target image coordinates, u, v the source image coordinates, and w, w' the homogeneous coordinates, usually set to 1; T is a 3x3 transformation matrix.

Although this representation looks very mathematical, it can express many different transformations, such as translation, rotation and scaling, in one uniform form. For scaling, it is equivalent to:

                        [ Su  0   0 ]
[x, y, 1] = [u, v, 1] * [ 0   Sv  0 ]
                        [ 0   0   1 ]

where Su and Sv are the scaling rates along the X axis and the Y axis: greater than 1 enlarges, between 0 and 1 shrinks, and less than 0 mirrors the image.

Don't let the matrix make your head spin. The matrix multiplication above simply works out to:

x = u * Su
y = v * Sv

It's so simple. ^_^
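
Purely as an illustration (my sketch, not part of the original article), the general homogeneous form can also be coded directly; with the scaling matrix it reduces to exactly the two lines above:

// Apply a 3x3 transform T to the homogeneous source coordinates [u, v, 1],
// following the row-vector convention [x', y', w'] = [u, v, w] * T.
// For the scaling matrix (T[0][0] = Su, T[1][1] = Sv, T[2][2] = 1, rest 0)
// this yields x = u * Su, y = v * Sv.
void Transform(const double T[3][3], double u, double v, double &x, double &y)
{
    double xh = u * T[0][0] + v * T[1][0] + T[2][0];
    double yh = u * T[0][1] + v * T[1][1] + T[2][1];
    double wh = u * T[0][2] + v * T[1][2] + T[2][2];
    x = xh / wh;  // divide out the homogeneous coordinate w'
    y = yh / wh;
}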

With these three pieces of preparation, we can start writing the implementation. The idea is simple: use a double loop over every pixel coordinate of the target image, and obtain the source coordinates through the transformation above (note: since this is inverse mapping, the transform is u = x / Su and v = y / Sv). Because the source coordinates are not integers, a two-dimensional linear interpolation is required:

P = n * b * pA + n * (1 - b) * pB + (1 - n) * b * pC + (1 - n) * (1 - b) * pD

where, writing v for the y-coordinate of the mapped point in the source image (generally not an integer), n is the difference between the y-coordinate of the nearest scan line below v and v itself, i.e. 1 minus the fractional part of v; b is the analogous quantity for the x-axis. pA through pD are the colors of the four source points nearest to (u, v): top-left, top-right, bottom-left and bottom-right, read here via the Pixels property of TCanvas. P is the interpolated color at (u, v), i.e. the approximate color for the target point (x, y).
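
The article does not spell this naive version out (and argues against it just below), but purely as an illustration it might look like the following sketch; GetPixelChannel and SetPixelChannel are hypothetical per-channel accessors standing in for TCanvas Pixels access:

// Assumed to be provided elsewhere: read/write one color channel of a pixel.
extern int  GetPixelChannel(int x, int y, int ch);
extern void SetPixelChannel(int x, int y, int ch, int value);

// Naive bilinear stretch by inverse mapping: u = x/Su, v = y/Sv.
void StretchLinearNaive(int sw, int sh, int dw, int dh)
{
    double Su = (double)dw / sw, Sv = (double)dh / sh;
    for (int y = 0; y < dh; ++y)
        for (int x = 0; x < dw; ++x)
        {
            double u = x / Su, v = y / Sv;        // inverse mapping
            int a = (int)u, c = (int)v;           // top-left neighbor (a, c)
            int a1 = (a + 1 < sw) ? a + 1 : a;    // clamp at the right edge
            int c1 = (c + 1 < sh) ? c + 1 : c;    // clamp at the bottom edge
            double b = 1.0 - (u - a);             // the weight b of the text
            double n = 1.0 - (v - c);             // the weight n of the text
            for (int ch = 0; ch < 3; ++ch)        // one channel of R, G, B at a time
            {
                double pA = GetPixelChannel(a,  c,  ch), pB = GetPixelChannel(a1, c,  ch);
                double pC = GetPixelChannel(a,  c1, ch), pD = GetPixelChannel(a1, c1, ch);
                double P = n * b * pA + n * (1 - b) * pB
                         + (1 - n) * b * pC + (1 - n) * (1 - b) * pD;
                SetPixelChannel(x, y, ch, (int)(P + 0.5));  // round to nearest
            }
        }
}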

That naive version is not worth using in practice, because it is far too inefficient: a pile of complex floating-point arithmetic on the RGB of every target pixel. It has to be optimized. For VCL applications, one relatively simple optimization is to use TBitmap's ScanLine property and process the image row by row, avoiding pixel-level access through Pixels; performance improves enormously. This is the basic optimization for image processing with the VCL. It does not always suffice, though; rotating an image, for example, requires more tricks.

In any case, floating-point operations cost much more than integer operations, and that is what must be optimized away. As seen above, floating point enters through the transformation: the parameters Su and Sv are usually floating-point numbers. So we optimize starting from them. In general, Su and Sv can be expressed as ratios:

Su = (double)dw / sw;   Sv = (double)dh / sh

where dw and dh are the width and height of the target image and sw and sh are those of the source image (all are integers, so a type conversion is needed to obtain a floating-point result).

Substituting these forms of Su and Sv into the earlier transformation and interpolation formulas, a new interpolation formula can be derived.

Since:

b = 1 - (x * sw % dw) / (double)dw;   n = 1 - (y * sh % dh) / (double)dh

let

B = dw - x * sw % dw;   N = dh - y * sh % dh

so that

b = B / (double)dw;   n = N / (double)dh

Substituting the integers B and N for the floating-point b and n transforms the interpolation formula into:

P = (B * N * (pA - pB - pC + pD) + dw * N * pB + dh * B * pC + (dw * dh - dh * B - dw * N) * pD) / (double)(dw * dh)

Here the final result P is still a floating-point number, which would have to be rounded. To eliminate floating point entirely, the rounding can be folded in like this:

P = (B * N * (pA - pB - pC + pD) + dw * N * pB + dh * B * pC + (dw * dh - dh * B - dw * N) * pD + dw * dh / 2) / (dw * dh)

Now P rounds to the nearest integer by plain integer division, and every calculation is an integer operation.
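
A quick numeric check with made-up values: take dw = dh = 2, B = N = 1 (so b = n = 1/2) and pA = 0, pB = 4, pC = 8, pD = 12. The floating-point formula gives P = 0.25 * (0 + 4 + 8 + 12) = 6; the integer formula gives (1*1*(0 - 4 - 8 + 12) + 2*1*4 + 2*1*8 + (4 - 2 - 2)*12 + 2) / 4 = 26 / 4 = 6 by integer division, in agreement.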

The optimized code is as follows:

int __fastcall TResizeDlg::stretch_linear(Graphics::TBitmap *aDest, Graphics::TBitmap *aSrc)
{
    // Width/height are counted from 0, hence the "- 1".
    int sw = aSrc->Width - 1, sh = aSrc->Height - 1;
    int dw = aDest->Width - 1, dh = aDest->Height - 1;
    int B, N, x, y;
    int nPixelSize = GetPixelSize(aDest->PixelFormat);  // bytes per pixel
    BYTE *pLinePrev, *pLineNext;   // the two source rows bracketing v
    BYTE *pDest;
    BYTE *pA, *pB, *pC, *pD;       // the four neighboring source pixels

    for (int i = 0; i <= dh; ++i)
    {
        pDest = (BYTE *)aDest->ScanLine[i];
        y = i * sh / dh;           // source row just above v
        N = dh - i * sh % dh;      // integer vertical weight
        pLinePrev = (BYTE *)aSrc->ScanLine[y++];
        // When v falls exactly on a row, reuse it instead of reading past the edge.
        pLineNext = (N == dh) ? pLinePrev : (BYTE *)aSrc->ScanLine[y];
        for (int j = 0; j <= dw; ++j)
        {
            x = j * sw / dw * nPixelSize;  // byte offset of the left column
            B = dw - j * sw % dw;          // integer horizontal weight
            pA = pLinePrev + x;            // top-left
            pB = pA + nPixelSize;          // top-right
            pC = pLineNext + x;            // bottom-left
            pD = pC + nPixelSize;          // bottom-right
            if (B == dw)                   // u falls exactly on a column:
            {                              // don't read past the right edge
                pB = pA;
                pD = pC;
            }
            for (int k = 0; k < nPixelSize; ++k)
                *pDest++ = (BYTE)(int)(
                    (B * N * (*pA++ - *pB - *pC + *pD) + dw * N * *pB++
                     + dh * B * *pC++ + (dw * dh - dh * B - dw * N) * *pD++
                     + dw * dh / 2) / (dw * dh)
                );
        }
    }
    return 0;
}

It is fairly concise. Because widths and heights are counted from 0, one is subtracted from each. GetPixelSize determines the number of bytes per pixel from the PixelFormat property; this code supports only 24-bit and 32-bit color (15- or 16-bit color would have to be unpacked into channels before the arithmetic, otherwise unwanted carries or borrows between the packed fields would scramble the colors, which is cumbersome to handle; 8-bit indexed color would require palette lookups and re-indexing, also cumbersome, and is not supported, though 8-bit grayscale images work). The code also contains a little logic to keep the reads from running past the edges of the image.
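
The listing calls GetPixelSize, which the article does not show. A plausible implementation, offered here only as an assumption, simply maps TPixelFormat to bytes per pixel for the supported formats:

#include <Graphics.hpp>

// Assumed helper: bytes per pixel for the formats the stretch code supports;
// returns 0 for formats it does not handle (15/16-bit, indexed color, ...).
int GetPixelSize(Graphics::TPixelFormat pf)
{
    switch (pf)
    {
        case Graphics::pf8bit:  return 1;  // 8-bit grayscale works; indexed color does not
        case Graphics::pf24bit: return 3;
        case Graphics::pf32bit: return 4;
        default:                return 0;
    }
}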

By comparison, on a PIII-733 machine, for target images up to 1024x768 the speed feels barely slower than StretchDraw (with floating-point arithmetic the slowdown is quite noticeable). The results are also quite satisfactory: whether shrinking or enlarging, the image quality is clearly higher than with the StretchDraw method.

However, because integer arithmetic is used, one problem needs attention: overflow. The denominator in the formula is dw * dh, the result must fit in a byte (an 8-bit binary number), and a signed integer can hold at most 31 value bits, so dw * dh must not exceed 23 bits; at a 2:1 aspect ratio, the target resolution therefore cannot exceed 4096x2048. The limit could be extended by using unsigned arithmetic (which adds one bit) or by reducing the computation's precision; interested readers can try this themselves.

Of course, this code is still far from fully optimized, and some problems are left unexplored, such as anti-aliasing. Interested readers can consult the relevant books; if you obtain any research results, you are very welcome to implement them as plug-ins for my program.

[Mental Studio] Raptor

2004-3-28

References:

Yi Cui, Digital Image Processing Technology and Applications, Electronics Industry Press, 1997

The code above handles only 24-bit (and 32-bit) bitmaps. Below it has been reworked into a Delphi version that should support a variety of bitmap formats; in testing, 32-bit and 24-bit bitmaps work.

procedure StretchBitmap(Dest, Src: TBitmap);
var
  sw, sh, dw, dh, B, N, x, y, i, j, k, nPixelSize: DWORD;
  pLinePrev, pLineNext, pDest, pA, pB, pC, pD: PByte;
begin
  sw := Src.Width - 1;
  sh := Src.Height - 1;
  dw := Dest.Width - 1;
  dh := Dest.Height - 1;
  // Map the pixel format to a supported one and derive bytes per pixel
  nPixelSize := Integer(Src.PixelFormat);
  if nPixelSize < 4 then
    nPixelSize := 4
  else if nPixelSize = 4 then
    Inc(nPixelSize)
  else if nPixelSize > 7 then
    nPixelSize := 7;
  Dest.PixelFormat := TPixelFormat(nPixelSize);
  nPixelSize := nPixelSize - 3;
  for i := 0 to dh do
  begin
    pDest := Dest.ScanLine[i];
    y := i * sh div dh;            // source row just above v
    N := dh - i * sh mod dh;       // integer vertical weight
    pLinePrev := Src.ScanLine[y];
    Inc(y);
    if N = dh then                 // v falls exactly on a row:
      pLineNext := pLinePrev       // reuse it, don't read past the edge
    else
      pLineNext := Src.ScanLine[y];
    for j := 0 to dw do
    begin
      x := j * sw div dw * nPixelSize;  // byte offset of the left column
      B := dw - j * sw mod dw;          // integer horizontal weight
      pA := pLinePrev;
      Inc(pA, x);                       // top-left
      pB := pA;
      Inc(pB, nPixelSize);              // top-right
      pC := pLineNext;
      Inc(pC, x);                       // bottom-left
      pD := pC;
      Inc(pD, nPixelSize);              // bottom-right
      if B = dw then                    // u falls exactly on a column
      begin
        pB := pA;
        pD := pC;
      end;
      for k := 0 to nPixelSize - 1 do
      begin
        pDest^ := Byte(DWORD((B * N * DWORD(pA^ - pB^ - pC^ + pD^) + dw * N * pB^
          + dh * B * pC^ + (dw * dh - dh * B - dw * N) * pD^
          + dw * dh div 2) div (dw * dh)));
        Inc(pDest);
        Inc(pA);
        Inc(pB);
        Inc(pC);
        Inc(pD);
      end;
    end;
  end;
end;
