Optical flow describes the motion information carried by image brightness. The optical flow method was first proposed by Horn and Schunck in 1981: they creatively connected the two-dimensional velocity field with the image gray scale and introduced this connection into an optical flow constraint equation, obtaining the basic algorithm of optical flow computation. Optical flow computation based on object motion rests on two hypotheses:
① The gray scale of a moving object remains unchanged over a very short time interval;
② the velocity vector field changes slowly within a given neighborhood.
Algorithm principle
Assume the image has a pixel point (x, y) whose brightness at time t is E(x, y, t), and let u(x, y) and v(x, y) denote the horizontal and vertical components of the optical flow at that point:
u = dx/dt
v = dy/dt
After a time interval δt, the point corresponds to the brightness E(x+δx, y+δy, t+δt). When δt is very small and approaches 0, we may take the brightness of the point to be unchanged, so that:
E(x, y, t) = E(x+δx, y+δy, t+δt)
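A minimal numeric sketch of the brightness-constancy assumption (a hypothetical 1-D "image" in NumPy; values are illustrative): a pattern that translates by one pixel between frames keeps its gray values unchanged, so E(x, t) equals E(x+δx, t+δt) at corresponding points.

```python
import numpy as np

# Frame at time t: a 1-D brightness profile (illustrative values).
frame_t = np.array([10.0, 50.0, 90.0, 50.0, 10.0, 0.0])
# Frame at time t+dt: the whole pattern has moved right by one pixel.
frame_t_dt = np.roll(frame_t, 1)

# The point that was at x=2 in frame t is now at x=3; its brightness is conserved.
print(frame_t[2] == frame_t_dt[3])  # True
```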
Expanding the brightness of the moved point by the Taylor formula gives:
E(x+δx, y+δy, t+δt) = E(x, y, t) + (∂E/∂x)δx + (∂E/∂y)δy + (∂E/∂t)δt + higher-order terms
Ignoring the second-order infinitesimals and letting δt approach 0, we obtain:
Ex·u + Ey·v + Et = 0
where w = (u, v); this is the basic optical flow constraint equation. Writing Ex, Ey, Et for the gradients of the pixel gray scale along the x, y and t directions, the equation can be rewritten as:
∇E · w + Et = 0
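The constraint above gives one equation per pixel in two unknowns, so (u, v) is recovered by least squares over a window of pixels. A minimal sketch in Python/NumPy (the `pattern` function, grid size and shift are illustrative assumptions, not from the source): two frames of a smooth pattern translated by a known (u, v) are generated, and the flow is recovered from Ex·u + Ey·v = −Et.

```python
import numpy as np

# Hypothetical smooth scene; any differentiable pattern would do.
def pattern(x, y):
    return np.sin(0.3 * x) + np.cos(0.2 * y)

ys, xs = np.mgrid[0:50, 0:50].astype(float)
true_u, true_v = 0.3, 0.2
f0 = pattern(xs, ys)
f1 = pattern(xs - true_u, ys - true_v)    # scene shifted by (u, v)

# Central-difference spatial gradients and the temporal derivative.
Ex = (np.roll(f0, -1, axis=1) - np.roll(f0, 1, axis=1)) / 2.0
Ey = (np.roll(f0, -1, axis=0) - np.roll(f0, 1, axis=0)) / 2.0
Et = f1 - f0

# Stack Ex*u + Ey*v = -Et for every interior pixel and solve by least squares.
A = np.stack([Ex[1:-1, 1:-1].ravel(), Ey[1:-1, 1:-1].ravel()], axis=1)
b = -Et[1:-1, 1:-1].ravel()
(u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
print(u, v)  # close to 0.3 and 0.2
```

The one-pixel border is excluded because `np.roll` wraps around at the image edges.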
Lucas-Kanade improved algorithm
Jean-Yves Bouguet proposed an improved Lucas-Kanade algorithm based on pyramid layering that also handles affine transformations. Suppose I and J are two 2D grayscale images, with the gray value of each pixel defined as I(x) = I(x, y) and J(x) = J(x, y), where x = (x, y) is the image coordinate of the pixel.
In an actual scene, images I and J can represent two consecutive frames. The purpose of pyramid tracking of image feature points is: for a point u = (ux, uy) in the previous frame I, find the matching point v = (ux+dx, uy+dy) in the following frame J, i.e. the point with the closest gray value. The vector d = [dx, dy] is then the speed at which the image moves at point u, that is, the optical flow of pixel u. To further state the meaning of vector d, assume the previous frame is mapped to the next frame by an affine transformation with matrix
A = [1+dxx, dxy; dyx, 1+dyy]
The four parameters dxx, dyy, dxy, dyx characterize the affine deformation of the image. The purpose of the optical flow calculation is therefore to find the vector d and the transformation matrix A that minimize the gray-scale difference over a region of the image.
Define the error as
ξ(d, A) = Σx Σy (I(x) − J(Ax + d))²
summed over the window x ∈ [ux−wx, ux+wx], y ∈ [uy−wy, uy+wy], where the two integers wx and wy set the size of a rectangular window on the image: (2wx+1) × (2wy+1).
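A sketch of this window matching error in Python/NumPy for the pure-translation case (A = identity); the function name, window size and test images are illustrative assumptions. The true displacement is the one that minimizes the summed squared gray-value difference over the window.

```python
import numpy as np

def window_error(I, J, u, d, wx=2, wy=2):
    """Sum of squared differences over a (2*wx+1) x (2*wy+1) window
    centered at u in I and at u + d in J (pure translation)."""
    ux, uy = u
    dx, dy = d
    patch_I = I[uy-wy:uy+wy+1, ux-wx:ux+wx+1]
    patch_J = J[uy+dy-wy:uy+dy+wy+1, ux+dx-wx:ux+dx+wx+1]
    return np.sum((patch_I - patch_J) ** 2)

rng = np.random.default_rng(0)
I = rng.random((32, 32))
J = np.roll(I, (1, 2), axis=(0, 1))       # frame 2: everything moved by dy=1, dx=2

# Exhaustively score candidate displacements around the point u = (10, 10).
errors = {(dx, dy): window_error(I, J, (10, 10), (dx, dy))
          for dx in range(-3, 4) for dy in range(-3, 4)}
best = min(errors, key=errors.get)
print(best)  # (2, 1): the true displacement gives zero error
```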
Typical wx and wy values are 7, 8, 10 or 20 pixels. The improved Lucas-Kanade algorithm has three main steps: building the pyramid, pyramid-based tracking, and the iterative process.
Establishment of the pyramid
I0 = I is the level-0 image, the highest-resolution image in the pyramid, and its width and height are defined as nx0 = nx and ny0 = ny respectively. The pyramid is built recursively: compute I1 from I0, compute I2 from I1, and so on. Let L = 1, 2, ... denote the pyramid level; L usually takes the value 2, 3 or 4. I^(L−1) is the image at level L−1, and nx^(L−1) and ny^(L−1) are its width and height. The image I^L is obtained from I^(L−1) as follows:
Convolve I^(L−1) with the low-pass filter [0.25 0.5 0.25] and subsample the result by a factor of 2 in each direction.
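One reduction step can be sketched in Python/NumPy as follows (a simplified separable version; boundary handling is plain zero padding here, an assumption, while implementations often mirror or wrap the edges):

```python
import numpy as np

def reduce_level(img):
    """One pyramid step: low-pass filter rows and columns with
    [0.25, 0.5, 0.25], then keep every second sample in each direction."""
    k = np.array([0.25, 0.5, 0.25])
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    both = np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, rows)
    return both[::2, ::2]

I0 = np.arange(64, dtype=float).reshape(8, 8)
I1 = reduce_level(I0)
I2 = reduce_level(I1)
print(I0.shape, I1.shape, I2.shape)  # (8, 8) (4, 4) (2, 2)
```

Each level halves the width and height of the previous one, which is what makes large motions small at the top of the pyramid.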
Pyramid Tracking
In general, the pyramid feature tracking algorithm proceeds as follows. First, the optical flow and affine transformation matrix are computed on the highest (coarsest) layer of the pyramid. The result is passed as the initial value to the next layer, where the optical flow and affine transformation matrix of that layer are computed on the basis of this initial value, and the result is again passed down. This continues until the bottom layer, the original image, is reached; the optical flow and affine transformation matrix computed there are taken as the final result.
For L = 0, 1, 2, ..., define u^L as the coordinates of the pixel point u at the corresponding point of layer L. From the definition of the image pyramid in the previous step, u^L = u / 2^L.
We now restate the iteration between layers L and L+1 mathematically, assuming that an initial guess g^L = [gx^L, gy^L] for the optical flow and a guess G^L for the transformation matrix at layer L have been propagated down from the layer above. To compute the optical flow d^L and affine transformation matrix A^L on layer L, the matching error ξ^L on layer L is redefined as:
ξ^L(d^L, A^L) = Σx Σy (Ĩ^L(x) − J̃^L(A^L x + d^L))²
where the images Ĩ^L and J̃^L are sampled from the original images on layer L: based on the initial values g^L and G^L, the two corresponding images can be computed as Ĩ^L(x) = I^L(x) and J̃^L(x) = J^L(G^L x + g^L).
By observing the matching error, the window size used for feature point tracking can be reduced when the matching error is relatively small. The next step is to compute the optical flow d^L and transformation matrix A^L on this layer, which is described below under the iterative process. For now, assume they have been computed. The result is then passed to the next layer, which forms that layer's initial guess:
g^(L−1) = 2(g^L + d^L), with G^(L−1) updated accordingly. With g^(L−1) and G^(L−1) as initial values, the above steps are repeated down to layer 0, the original image, at which point the final optical flow d and affine transformation matrix A are obtained. At the top layer, the initial values are set to g = [0, 0] and G = identity.
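The coarse-to-fine propagation of the translational part can be sketched with illustrative numbers (the per-level residuals below are made up): at each level the residual flow d^L found on that level is added to the guess g^L, and the sum is doubled for the next, finer level.

```python
# Hypothetical residual flow d^L found on each pyramid level (coarse to fine).
residual_per_level = {3: 0.4, 2: 0.3, 1: 0.2, 0: 0.1}

g = 0.0                                   # top-level guess starts at zero
for L in range(3, 0, -1):
    g = 2.0 * (g + residual_per_level[L])  # g^(L-1) = 2 * (g^L + d^L)
final_flow = g + residual_per_level[0]     # d = g^0 + d^0 at the original image
print(final_flow)                          # 4.9
```

Note that each per-level residual stays small even though the accumulated flow (4.9 pixels here) is large, which is exactly the advantage stated below.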
The most obvious advantage of this algorithm is that the optical flow at each layer remains small, while the final computed optical flow accumulates across layers, making it possible to track feature points over large motions efficiently.
Iterative Process
This step is the core of the algorithm. At each level of the pyramid, the goal is to compute the optical flow d^L and affine transformation matrix A^L that minimize the error ξ^L. Since the iterative process is the same for every layer, we describe it for a single layer. First, the optical flow from the previous layer is passed to this layer, and the image derivatives at the pixel in the x and y directions are computed:
Ix = [I(x+1, y) − I(x−1, y)] / 2
Iy = [I(x, y+1) − I(x, y−1)] / 2
On this basis, the spatial gradient matrix is computed:
G = Σ [Ix², Ix·Iy; Ix·Iy, Iy²] (summed over the window)
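A sketch of computing G over a tracking window in Python/NumPy (window size and the random test image are illustrative assumptions); G is a sum of outer products, so it is always symmetric and positive semidefinite.

```python
import numpy as np

def gradient_matrix(I, cx, cy, w=2):
    """Spatial gradient matrix over a (2w+1) x (2w+1) window at (cx, cy)."""
    Ix = (np.roll(I, -1, axis=1) - np.roll(I, 1, axis=1)) / 2.0
    Iy = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / 2.0
    wx = Ix[cy-w:cy+w+1, cx-w:cx+w+1]
    wy = Iy[cy-w:cy+w+1, cx-w:cx+w+1]
    return np.array([[np.sum(wx * wx), np.sum(wx * wy)],
                     [np.sum(wx * wy), np.sum(wy * wy)]])

rng = np.random.default_rng(1)
I = rng.random((20, 20))
G = gradient_matrix(I, 10, 10)
print(np.allclose(G, G.T))  # True: G is symmetric by construction
```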
When passing from one layer to the next, the optical flow estimate is doubled: v = 2·v.
Iterative process: compute the grayscale of the corresponding pixel in the next frame, compute the difference δI between the gray values at the same location in the two frames, and from it compute the image error vector:
b = Σ δI · [Ix, Iy]ᵀ (summed over the window)
Finally, the optical flow correction is computed as η = G⁻¹ · b, and the tracking result is updated:
v = v + η
The iteration in this layer ends when η falls below a certain threshold.
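The per-layer iteration can be sketched in Python/NumPy as follows. This is a simplified single-window version under stated assumptions: the scene is a synthetic smooth `pattern` (so the "warp" of frame 2 by the current estimate can be evaluated analytically instead of by bilinear interpolation), and the displacement to recover is made up.

```python
import numpy as np

def pattern(x, y):
    return np.sin(0.3 * x) + np.cos(0.2 * y)

ys, xs = np.mgrid[2:30, 2:30].astype(float)
I = pattern(xs, ys)
true_v = np.array([0.25, 0.15])           # (vx, vy) to recover

# Spatial gradients and the gradient matrix G, fixed over the iteration.
Ix = (pattern(xs + 1, ys) - pattern(xs - 1, ys)) / 2.0
Iy = (pattern(xs, ys + 1) - pattern(xs, ys - 1)) / 2.0
G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
              [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

v = np.zeros(2)
for _ in range(20):
    # Warp frame 2 back by the current estimate and take the difference dI.
    J_back = pattern(xs - true_v[0] + v[0], ys - true_v[1] + v[1])
    dI = I - J_back
    b = np.array([np.sum(dI * Ix), np.sum(dI * Iy)])
    eta = np.linalg.solve(G, b)           # correction: eta = G^-1 * b
    v = v + eta
    if np.hypot(*eta) < 1e-6:             # stop when the correction is tiny
        break
print(v)  # close to [0.25, 0.15]
```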
Feature point selection
A point is easy to track when its gradient matrix G is well conditioned, i.e. when its minimum eigenvalue is large. Feature points can therefore be selected by the following steps:
1. Compute the matrix G and its minimum eigenvalue λm for every pixel of image I.
2. Find the maximum value λmax among the minimum eigenvalues λm over the whole image.
3. Keep the pixels whose minimum eigenvalue λm is greater than a given threshold; the threshold is usually taken between 5%·λmax and 10%·λmax.
4. Keep the pixels where λm is a local maximum: the pixel's λm is greater than the λm of every other pixel in its 3×3 neighborhood.
5. Remove some pixels in pixel-dense areas to ensure that the distance between neighboring feature pixels is greater than a given threshold (5–10 pixels).
When these operations are complete, the remaining pixels of image I are the selected feature points and are used as the points to track. Step 5 of the selection algorithm ensures a minimum distance between feature points.
It is not necessary to use a large integration window to select feature points (or to compute the matrix G). Many experiments have shown that a 3×3 window (wx = wy = 1) achieves satisfactory results.
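The selection steps above can be sketched in Python/NumPy (a brute-force version on a random test image; the image, window size w = 1 and the 10% threshold are illustrative choices; step 5, the minimum-distance pruning, is omitted for brevity):

```python
import numpy as np

def min_eigenvalue_map(I, w=1):
    """Step 1: minimum eigenvalue of G over a (2w+1) x (2w+1) window per pixel."""
    Ix = (np.roll(I, -1, axis=1) - np.roll(I, 1, axis=1)) / 2.0
    Iy = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / 2.0
    h, wd = I.shape
    lam = np.zeros_like(I)
    for y in range(w + 1, h - w - 1):        # skip borders where roll wraps
        for x in range(w + 1, wd - w - 1):
            gx = Ix[y-w:y+w+1, x-w:x+w+1]
            gy = Iy[y-w:y+w+1, x-w:x+w+1]
            G = np.array([[np.sum(gx*gx), np.sum(gx*gy)],
                          [np.sum(gx*gy), np.sum(gy*gy)]])
            lam[y, x] = np.linalg.eigvalsh(G)[0]   # smallest eigenvalue
    return lam

rng = np.random.default_rng(2)
I = rng.random((24, 24))
lam = min_eigenvalue_map(I)
keep = lam > 0.1 * lam.max()                 # steps 2-3: threshold at 10% of lam_max
# Step 4: keep only local maxima of lam in the 3x3 neighborhood.
local_max = np.ones_like(keep)
for dy in (-1, 0, 1):
    for dx in (-1, 0, 1):
        if (dy, dx) != (0, 0):
            local_max &= lam >= np.roll(np.roll(lam, dy, 0), dx, 1)
features = np.argwhere(keep & local_max)     # (row, col) of selected feature points
print(len(features) > 0)
```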
Algorithmic flow
%%% Project Title: Lucas Kanade Motion Estimation Using Pyramids (Level 4)
%%% Note: the project specification says use a density of 10 in plotting
%%% Usage: lucas_kanade('1.bmp', '2.bmp', 10)
function lucas_kanade(file1, file2, density)
%% Read images
img1 = im2double(imread(file1));
%% Take alternating rows and columns
[odd1, even1] = split(img1);
img2 = im2double(imread(file2));
[odd2, even2] = split(img2);
%% Run Lucas Kanade
[Dx, Dy] = estimate(odd1, odd2);
%% Plot
figure;
[maxI, maxJ] = size(Dx);
Dx = Dx(1:density:maxI, 1:density:maxJ);
Dy = Dy(1:density:maxI, 1:density:maxJ);
quiver(1:density:maxJ, maxI:-density:1, Dx, -Dy, 1);
axis square;

%% Run Lucas Kanade on all levels and interpolate
function [Dx, Dy] = estimate(img1, img2)
level = 4;
half_window_size = 2;
[m, n] = size(img1);
G00 = img1;  G10 = img2;
if (level > 0), G01 = reduce(G00);  G11 = reduce(G10);  end
if (level > 1), G02 = reduce(G01);  G12 = reduce(G11);  end
if (level > 2), G03 = reduce(G02);  G13 = reduce(G12);  end
if (level > 3), G04 = reduce(G03);  G14 = reduce(G13);  end
l = level;
for i = level:-1:0
    if (l == level)
        switch (l)
            case 4, Dx = zeros(size(G04));  Dy = zeros(size(G04));
            case 3, Dx = zeros(size(G03));  Dy = zeros(size(G03));
            case 2, Dx = zeros(size(G02));  Dy = zeros(size(G02));
            case 1, Dx = zeros(size(G01));  Dy = zeros(size(G01));
            case 0, Dx = zeros(size(G00));  Dy = zeros(size(G00));
        end
    else
        % Propagate the flow to the finer level and double it.
        Dx = expand(Dx);  Dy = expand(Dy);
        Dx = Dx .* 2;     Dy = Dy .* 2;
    end
    switch (l)
        case 4, W = warp(G04, Dx, Dy); [Vx, Vy] = estimate_motion(W, G14, half_window_size);
        case 3, W = warp(G03, Dx, Dy); [Vx, Vy] = estimate_motion(W, G13, half_window_size);
        case 2, W = warp(G02, Dx, Dy); [Vx, Vy] = estimate_motion(W, G12, half_window_size);
        case 1, W = warp(G01, Dx, Dy); [Vx, Vy] = estimate_motion(W, G11, half_window_size);
        case 0, W = warp(G00, Dx, Dy); [Vx, Vy] = estimate_motion(W, G10, half_window_size);
    end
    [m, n] = size(W);
    Dx(1:m, 1:n) = Dx(1:m, 1:n) + Vx;
    Dy(1:m, 1:n) = Dy(1:m, 1:n) + Vy;
    smooth(Dx);
    smooth(Dy);
    l = l - 1;
end

%% Lucas Kanade on the image sequence at one pyramid step
function [Vx, Vy] = estimate_motion(W, G1, half_window_size)
[m, n] = size(W);
Vx = zeros(size(W));  Vy = zeros(size(W));
N = zeros(2*half_window_size + 1, 5);
for i = 1:m
    l = 0;
    for j = 1-half_window_size:1+half_window_size
        l = l + 1;
        N(l, :) = getslice(W, G1, i, j, half_window_size);
    end
    replace = 1;
    for j = 1:n
        t = sum(N);
        [v, d] = eig([t(1) t(2); t(2) t(3)]);  % eigenvectors/eigenvalues of G
        namda1 = d(1, 1);  namda2 = d(2, 2);
        if (namda1 > namda2)                   % sort so that namda1 <= namda2
            tmp = namda1;  namda1 = namda2;  namda2 = tmp;
            tmp1 = v(:, 1);  v(:, 1) = v(:, 2);  v(:, 2) = tmp1;
        end
        if (namda2 < 0.001)
            % Both eigenvalues too small: no reliable flow at this pixel.
            Vx(i, j) = 0;  Vy(i, j) = 0;
        elseif (namda2 > namda1)
            % Use only the dominant eigenvector direction.
            n2 = v(1, 2) * t(4) + v(2, 2) * t(5);
            Vx(i, j) = n2 * v(1, 2) / namda2;
            Vy(i, j) = n2 * v(2, 2) / namda2;
        else
            n1 = v(1, 1) * t(4) + v(2, 1) * t(5);
            n2 = v(1, 2) * t(4) + v(2, 2) * t(5);
            Vx(i, j) = n1 * v(1, 1) / namda1 + n2 * v(1, 2) / namda2;
            Vy(i, j) = n1 * v(2, 1) / namda1 + n2 * v(2, 2) / namda2;
        end
        % Slide the window: replace the oldest column slice with the next one.
        N(replace, :) = getslice(W, G1, i, j + half_window_size + 1, half_window_size);
        replace = replace + 1;
        if (replace == 2 * half_window_size + 2)
            replace = 1;
        end
    end
end

%% Reduce function for the pyramid
function result = reduce(ori)
[m, n] = size(ori);
mid = zeros(m, n);
m1 = round(m / 2);  n1 = round(n / 2);
result = zeros(m1, n1);
w = generate_filter(0.4);
for j = 1:m
    tmp = conv([ori(j, n-1:n) ori(j, 1:n) ori(j, 1:2)], w);
    mid(j, 1:n1) = tmp(5:2:n+4);
end
for i = 1:n1
    tmp = conv([mid(m-1:m, i); mid(1:m, i); mid(1:2, i)]', w);
    result(1:m1, i) = tmp(5:2:m+4)';
end

%% Expansion function for the pyramid
function result = expand(ori)
[m, n] = size(ori);
m1 = m * 2;  n1 = n * 2;
mid = zeros(m, n1);
result = zeros(m1, n1);
w = generate_filter(0.4);
for j = 1:m
    t = zeros(1, n1);
    t(1:2:n1-1) = ori(j, 1:n);
    tmp = conv([ori(j, n) 0 t ori(j, 1) 0], w);
    mid(j, 1:n1) = 2 .* tmp(5:n1+4);
end
for i = 1:n1
    t = zeros(1, m1);
    t(1:2:m1-1) = mid(1:m, i)';
    tmp = conv([mid(m, i) 0 t mid(1, i) 0], w);
    result(1:m1, i) = 2 .* tmp(5:m1+4)';
end

%% Yields the convolution kernel
function filter = generate_filter(alpha)
filter = [0.25-alpha/2 0.25 alpha 0.25 0.25-alpha/2];

%% Accumulates [Ix^2, Ix*Iy, Iy^2, Ix*dI, Iy*dI] over one window column
function N = getslice(W, G1, i, j, half_window_size)
N = zeros(1, 5);
[m, n] = size(W);
for y = -half_window_size:half_window_size
    y1 = y + i;
    if (y1 < 1),     y1 = y1 + m;
    elseif (y1 > m), y1 = y1 - m;
    end
    x1 = j;
    if (x1 < 1),     x1 = x1 + n;
    elseif (x1 > n), x1 = x1 - n;
    end
    derix = derivative(G1, x1, y1, 'x');
    deriy = derivative(G1, x1, y1, 'y');
    N = N + [derix * derix, ...
             derix * deriy, ...
             deriy * deriy, ...
             derix * (G1(y1, x1) - W(y1, x1)), ...
             deriy * (G1(y1, x1) - W(y1, x1))];
end

function result = smooth(img)
result = expand(reduce(img));

function [odd, even] = split(img)
[m, n] = size(img);
odd  = img(1:2:m, :);
even = img(2:2:m, :);

%% Warp by the current flow, with bilinear interpolation
function result = warp(img, Dx, Dy)
[m, n] = size(img);
[x, y] = meshgrid(1:n, 1:m);
x = x + Dx(1:m, 1:n);
y = y + Dy(1:m, 1:n);
for i = 1:m
    for j = 1:n
        if x(i, j) > n, x(i, j) = n; end
        if x(i, j) < 1, x(i, j) = 1; end
        if y(i, j) > m, y(i, j) = m; end
        if y(i, j) < 1, y(i, j) = 1; end
    end
end
result = interp2(img, x, y, 'linear');

%% Calculates Fx or Fy at one pixel
function result = derivative(img, x, y, direction)
[m, n] = size(img);
switch (direction)
    case 'x'
        if (x == 1),     result = img(y, x+1) - img(y, x);
        elseif (x == n), result = img(y, x) - img(y, x-1);
        else             result = 0.5 * (img(y, x+1) - img(y, x-1));
        end
    case 'y'
        if (y == 1),     result = img(y+1, x) - img(y, x);
        elseif (y == m), result = img(y, x) - img(y-1, x);
        else             result = 0.5 * (img(y+1, x) - img(y-1, x));
        end
end