With a training session coming up, and after a period of working on projects that rely on edge knowledge, I am sorting the relevant material into training notes. I have not written a blog post in a long time, and this knowledge still needs combing through.
I. The Importance of Edges
The importance of edges in image processing is self-evident. Deep learning is currently the leading AI technique, and many of the features that deep learning models build from images start from edges and go on to form higher-level feature descriptions. Let's look at an example, excerpted from Zouxy09's series of learning notes on Deep Learning.
Around 1995, Bruno Olshausen and David Field, two scholars then at Cornell University, tried to study visual problems using both physiological and computational methods.
They collected many black-and-white landscape photos and extracted 400 small fragments from them, each 16x16 pixels in size. Label these 400 fragments s[i], i = 0, ..., 399. Next, from the same black-and-white landscape photos, randomly extract another fragment, also 16x16 pixels, and label it T.
The question they raised was how to pick a set of fragments s[k] from these 400 and superimpose them to synthesize a new fragment that is as similar as possible to the randomly chosen target fragment T, while keeping the number of s[k] as small as possible. In mathematical language:
Sum_k (a[k] * s[k]) -> T, where a[k] is the weighting factor applied to fragment s[k] when superimposing.
To address this problem, Bruno Olshausen and David Field invented an algorithm called sparse coding.
Sparse coding is an iterative process, with each iteration divided into two steps:
1) Select a group of s[k] and adjust a[k] so that Sum_k (a[k] * s[k]) is closest to T.
2) Fix a[k] and, among the 400 fragments, select other more suitable fragments s'[k] to replace the original s[k], so that Sum_k (a[k] * s'[k]) is closest to T.
After several iterations, the best combination of s[k] is selected. Surprisingly, the selected s[k] are basically the edge lines of different objects in the photos. These segments are similar in shape and differ in direction.
The results of Bruno Olshausen and David Field's algorithm coincide with the physiological discoveries of David Hubel and Torsten Wiesel.
In other words, complex images are often made up of a few basic structures. For example, an image can be represented as a linear combination of 64 orthogonal edges (which can be understood as orthogonal basic structures). A sample x, for instance, can be synthesized from three of the 64 edges with weights 0.8, 0.3, and 0.5. The other basic edges make no contribution, so their weights are all 0.
Figure 1
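The fragment-selection problem in the excerpt can be sketched in a few lines. This is only an illustration under stated assumptions, not Olshausen and Field's actual algorithm: it swaps in a greedy orthogonal matching pursuit for the two-step iteration, and it uses random unit vectors in place of real photo fragments. The names `greedy_sparse_code`, `S`, and `T` are hypothetical.

```python
import numpy as np

def greedy_sparse_code(S, T, n_atoms):
    """Greedily pick n_atoms rows of S whose weighted sum approximates T."""
    chosen = []
    coeffs = np.zeros(0)
    residual = T.copy()
    for _ in range(n_atoms):
        # Pick the unused fragment most correlated with what is still unexplained.
        scores = np.abs(S @ residual)
        if chosen:
            scores[chosen] = -np.inf
        chosen.append(int(np.argmax(scores)))
        # Re-fit the weights a[k] of all chosen fragments by least squares.
        A = S[chosen].T
        coeffs, *_ = np.linalg.lstsq(A, T, rcond=None)
        residual = T - A @ coeffs
    return chosen, coeffs

# Stand-in data (assumption): 400 random "fragments" of 16x16 = 256 pixels, unit length.
rng = np.random.default_rng(0)
S = rng.standard_normal((400, 256))
S /= np.linalg.norm(S, axis=1, keepdims=True)

# Target synthesized from three fragments with weights 0.8, 0.3, 0.5.
T = 0.8 * S[10] + 0.3 * S[20] + 0.5 * S[30]

chosen, coeffs = greedy_sparse_code(S, T, n_atoms=3)
error = np.linalg.norm(T - S[chosen].T @ coeffs)
print(sorted(chosen), error)
```

With real image fragments the selected s[k] would be the edge-like patches the excerpt describes; random vectors only demonstrate the selection mechanics.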
The example above simply illustrates the importance of edge features; I will not digress further. Now, on to the main topic.
II. Definition and Types of Edges
Definition: An edge is the dividing line between different regions. It is a set of pixels whose surrounding (local) pixels change significantly, and it has two attributes: amplitude and direction. This is not an absolute definition. The main points to remember are that edges are local features and that edges arise where the surrounding pixels change significantly.
Tip: On the relationship between contours and edges, a contour is generally considered to describe the complete boundary of an object, and edge points connected together form a contour. An edge can be a fragment, whereas a contour is generally complete. A characteristic of human vision is that when we look at an object we generally take in its outline first and its details afterwards. For example, on seeing a few people standing somewhere, we know at a glance whether each is tall, short, heavy, or thin, and only then take in their faces, clothing, and other information.
Types: Edges are simply divided into 4 types: step, ridge (roof), ramp, and pulse. The step and ramp types are similar and differ only in how fast the intensity changes; likewise, the ridge and pulse types are similar. In Figure 2, (a) and (b) can be regarded as step or ramp type, (c) as pulse type, and (d) as ridge type. The difference between a step and a ridge is that a step rises or falls to a certain value and then stays there, whereas a ridge first rises and then falls.
Figure 2 Edge Type
III. Describing Image Edges
We are more concerned with the step and ridge edges, which are characterized by differential operators, as shown in Figure 3
Figure 3
In mathematics, the rate of change of a function is characterized by its derivative. If we treat the image as a two-dimensional function, changes in pixel value can likewise be characterized by derivatives. Because the image is discrete, we use differences between pixels in place of derivatives. For a step edge, Figure 3 shows that the first derivative has an extremum at the edge, and that this extremum corresponds to a zero crossing of the second derivative. In other words, the exact edge position is the extremum of the first derivative, or equivalently the zero crossing of the second derivative (note that a zero crossing is not merely a position where the second derivative equals 0, but a position where it passes between positive and negative values). Therefore, edge detection operators come in first-order and second-order differential types.
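The relationship between the edge position, the first-derivative extremum, and the second-derivative zero crossing can be checked on a one-dimensional signal. The sketch below assumes a smooth step (a sigmoid centered at position 10) as a stand-in for one image row; the tolerance `eps` is only there to guard against floating-point noise.

```python
import numpy as np

# A smooth step edge: intensity rises from 0 to 1 around position 10.
x = np.arange(21)
f = 1.0 / (1.0 + np.exp(-(x - 10.0)))

# First derivative as a central difference: f'(i) ~ (f[i+1] - f[i-1]) / 2.
d1 = np.zeros_like(f)
d1[1:-1] = (f[2:] - f[:-2]) / 2.0

# Second derivative as a second difference: f''(i) ~ f[i+1] - 2 f[i] + f[i-1].
d2 = np.zeros_like(f)
d2[1:-1] = f[2:] - 2.0 * f[1:-1] + f[:-2]

# The first derivative peaks at the edge.
peak = int(np.argmax(d1))

# A zero crossing: the second derivative is positive on one side, negative on the other.
eps = 1e-12
crossings = [i for i in range(1, len(f) - 1)
             if d2[i - 1] > eps and d2[i + 1] < -eps]

print(peak, crossings)
```

Both tests point at the same position, which is why either a first-order or a second-order operator can be used to locate a step edge.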
IV. Categories of Edge Detection Operators
Common edge detection operators: Roberts, Sobel, Prewitt, Laplacian, LoG/Marr, Canny, Kirsch, Nevatia
First-order differential operators: Roberts, Sobel, Prewitt
The Roberts operator is one of the first edge detection operators, proposed by Lawrence Roberts in 1963.
The Sobel edge operator was proposed by Irwin Sobel. He did not publish a paper on it at the time. It was only presented at a doctoral symposium in 1968 ("A 3x3 Isotropic Gradient Operator for Image Processing") and was later made public in a footnote of a monograph published in 1973 ("Pattern Classification and Scene Analysis").
The Prewitt operator comes from J.M.S. Prewitt, "Object Enhancement and Extraction", in "Picture Processing and Psychopictorics", Academic Press, 1970.
Let's look at the templates of these three edge detection operators and write them in difference form.
Roberts operator
Sobel operator
Prewitt operator
Figure 4 First-order differential operators
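As an illustration of what "difference form" means, the Roberts operator can be written directly as differences of diagonal neighbors inside each 2x2 window. This is a minimal sketch: the sign and orientation conventions are an assumption (they vary between texts), and the amplitude is taken as the sum of absolute values.

```python
import numpy as np

def roberts(f):
    """Roberts cross operator written as pixel differences."""
    f = f.astype(float)
    # Diagonal differences over each 2x2 neighbourhood.
    d1 = f[:-1, :-1] - f[1:, 1:]   # f(i, j)   - f(i+1, j+1)
    d2 = f[:-1, 1:] - f[1:, :-1]   # f(i, j+1) - f(i+1, j)
    return np.abs(d1) + np.abs(d2)

# 5x5 test image: dark on the left, bright on the right (a vertical step edge).
img = np.zeros((5, 5))
img[:, 3:] = 10

out = roberts(img)
print(out)
```

The output is one row and one column smaller than the input, and it is nonzero only in the column where the 2x2 window straddles the step.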
How are the edge amplitude and direction calculated? Take the Sobel operator as an example. The two 3*3 Sobel templates, one per direction, slide over the image. At each position, the template is multiplied element by element with the 9 pixels of the 3*3 image region it covers, and the products are summed to give the edge response in that direction.
G = sqrt(Gx^2 + Gy^2), θ = arctan(Gy / Gx)
Here f(x, y) is the image, Gx and Gy are the convolution results of f with the horizontal and vertical operators respectively, G is the resulting edge amplitude, and θ is the edge direction. The calculation of G is sometimes simplified to
G = |Gx| + |Gy|
or
G = max(|Gx|, |Gy|)
There are several ways to compute the amplitude. Depending on the specific application, one detects in the horizontal direction, the vertical direction, or both at once.
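The sliding-template computation and the amplitude and direction formulas can be sketched as follows. Assumptions: the common Sobel templates with weights 1, 2, 1 are used; the template is not flipped, so strictly this is correlation rather than convolution (for Sobel this only changes the sign); border pixels are skipped; and `arctan2` replaces `arctan` to avoid dividing by zero. The function names are hypothetical.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # responds to change along x (columns)
SOBEL_Y = SOBEL_X.T                              # responds to change along y (rows)

def slide(f, k):
    """Slide a 3x3 template over f; multiply with the covered 3x3 region and sum."""
    h, w = f.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(f[i:i + 3, j:j + 3] * k)
    return out

def sobel(f):
    gx, gy = slide(f, SOBEL_X), slide(f, SOBEL_Y)
    g = np.sqrt(gx ** 2 + gy ** 2)               # full amplitude
    g_abs = np.abs(gx) + np.abs(gy)              # simplified: sum of absolute values
    g_max = np.maximum(np.abs(gx), np.abs(gy))   # simplified: larger absolute value
    theta = np.arctan2(gy, gx)                   # direction in radians
    return g, g_abs, g_max, theta

img = np.zeros((5, 6))
img[:, 3:] = 10.0   # vertical step edge between columns 2 and 3

g, g_abs, g_max, theta = sobel(img)
print(g[0])
```

On a purely vertical edge Gy is zero, so all three amplitude formulas agree; they differ only where both directions respond.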
In addition, there is a variant of the Sobel operator, the isotropic Sobel operator, whose template is
Figure 5 Isotropic Sobel operator
The weights of the isotropic Sobel operator are more accurate than those of the ordinary Sobel operator. Why? A template weight should have less influence (a smaller absolute value) the farther its position is from the template center. As in the figure above, treat the template as 9 small squares of side length 1. The hypotenuse of the dashed triangle then has length √2, and each right-angle side has length 1. If the absolute value of the weight at position (0,0) is 1, then by this distance relationship the absolute value of the weight at position (1,0) should be √2 to be accurate, rather than the 2 used by the ordinary Sobel operator.
Second-order differential operators: Laplacian, LoG/Marr
The Laplace operator is named after Pierre-Simon Laplace. The LoG (Laplacian of Gaussian) operator, also known as the Marr operator, was proposed by David Courtenay Marr and Ellen Hildreth in 1980. Marr, a founder of computational neuroscience, died prematurely of leukemia in 1980, the year the paper was formally published. The Marr Prize was later established to commemorate his contribution, and it is now awarded every two years at ICCV (which, together with ECCV and CVPR, is regarded as one of the three top conferences in computer vision). The two operator templates are as follows:
Laplacian operator (two types of templates)
Log operator
Figure 6 Second-order differential operators
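The mathematical formula of the Laplace operator is
∇^2 f = ∂^2 f / ∂x^2 + ∂^2 f / ∂y^2
Written in difference form, it is
∇^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y)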
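The difference form of the Laplace operator maps directly onto array slicing. The sketch below assumes the two commonly seen 3x3 templates (4-neighbor and 8-neighbor) with a negative center; some texts use the opposite sign. It checks that the operator gives no response on a linear ramp and a positive-to-negative transition across a step.

```python
import numpy as np

LAP4 = np.array([[0, 1, 0],
                 [1, -4, 1],
                 [0, 1, 0]], dtype=float)   # 4-neighbour template
LAP8 = np.array([[1, 1, 1],
                 [1, -8, 1],
                 [1, 1, 1]], dtype=float)   # 8-neighbour template

def laplacian4(f):
    """Sum of the 4 neighbours minus 4 times the centre, on interior pixels."""
    return (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:] + f[1:-1, :-2]
            - 4.0 * f[1:-1, 1:-1])

# A linear ramp has zero second derivative; a step edge does not.
ramp = np.tile(np.arange(6, dtype=float), (5, 1))
step = np.zeros((5, 6))
step[:, 3:] = 10.0

lap_ramp = laplacian4(ramp)
lap_step = laplacian4(step)
print(lap_step[0])
```

Both templates sum to 0, which is why uniform regions produce no response.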
LoG edge detection first applies a Gaussian filter, then the Laplace operator, and finally locates the zero crossings to determine the edge position. Often we only know the 5*5 LoG template shown above, but where does it come from? The derivation follows.
The two-dimensional Gaussian formula is
G(x, y) = 1 / (2πσ^2) * exp(-(x^2 + y^2) / (2σ^2))
Taking the second derivatives in the x and y directions and summing them according to the Laplace operator formula gives
LoG(x, y) = 1 / (2πσ^4) * ((x^2 + y^2) / σ^2 - 2) * exp(-(x^2 + y^2) / (2σ^2))
Here x and y must not be read as template positions. They are the offsets from another position to the center position, so we write
LoG(x, y) = 1 / (2πσ^4) * (((x - x0)^2 + (y - y0)^2) / σ^2 - 2) * exp(-((x - x0)^2 + (y - y0)^2) / (2σ^2))
Here (x0, y0) is the template center and (x, y) is any other template position. For a 5*5 template, x0 = 2 and y0 = 2. For the weight at template position (0,0), substitute x = 0, y = 0, x0 = 2, y0 = 2 into the formula above and let σ = 1, which gives approximately 0.0175. Computing every position in the same way gives the full set of weights.
The template shown in Figure 6 is obtained by rounding to integers, flipping the sign, and making the weights of the template sum to 0.
A further question is how the template size is chosen. For a Gaussian distribution, the range (-3σ, 3σ) covers the vast majority of the area, so the template size is generally taken as dim = 1 + 6σ (the Gaussian blur in the SIFT feature uses the same rule). If dim is not an integer, take the smallest integer not less than dim. In practice the rule is not applied so strictly: above we took σ = 1 and a 5*5 template. Changing σ adjusts the weights within a template of the same size, and once σ changes beyond a certain extent the template size changes too (this is my personal understanding; corrections are welcome).
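The LoG weights and the size rule can be checked numerically. This sketch assumes the standard Laplacian-of-Gaussian expression (the second derivatives of a 2D Gaussian summed over x and y), sampled at offsets from the template center. The function name `log_kernel` is hypothetical, and bumping an even default size up to the next odd number is an added assumption so that the template has a center pixel.

```python
import math
import numpy as np

def log_kernel(sigma=1.0, size=None):
    """Sample the Laplacian of Gaussian on a size x size grid around the centre."""
    if size is None:
        size = math.ceil(1 + 6 * sigma)    # rule of thumb: dim = 1 + 6 * sigma
        if size % 2 == 0:
            size += 1                      # assumption: keep a centre pixel
    c = size // 2                          # centre index, x0 = y0 = c
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - c) ** 2 + (y - c) ** 2       # squared distance to the centre
    return ((r2 / sigma ** 2 - 2.0) * np.exp(-r2 / (2.0 * sigma ** 2))
            / (2.0 * math.pi * sigma ** 4))

k = log_kernel(sigma=1.0, size=5)
print(round(k[0, 0], 4))   # corner weight of the 5x5 template: 0.0175
print(log_kernel(sigma=1.0).shape)
```

The raw weights are small real numbers; scaling, rounding, and sign conventions then turn them into an integer template.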
Non-differential edge detection operator: Canny
You should already be familiar with Canny edge detection. The steps are listed here; for a detailed description, see the linked article on the Canny operator.
1. Convert the color image to a grayscale image
2. Apply Gaussian blur to the image
3. Calculate the image gradient, then calculate the edge amplitude and angle from the gradient (in practice, a differential edge detection operator is used to compute the gradient amplitude and direction)
4. Non-maximum suppression (edge thinning)
5. Double-threshold edge linking
6. Output the result as a binary image
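Of the steps above, double-threshold edge linking is the least obvious, so here is a minimal sketch of that step alone. It assumes a gradient amplitude image `mag` has already been computed and thinned. Pixels at or above `high` are kept, and pixels at or above `low` are kept only if they are 8-connected to a kept pixel. The function name and the simple grow-until-stable loop are illustrative choices, not how any particular library implements it.

```python
import numpy as np

def hysteresis(mag, low, high):
    """Keep strong pixels, plus weak pixels 8-connected to a kept pixel."""
    strong = mag >= high
    weak = mag >= low
    edges = strong.copy()
    h, w = edges.shape
    while True:
        # Grow the current edge set by one pixel in all 8 directions.
        p = np.pad(edges, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(edges)
        for di in range(3):
            for dj in range(3):
                grown |= p[di:di + h, dj:dj + w]
        # Only pixels that pass the low threshold may join the edge set.
        new_edges = grown & weak
        if np.array_equal(new_edges, edges):
            return edges
        edges = new_edges

mag = np.array([[0, 0, 0, 0, 0],
                [9, 5, 5, 0, 5],
                [0, 0, 0, 0, 0]], dtype=float)

edges = hysteresis(mag, low=4, high=8)
print(edges[1])
```

In the example, the two weak pixels next to the strong one are linked into the edge, while the isolated weak pixel on the right is discarded.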
Directional operators: Kirsch (8 3*3 templates), Nevatia (12 5*5 templates)
Both operators apply sub-templates in multiple directions. The largest response among them is taken as the final edge amplitude, and the edge direction is the direction of the template that produced that largest response.
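The "largest response over several directional templates" idea can be sketched with the Kirsch operator. The base template with weights 5 and -3 is the commonly cited one; building the other seven templates by rotating the outer ring in 45-degree steps, and the function names, are illustrative assumptions. The Nevatia templates are not reproduced here.

```python
import numpy as np

def kirsch_kernels():
    """Build 8 Kirsch templates by rotating the outer ring of the base template."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]  # clockwise
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = []
    for shift in range(8):
        k = np.zeros((3, 3))
        for n, (i, j) in enumerate(ring):
            k[i, j] = base[(n - shift) % 8]
        kernels.append(k)
    return kernels

def kirsch(f):
    """Per pixel: the largest response over the 8 templates, and its template index."""
    kernels = kirsch_kernels()
    h, w = f.shape
    resp = np.zeros((8, h - 2, w - 2))
    for d, k in enumerate(kernels):
        for i in range(h - 2):
            for j in range(w - 2):
                resp[d, i, j] = np.sum(f[i:i + 3, j:j + 3] * k)
    return resp.max(axis=0), resp.argmax(axis=0)

img = np.zeros((5, 6))
img[:, 3:] = 10.0   # vertical step edge, bright on the right

amp, direction = kirsch(img)
print(amp[0], direction[0, 1])
```

Template index 2 is the one whose 5s sit in the right-hand column, so it responds most strongly where the bright side of the step falls in that column.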
V. Comparison of Edge Detection Operators
OpenCV Direct invocation of Roberts and Prewitt edge detection