Contour detection and hierarchical image segmentation (image processing)


Most of this article comes from the reference links at the end, plus some of my own understanding of the source code and the paper. I am writing it down for later reference.

1. The principle of the paper

Algorithm pipeline: gPb --> OWT --> UCM

The function of each part: gPb (globalized probability of boundary) computes the probability that each pixel is a boundary, i.e. the pixel weight; OWT (oriented watershed transform) converts the gPb result into a set of closed regions; UCM (ultrametric contour map) converts that region set into a hierarchical tree.

There are several terms here that need explaining, such as: what is a hierarchical tree? What is the oriented watershed transform?

1.1 gPb (globalized probability of boundary)

gPb is the weighted sum of mPb and sPb. So what are mPb and sPb?

- Step 1: compute G(x,y,θ)
For each pixel, draw a circle centered on it:

A diameter at angle θ splits the disc into two half-discs; for the pixels in each half-disc, build a histogram of their values, as follows:

Use the histogram data to compute the chi-squared (χ²) distance between the two half-disc histograms g and h:

$$\chi^2(g,h) = \frac{1}{2} \sum_k \frac{(g_k - h_k)^2}{g_k + h_k}$$

This distance is G(x,y,θ), representing the gradient magnitude of pixel (x,y) in the direction θ.
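To make the χ² step concrete, here is a minimal standalone sketch (not the BSR implementation; the histogram values and all names are hypothetical) of the distance between two half-disc histograms:

#include <cstddef>
#include <cstdio>
#include <vector>

/* chi^2(g,h) = 1/2 * sum_k (g_k - h_k)^2 / (g_k + h_k), skipping empty bins */
double chi_squared_distance(const std::vector<double>& g,
                            const std::vector<double>& h) {
    double dist = 0.0;
    for (std::size_t k = 0; k < g.size(); ++k) {
        double sum = g[k] + h[k];
        if (sum > 0.0) {
            double diff = g[k] - h[k];
            dist += (diff * diff) / sum;
        }
    }
    return 0.5 * dist;
}

int main() {
    /* two normalized 4-bin histograms from the two half-discs */
    std::vector<double> g = {0.70, 0.20, 0.10, 0.00};
    std::vector<double> h = {0.10, 0.20, 0.30, 0.40};
    std::printf("G(x,y,theta) = %f\n", chi_squared_distance(g, h));
    return 0;
}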

- Step 2: compute mPb
The ordinary Pb algorithm breaks a picture into 4 feature channels: brightness, color a, color b, and a texture channel, where the first three channels come from the CIE Lab color space.

The weight of each pixel is the weighted sum of the G(x,y,θ) values computed on these 4 channels.

For the ordinary Pb algorithm, the author proposes a multiscale variant: mPb.

It keeps the original Pb computation but uses several disc diameters δ at once (the author uses three: [δ/2, δ, 2δ]); for each δ, G(x,y,θ) is computed. The final formula is as follows:

$$mPb(x,y,\theta) = \sum_{s} \sum_{i} \alpha_{i,s} \, G_{i,\sigma(i,s)}(x,y,\theta)$$

In the formula, i indexes the channel and s the scale.
It means that for each pixel, we compute its G value on each feature channel under each of the diameters and combine them.
α_{i,s} is the weight of each feature channel at each diameter; the weights are obtained by gradient ascent on the F-measure, using the BSDS training set.
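As a toy illustration of the double sum, the following standalone sketch combines invented per-channel, per-scale responses with placeholder α weights (the real weights are learned on BSDS; none of these numbers come from the paper):

#include <cstdio>

int main() {
    const int n_channels = 4;  /* brightness, color a, color b, texture */
    const int n_scales   = 3;  /* diameters delta/2, delta, 2*delta */
    /* G[i][s]: half-disc gradient of channel i at scale s, one pixel/orientation */
    double G[4][3] = {
        {0.1, 0.2, 0.3}, {0.0, 0.1, 0.1}, {0.2, 0.2, 0.4}, {0.3, 0.5, 0.6}
    };
    double alpha[4][3] = {
        {0.10, 0.10, 0.10}, {0.05, 0.05, 0.05},
        {0.05, 0.05, 0.05}, {0.10, 0.10, 0.10}
    };
    /* mPb(x,y,theta) = sum_s sum_i alpha_{i,s} * G_{i,sigma(i,s)}(x,y,theta) */
    double mpb = 0.0;
    for (int s = 0; s < n_scales; ++s)
        for (int i = 0; i < n_channels; ++i)
            mpb += alpha[i][s] * G[i][s];
    std::printf("mPb(x,y,theta) = %f\n", mpb);
    return 0;
}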

- Step 3: compute sPb
The author first builds a sparse symmetric affinity matrix W, where each element W_ij is computed as follows:

$$W_{ij} = \exp\left(-\max_{p \in \overline{ij}} \{\, mPb(p) \,\} \,/\, \rho\right)$$

Here i and j are two pixels whose distance does not exceed a radius r (in pixels; the author sets r = 5 in the code), p is any point on the line segment connecting the two pixels, and the maximum pixel weight along that segment is taken. ρ is a constant, set to ρ = 0.1 in the author's code.
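Below is a minimal sketch of one affinity entry, assuming the mPb values sit in a row-major array; the segment is rasterized with a simple parametric walk, which is my assumption and not necessarily how the BSR code traces the line:

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

/* W_ij = exp( - max_{p on segment ij} mPb(p) / rho ) */
double affinity(const std::vector<double>& mpb, int W,
                int xi, int yi, int xj, int yj, double rho) {
    int steps = std::max(std::abs(xj - xi), std::abs(yj - yi));
    double max_pb = 0.0;
    for (int t = 0; t <= steps; ++t) {
        double a = (steps == 0) ? 0.0 : double(t) / steps;
        int x = int(std::lround(xi + a * (xj - xi)));
        int y = int(std::lround(yi + a * (yj - yi)));
        max_pb = std::max(max_pb, mpb[y * W + x]);
    }
    return std::exp(-max_pb / rho);
}

int main() {
    /* 5x5 toy mPb map with a strong vertical edge in the middle column */
    int W = 5, H = 5;
    std::vector<double> mpb(W * H, 0.0);
    for (int y = 0; y < H; ++y) mpb[y * W + 2] = 0.9;
    /* pixels on opposite sides of the edge get a very low affinity */
    std::printf("across edge: %f\n", affinity(mpb, W, 0, 2, 4, 2, 0.1));
    /* pixels on the same side stay at affinity 1 here */
    std::printf("same side:   %f\n", affinity(mpb, W, 0, 0, 0, 4, 0.1));
    return 0;
}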

The matrix W represents the similarity between pixels. By setting

$$D_{ii} = \sum_{j} W_{ij}$$

we get the diagonal matrix D, and by solving the generalized eigenproblem

$$(D - W)\,v = \lambda D v$$

the first n+1 eigenvectors are computed; the author uses n = 16 in the code.

Then the author treats each eigenvector v_k as an image, and convolves it with Gaussian directional derivative filters to obtain $\nabla_\theta v_k(x,y)$:

Thus the sPb formula is obtained:

$$sPb(x,y,\theta) = \sum_{k=1}^{n} \frac{1}{\sqrt{\lambda_k}} \cdot \nabla_\theta v_k(x,y)$$

Regarding the parameters: the physical interpretation of the eigenvectors is as a mass-spring system, which motivates the $1/\sqrt{\lambda_k}$ weighting.
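Here is a toy sketch of the sPb sum, assuming the directional-derivative responses of the eigenvectors and the eigenvalues have already been computed for one pixel and orientation (all numbers are invented):

#include <cmath>
#include <cstdio>

int main() {
    const int n = 4;  /* the author's code uses n = 16 */
    double lambda[n] = {0.01, 0.02, 0.05, 0.10};  /* eigenvalues (constant v_0 skipped) */
    double grad[n]   = {0.30, 0.10, 0.05, 0.02};  /* grad_theta v_k(x,y) */
    /* sPb(x,y,theta) = sum_{k=1..n} (1 / sqrt(lambda_k)) * grad_theta v_k(x,y) */
    double spb = 0.0;
    for (int k = 0; k < n; ++k)
        spb += grad[k] / std::sqrt(lambda[k]);
    std::printf("sPb(x,y,theta) = %f\n", spb);
    return 0;
}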

- Step 4: compute gPb
Combine mPb and sPb to get gPb:

$$gPb(x,y,\theta) = \sum_{s} \sum_{i} \beta_{i,s} \, G_{i,\sigma(i,s)}(x,y,\theta) + \gamma \cdot sPb(x,y,\theta)$$

The β parameters were explained in the previous steps; the parameter γ is obtained by gradient ascent on the F-measure, $F = 2PR/(P+R)$, using the BSDS training set.
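The following toy sketch combines the two terms for one pixel and one orientation, including the sigmoid rescaling described just below; the β-weighted sum is collapsed into a single precomputed term, and all numbers are placeholders rather than the learned BSDS values:

#include <cmath>
#include <cstdio>

int main() {
    double mpb_term = 0.45;  /* sum_s sum_i beta_{i,s} * G_{i,sigma(i,s)}(x,y,theta) */
    double spb      = 0.80;  /* spectral component sPb(x,y,theta) */
    double gamma    = 0.5;   /* relative weight of the spectral component */
    double gpb = mpb_term + gamma * spb;
    /* logistic squashing so the value can be read as a boundary probability */
    double prob = 1.0 / (1.0 + std::exp(-gpb));
    std::printf("gPb = %f, boundary probability = %f\n", gpb, prob);
    return 0;
}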

Finally, a sigmoid transformation is applied to the gPb value so that it lies between 0 and 1, serving as the probability that the pixel is a boundary; I call this the pixel weight.

1.2 OWT (oriented watershed transform)

For each pixel, substitute the eight fixed angles θ ∈ [0, π) and take the maximum value as the edge weight of the pixel, where E(x,y,θ) is given by the gPb formula:

$$E(x,y) = \max_{\theta} \, E(x,y,\theta)$$

In this way, each pixel is given a value between 0 and 1 whose magnitude indicates how likely that pixel is to be a boundary.

Then the watershed transform (WT) is used to convert this input into a set of regions P0 and arcs K0, as shown in the figure:

In the figure, the red dots are the regional minima, and the arcs are their boundaries.

The original WT algorithm uses the average of the pixel weights on an arc as the arc's strength.

However, some pixels of a weak arc may lie in the vicinity of a strong arc; for those pixels, the direction θ belonging to the strong arc is chosen in the computation, so their values come out large and the strength of the weak arc is correspondingly inflated.

To compute the value of an arc pixel, one of the eight directional weights has to be selected. The earlier choice was always the θ that maximizes E(x,y,θ), regardless of circumstances and without considering the direction of the arc itself. This can make elements that should be small come out too large, because the value is taken along the maximizing direction (typically that of a nearby strong edge) instead of the arc's own direction.

As shown in the figure below, many horizontal strong arcs appear between the two stone heads:

The original image has no such horizontal strong edges, so this result is unreasonable.

The author therefore proposes OWT: building on the original WT, for every pixel on an arc a sensible direction θ is selected and E(x,y,θ) is recomputed, adjusting the strength value of the arc. The method is as follows:

So, for all the pixels on an arc, we want to compute the arc strength; but how do we determine which pixels lie on an arc? Use the raw WT to decide which pixels belong to regions and which belong to arcs, then recompute the weights of all pixels on arcs by evaluating E(x,y,θ) along the arc's own direction to get E(x,y), and finally compute the strength of each arc as the average weight of all pixels on it.

The procedure is as follows (a small sketch follows the list):
1. For each arc, subdivide (split) the arc into many line segments, as shown in the figure.
2. Compute the orientation of each segment, written o(x,y), and use the following formula to recompute the gPb value E(x,y) of each pixel: $E(x,y) = E(x, y, o(x,y))$.
3. Recompute the strength of each arc as the average value of all pixels on the arc; this becomes the final weight of the arc.
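Here is a minimal sketch of the reweighting, assuming E(x,y,θ) has been computed for 8 quantized orientations and each arc is given as a list of pixels tagged with the orientation index o(x,y) of the segment it falls on; the data structures are hypothetical simplifications of the BSR code:

#include <cstdio>
#include <vector>

struct ArcPixel {
    int x, y;
    int seg_ori;  /* index of the orientation o(x,y) of this pixel's segment */
};

/* E is indexed as E[ori][y*W + x], with 8 orientations in [0, pi) */
double arc_strength(const std::vector<ArcPixel>& arc,
                    const std::vector<std::vector<double>>& E, int W) {
    if (arc.empty()) return 0.0;
    double sum = 0.0;
    for (const ArcPixel& p : arc)
        sum += E[p.seg_ori][p.y * W + p.x];  /* E(x, y, o(x,y)) */
    return sum / static_cast<double>(arc.size());  /* mean over the arc */
}

int main() {
    /* toy horizontal arc (orientation index 0) on a 4x4 image */
    int W = 4;
    std::vector<std::vector<double>> E(8, std::vector<double>(W * W, 0.1));
    E[4][1 * W + 2] = 0.9;  /* strong response at another orientation... */
    std::vector<ArcPixel> arc = {{0,1,0}, {1,1,0}, {2,1,0}, {3,1,0}};
    /* ...is ignored, because each pixel is read along its own arc direction */
    std::printf("arc strength = %f\n", arc_strength(arc, E, W));
    return 0;
}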

Left: before the modification; right: after the modification:

To recap: the previous section uses four channel features (brightness, color a, color b, and texture) and three radius scales to compute the weight of each pixel in eight directions; the weight indicates the probability that the pixel is a boundary, i.e. the larger the value in some direction, the more likely the point is a boundary. Taking the maximum of E(x,y,θ) over θ gives gPb, the probability that pixel (x,y) is a boundary. The resulting E(x,y) is used as input to WT, which divides all pixels into regions P0 and arcs K0. In the plain WT algorithm, the weight of an arc is directly the mean of the weights of all pixels on it; OWT instead recomputes E(x,y,o(x,y)) in the arc direction for each pixel on the arc, and then takes the mean over the arc's pixels as the value of the arc.

The reason for recomputing: if we took the mean directly, a point with a small weight on an arc might sit next to a point with a large weight, and the mean would drag the small point's contribution upward. Take the horizontal line as an example: its values are originally small, but at the intersection the weight along the red-line direction is large. If the intersection pixel takes its value along the red-line direction, then averaging over the arc pulls up the overall weight of the horizontal line. After the modification, the intersection pixel's value along the red-line direction is still large, but its weight along the horizontal direction is small, so the overall weight of the horizontal line becomes more accurate.

Why compute the weights in eight directions for a pixel in the first place? I think the point is to find the direction in which the two sides of the pixel differ the most, i.e. the direction of the edge; the eight-direction computation exists precisely for this second step, OWT.
1.3 UCM (ultrametric contour map)

To segment the image at different levels of detail, the author uses the ultrametric contour map (UCM).

The OWT step has already output the region set at the highest level of detail; next, the author builds a graph, as follows:

where P0 is the set of regions, K0 the set of arcs, and W(K0) the strength of the arcs. The graph takes each region as a node; if two regions are adjacent, the corresponding two nodes are connected, and the strength of the connection is W(K0).

Next, the dissimilarity between each pair of regions is set to the average strength of their common arc.

A graph-based merging technique is then used: measure the pairwise dissimilarities between regions, sort them in ascending order, and repeatedly merge the pair of regions with the smallest dissimilarity, until only one region remains. This completes the construction of the hierarchical tree.

The construction of the hierarchical tree is similar to the construction of a Huffman tree.
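As an illustration of this merging process, here is a minimal union-find sketch of the greedy merge, assuming the region-adjacency edges carry the mean strength of the common arc (the real code also updates arc strengths as regions merge, which is omitted here; all numbers are invented):

#include <algorithm>
#include <cstdio>
#include <vector>

struct Edge { int a, b; double w; };  /* regions a,b share an arc of strength w */

/* classic union-find over region ids */
std::vector<int> parent;
int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }

int main() {
    int n_regions = 4;
    parent.resize(n_regions);
    for (int i = 0; i < n_regions; ++i) parent[i] = i;
    std::vector<Edge> edges = {{0,1,0.2}, {1,2,0.5}, {2,3,0.3}, {0,3,0.9}};
    /* sort by ascending dissimilarity and merge, weakest arc first */
    std::sort(edges.begin(), edges.end(),
              [](const Edge& e1, const Edge& e2) { return e1.w < e2.w; });
    for (const Edge& e : edges) {
        int ra = find(e.a), rb = find(e.b);
        if (ra != rb) {
            parent[ra] = rb;  /* the new node's height is the strength of this arc */
            std::printf("merge %d and %d at height %.2f\n", e.a, e.b, e.w);
        }
    }
    return 0;
}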

In this tree, because each step removes the arc with the smallest dissimilarity and merges two regions, the height of a region node in the tree is the strength value of the arc that was removed when that region was created by merging, namely:

$$H(R) = W(C)$$

So you can obtain a matrix whose elements represent the pairwise dissimilarities between all regions of the finest-detail segmentation; each value is determined by the height of the smallest common region containing the two regions in the tree.

The element values are computed as follows:

$$D(R_1, R_2) = \min \{\, H(R) \,:\, R_1, R_2 \subseteq R \,\}$$

Summary: hierarchical tree
The hierarchical tree takes the OWT result, with regions as vertices and arc strengths as weights, and uses a graph merging technique similar to Huffman tree construction: each time, the two smallest nodes are selected from the candidate set and merged. Here that means selecting, from the candidate regions, the two regions with the smallest dissimilarity, i.e. the two closest regions.
The paper sets the dissimilarity between a pair of regions to the average strength of their common arc.
If two regions are adjacent, their dissimilarity is the strength of their common arc; if they are not adjacent, it is the height of their smallest common ancestor region, as shown in the following illustration:

Set:
D(R1, R) = avg(arc1), i.e. the distance between R1 and R is the average strength of their common arc;
D(R, R2) = avg(arc2).
Then D(R1, R2) = max(D(R1, R), D(R, R2)).
The reason: assume the average strength of arc1 is less than that of arc2, i.e. D(R1, R) < D(R, R2). Then R and R1 are merged first into a new region R3, and the dissimilarity between R3 and R2 is the average strength of arc2, because the common arc of R3 and R2 is arc2. So the dissimilarity between R1 and R2 is as given above.
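A worked example with invented numbers may help: suppose avg(arc1) = 0.2 and avg(arc2) = 0.5, so D(R1,R) = 0.2 and D(R,R2) = 0.5. The merge at height 0.2 joins R1 and R into R3, but R1 and R2 still only touch through arc2, so they first appear together in a node at the merge of height 0.5. The smallest common region containing R1 and R2 therefore has height 0.5 = max(0.2, 0.5), which is exactly D(R1,R2) = max(D(R1,R), D(R,R2)).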

I still do not fully understand the calculation of the above two formulas...

This is the ultrametric contour map (UCM).

Therefore, different thresholds k can be set to obtain segmentations at different levels of detail.

1.4 Summary

Relative to the original method, the author makes four main innovations. In the mPb stage of the contour detector, the concept of multiscale is introduced, yielding the mPb algorithm, which can be regarded as an enhanced version of the ordinary Pb algorithm; the formula is:

$$mPb(x,y,\theta) = \sum_{s} \sum_{i} \alpha_{i,s} \, G_{i,\sigma(i,s)}(x,y,\theta)$$

In the sPb stage of the contour detector, the eigenvectors are convolved with Gaussian directional derivative filters; the formula is:

$$sPb(x,y,\theta) = \sum_{k=1}^{n} \frac{1}{\sqrt{\lambda_k}} \cdot \nabla_\theta v_k(x,y)$$

OWT (oriented watershed transform) is proposed: the boundaries that in the original algorithm were influenced by nearby strong pixels are recomputed using the direction of the arc they belong to.

Combining the region set generated by OWT into a UCM (ultrametric contour map) allows us to output image contours at different levels of detail through the threshold k.

2. Description of part of the code

If you want the segmentation result images, run the code in BSR/grouping; if you want to see the P-R curves, run the code in BSR/bench.

To understand the principle, it is best to download the source code.

The grouping folder contains the folders data, interactive, lib, and source, plus the files example.m and run_bsds500.m:
data folder: contains the input files and output files of the running program;
lib folder: the .m files use the addpath directive to add the files in this directory to the working path; the remaining files are C++ files compiled with MEX, or .m files that they use;
source folder: source files; some files in lib are compiled from source files in source.

The following is a brief description of the internal implementation of the interface function.
The annotation of the interface function is divided into the following sections:

/* compute bg histogram smoothing kernel */
/* get image */
/* mirror border */
/* convert to grayscale */
/* gamma correct */
/* convert to lab */
/* quantize color channels */
/* compute texton filter set */
/* compute textons */
/* return textons */
/* compute bg at each radius */
/* compute cga at each radius */
/* compute cgb at each radius */
/* compute tg at each radius */
/* return textons */

The output corresponding to each part of the program is:

lib_image is a class whose static member functions are called in the form lib_image::grayscale(). The lib_image class is declared in
BSR/grouping/source/gpb_src/include/math/libraries/lib_image.hh;
the corresponding functions are implemented in
BSR/grouping/source/gpb_src/src/math/libraries/lib_image.cc.

Part of the code implementation is shown below. The code layout is easy to follow: the author puts the implementations of related functions together and comments them in some detail.

/* Image processing functions. */
class lib_image {
public:
   /*********************************************************************
    * Image color space transforms.
    * Input RGB images should be scaled so that the range of possible
    * values for each color channel is [0,1].
    *********************************************************************/

   /* Compute a grayscale image from an RGB image. */
   static matrix<> grayscale(const matrix<>& /*r*/, const matrix<>& /*g*/, const matrix<>& /*b*/);

   /* Normalize a grayscale image so that intensity values lie in [0,1]. */
   static void grayscale_normalize(matrix<>&);

   /* Normalize a grayscale image so that intensity values span the full [0,1] range. */
   static void grayscale_normalize_stretch(matrix<>&);

   /* Gamma correct the RGB image using the given correction value. */
   static void rgb_gamma_correct(matrix<>& /*r*/, matrix<>& /*g*/, matrix<>& /*b*/, double /*gamma*/);

   /* Normalize an Lab image so that values for each channel lie in [0,1]. */
   static void lab_normalize(matrix<>& /*l*/, matrix<>& /*a*/, matrix<>& /*b*/);

   /* Convert from RGB color space to XYZ color space. */
   static void rgb_to_xyz(matrix<>& /*r (input) -> x (output)*/, matrix<>& /*g -> y*/, matrix<>& /*b -> z*/);

   /* Convert from RGB color space to Lab color space. */
   static void rgb_to_lab(matrix<>& /*r (input) -> l (output)*/, matrix<>& /*g -> a*/, matrix<>& /*b -> b*/);

   ... /* several more functions omitted here */

   /*********************************************************************
    * Gaussian kernels.
    * The kernels are evaluated at integer coordinates in the range
    * [-s,s] (in the 1D case) or [-s_x,s_x] x [-s_y,s_y] (in the 2D
    * case), where s is the specified support.
    *********************************************************************/

   /*
    * One-dimensional case.
    * The length of the returned vector is 2*support + 1.
    * The support defaults to 3*sigma.
    * The kernel is normalized to have unit L1 norm.
    */
   static matrix<> gaussian(
      double = 1,         /* sigma */
      unsigned int = 0,   /* derivative (0, 1, or 2) */
      bool = false        /* take Hilbert transform? */
   );

   static matrix<> gaussian(
      double, unsigned int, bool,
      unsigned long       /* support */
   );

   /* Two-dimensional case. */
   static matrix<> gaussian_2D(
      double = 1,         /* sigma x */
      double = 1,         /* sigma y */
      double = 0,         /* orientation */
      unsigned int = 0,   /* derivative in y-direction (0, 1, or 2) */
      bool = false        /* take Hilbert transform in y-direction? */
   );

   static matrix<> gaussian_2D(
      double, double, double, unsigned int, bool,
      unsigned long,      /* x support */
      unsigned long       /* y support */
   );

   /*********************************************************************
    * Quantize image values into uniformly spaced bins in [0,1].
    * Return the assignments and (optionally) bin centroids.
    *********************************************************************/
   static matrix<unsigned long> quantize_values(
      const matrix<>&,    /* image */
      unsigned long       /* number of bins */
   );

   /*********************************************************************
    * Difference of histograms (2D).
    *********************************************************************/
   ...
};
auto_collection< matrix<>, array_list< matrix<> > > lib_image::hist_gradient_2D(
   const matrix<unsigned long>&                  labels,
   unsigned long                                 r,
   unsigned long                                 n_ori,
   const matrix<>&                               smoothing_kernel,
   const distanceable_functor<matrix<>,double>&  f_dist)
{
   /* construct weight matrix for circular disc */
   matrix<> weights = weight_matrix_disc(r);
   /* compute oriented gradient histograms */
   return lib_image::hist_gradient_2D(labels, weights, n_ori, smoothing_kernel, f_dist);
}

/*
 * Construct weight matrix for circular disc of the given radius.
 */
matrix<> weight_matrix_disc(unsigned long r) {
   /* initialize matrix */
   unsigned long size = 2*r + 1;
   matrix<> weights(size, size);
   /* set values in disc to 1 */
   long radius = static_cast<long>(r);
   long r_sq = radius * radius;
   unsigned long ind = 0;
   for (long x = -radius; x <= radius; x++) {
      long x_sq = x * x;
      for (long y = -radius; y <= radius; y++) {
         /* check if index is within disc */
         long y_sq = y * y;
         if ((x_sq + y_sq) <= r_sq)
            weights[ind] = 1;
         /* increment linear index */
         ind++;
      }
   }
   return weights;
}
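As a usage note on the code above: calling weight_matrix_disc(5), for example, returns an 11 x 11 matrix (size = 2*5 + 1) whose entries are 1 inside the radius-5 disc and 0 outside; hist_gradient_2D then uses this mask to decide which neighbors of a pixel contribute to its two half-disc histograms, which is exactly the circle construction described in Step 1.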
Reference documents

1. The author's published paper and resource downloads:
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html
2. Contour detection and hierarchical image segmentation: compiling and running the source code:
http://blog.csdn.net/blitzskies/article/details/19686179
(This covers compiling the code and running the bench test, i.e. the P-R curve.)
3. Contour detection and hierarchical image segmentation: understanding and studying the Berkeley paper:
http://blog.csdn.net/alex_luodazhi/article/details/47337327
