Color and edge directionality descriptor (Color and Edge Directivity Descriptor, CEDD)
This article is an excerpt from research on Android mobile phone image classification technology.
CEDD has the advantages of fast feature extraction and a compact feature descriptor. The following is a detailed exposition and analysis of the principle of CEDD.
3.1. Color information
The CEDD feature combines color and texture information. This section gives the process of color information extraction, focusing on the RGB-to-HSV model conversion and the principles of the 10-bins and 24-bins fuzzy filters.
3.1.1. RGB model converted to the HSV model
The RGB model is probably the most familiar and most widely used color model. Its three components represent the amounts of red, green, and blue that make up a color: (0,0,0) is black, (255,255,255) is white, (255,0,0) is red, (0,255,0) is green, and (0,0,255) is blue; other colors can be expressed by adjusting these three components. The RGB color model is designed around the principle of additive color mixing and is hardware-oriented: in general, a computer uses this model to define the colors displayed on screen, i.e. the familiar three-primary-color combination. Therefore, when pixels are extracted from an image, the first information obtained is usually the RGB value of each pixel.
In the HSV model, H (hue) represents the hue, i.e. the color transmitted through or reflected from an object, commonly identified by a color name. S (saturation) represents the proportion of gray mixed into the hue, i.e. the purity or intensity of the color. V (value) represents brightness, i.e. how light or dark the color is. The HSV model better reflects how people perceive and discriminate colors, so it is well suited to comparing color-based image similarity and has been widely used in image classification.
Combining the above two points, the image pixels must be converted from RGB to HSV before the color information is extracted. In this feature extraction algorithm the RGB-to-HSV conversion differs slightly from the standard form: the final ranges of S and V are both (0, 255). The basic principle is unchanged; the scaling simply facilitates the subsequent operations in the fuzzy filters. The conversion formula is as follows:
All HSV values here are finally taken as integers.
With the above calculation, the HSV value of each pixel can be obtained; the following subsections use these HSV values to build the color information histogram through fuzzy filtering.
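To make the conversion concrete, the following is a minimal Java sketch of the standard RGB-to-HSV conversion with S and V scaled to the (0, 255) range described above. The original formula is not reproduced in this excerpt, so the exact rounding and scaling details here are assumptions; the class and method names are illustrative.

```java
public class RgbToHsv {
    // Converts an RGB pixel (each component 0-255) to HSV with
    // H in 0..360 and S, V scaled to 0..255, rounded to integers.
    public static int[] rgbToHsv(int r, int g, int b) {
        double rd = r / 255.0, gd = g / 255.0, bd = b / 255.0;
        double max = Math.max(rd, Math.max(gd, bd));
        double min = Math.min(rd, Math.min(gd, bd));
        double delta = max - min;

        double h;
        if (delta == 0) {
            h = 0;                                   // achromatic: hue undefined, use 0
        } else if (max == rd) {
            h = 60 * (((gd - bd) / delta) % 6);
        } else if (max == gd) {
            h = 60 * (((bd - rd) / delta) + 2);
        } else {
            h = 60 * (((rd - gd) / delta) + 4);
        }
        if (h < 0) h += 360;

        double s = (max == 0) ? 0 : delta / max;     // saturation in 0..1
        double v = max;                              // value in 0..1

        return new int[] {
            (int) Math.round(h),                     // H: 0..360
            (int) Math.round(s * 255),               // S scaled to 0..255
            (int) Math.round(v * 255)                // V scaled to 0..255
        };
    }

    public static void main(String[] args) {
        int[] hsv = rgbToHsv(255, 128, 0);           // an orange pixel
        System.out.println("H=" + hsv[0] + " S=" + hsv[1] + " V=" + hsv[2]);
    }
}
```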
3.1.2. 10-bins fuzzy filter
The 10-bins fuzzy filter takes the H, S, and V values as three input channels and outputs a 10-bin fuzzy histogram. The meanings of the 10 histogram bins are: (0) black, (1) gray, (2) white, (3) red, (4) orange, (5) yellow, (6) green, (7) cyan, (8) blue, (9) magenta. Its principle is described below.
The 10-bins fuzzy filter is based on fuzzy theory, so we first analyze how the radial color edges used by the fuzzy rules are formed. Since H is the hue, its calculation shows that it ranges from 0 to 360; when one color in an image transitions to another, the H value changes rapidly, producing what is called a radial color edge. The positions of these radial edges can be found with the help of fuzzy theory. Figure (a) is an image of the extracted H-channel values, and figure (b) is obtained by filtering figure (a) with a CLF filter. CLF stands for Coordinate Logic Filter; its method is to apply a logical AND operation to the binary values of the nine pixels in each 3*3 window, so that small H values appear at the color edges of the H channel, which is the effect seen in figure (b). Taking the difference between the original H-value image and the filtered H image then yields the more obvious radial color edges shown in figure (c). Figure (d) shows the theoretical positions of the radial edges in the H channel.
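A minimal sketch of the coordinate logic filtering step described above, assuming the filter takes the bitwise AND of the nine pixel values in each 3*3 window and the edge map is the difference from the original image; the border handling and class name here are illustrative assumptions, not the original implementation.

```java
public class ClfAndFilter {
    // Applies a 3x3 coordinate logic AND filter to an H-channel image:
    // each output pixel is the bitwise AND of the nine pixel values in
    // its 3x3 neighborhood (border pixels are simply copied here).
    public static int[][] clfAnd(int[][] h) {
        int rows = h.length, cols = h[0].length;
        int[][] out = new int[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (i == 0 || j == 0 || i == rows - 1 || j == cols - 1) {
                    out[i][j] = h[i][j];               // simple border handling
                    continue;
                }
                int acc = ~0;                          // all bits set
                for (int di = -1; di <= 1; di++)
                    for (int dj = -1; dj <= 1; dj++)
                        acc &= h[i + di][j + dj];      // AND of the binary values
                out[i][j] = acc;
            }
        }
        return out;
    }

    // Edge map: difference between the original H image and the filtered one.
    public static int[][] edgeMap(int[][] h) {
        int[][] f = clfAnd(h);
        int[][] edge = new int[h.length][h[0].length];
        for (int i = 0; i < h.length; i++)
            for (int j = 0; j < h[0].length; j++)
                edge[i][j] = h[i][j] - f[i][j];
        return edge;
    }

    public static void main(String[] args) {
        int[][] h = {
            { 20, 20, 20, 200, 200 },
            { 20, 20, 20, 200, 200 },
            { 20, 20, 20, 200, 200 },
            { 20, 20, 20, 200, 200 }
        };
        int[][] e = edgeMap(h);                        // large values appear at the hue transition
        for (int[] row : e) System.out.println(java.util.Arrays.toString(row));
    }
}
```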
Through repeated experiments based on the above principle, the ranges of the radial H edges can be obtained, and the H channel is divided into eight fuzzy regions, named: (0) red to orange, (1) orange, (2) yellow, (3) green, (4) cyan, (5) blue, (6) magenta, (7) blue to red. Each pair of adjacent regions has an overlapping cross section.
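The sketch below gives a crisp (non-fuzzy) simplification of the 10-bins unit: the real CEDD filter uses overlapping fuzzy membership functions and fuzzy rules over H, S, and V, and the hue boundaries and S/V thresholds used here are only illustrative placeholders, not the experimentally determined fuzzy regions.

```java
public class FuzzyFilter10 {
    // Crisp simplification of the 10-bins color unit.
    // Bins: 0 black, 1 gray, 2 white, 3 red, 4 orange, 5 yellow,
    //       6 green, 7 cyan, 8 blue, 9 magenta.
    // H is in 0..360; S and V are in 0..255 (as produced above).
    public static int classify(int h, int s, int v) {
        if (v < 40) return 0;                            // very dark        -> black
        if (s < 40) return (v > 200) ? 2 : 1;            // desaturated      -> white or gray
        if (h < 20 || h >= 340) return 3;                // red
        if (h < 45)  return 4;                           // orange
        if (h < 70)  return 5;                           // yellow
        if (h < 160) return 6;                           // green
        if (h < 200) return 7;                           // cyan
        if (h < 260) return 8;                           // blue
        return 9;                                        // magenta
    }

    public static void main(String[] args) {
        System.out.println(classify(30, 200, 220));      // expected: 4 (orange)
        System.out.println(classify(0, 10, 10));         // expected: 0 (black)
    }
}
```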
3.1.3. 24-bins fuzzy filter
The 24-bins fuzzy filter expands each chromatic color area output by the 10-bins fuzzy filter into three areas according to the S and V values. Its inputs are the 10-dimensional vector together with the S and V channel values, and its output is a 24-dimensional vector; the system model is shown in Figure 3-7. The meaning of each dimension is: (0) black, (1) gray, (2) white, (3) dark red, (4) red, (5) light red, (6) dark orange, (7) orange, (8) light orange, (9) dark yellow, (10) yellow, (11) light yellow, (12) dark green, (13) green, (14) light green, (15) dark cyan, (16) cyan, (17) light cyan, (18) dark blue, (19) blue, (20) light blue, (21) dark magenta, (22) magenta, (23) light magenta.
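A crisp simplification of the 24-bins expansion is sketched below: each chromatic bin of the 10-bins unit is split into a dark, normal, or light variant according to S and V. The real CEDD unit uses fuzzy membership functions over S and V; the thresholds here are illustrative assumptions.

```java
public class FuzzyFilter24 {
    // 24-bin layout, matching the list above: 0 black, 1 gray, 2 white,
    // then for each hue (red, orange, yellow, green, cyan, blue, magenta)
    // a (dark, normal, light) triple, giving indices 3..23.
    public static int expand(int bin10, int s, int v) {  // S, V in 0..255
        if (bin10 < 3) return bin10;                     // black, gray, white unchanged
        int shade;
        if (v < 128) shade = 0;                          // low value       -> dark variant
        else if (s > 128) shade = 1;                     // saturated       -> normal variant
        else shade = 2;                                  // bright, washed  -> light variant
        return 3 + 3 * (bin10 - 3) + shade;
    }

    public static void main(String[] args) {
        System.out.println(expand(4, 200, 230));         // orange, saturated  -> 7 (orange)
        System.out.println(expand(4, 60, 230));          // orange, washed out -> 8 (light orange)
        System.out.println(expand(4, 200, 60));          // orange, dark       -> 6 (dark orange)
    }
}
```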
3.2. Texture information
This section introduces the texture information extraction process of the CEDD feature: the gray value of each pixel is first computed through the YIQ model, and the edge-direction histogram texture information of the image is then extracted.
3.2.1. YIQ color space
The YIQ color space belongs to the NTSC (National Television System Committee) system. Y (luminance) represents how bright the color appears; intuitively, it is the grayscale value of the image. I and Q (chrominance) carry the hue information, describing the color and saturation properties of the image respectively. In the YIQ color space model, the Y component represents the luminance information of the image, while the I and Q components represent the color information: the I component ranges from orange to cyan, and the Q component ranges from purple to yellow-green [24].
By converting a color image from RGB to YIQ space, the luminance information and the chrominance information can be separated and processed independently. The correspondence for converting RGB to the YIQ space model is shown in the following equation:
When texture features are extracted, the quantity most commonly used is the gray value of the image, so only the Y value of the YIQ space needs to be computed for the subsequent texture extraction.
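The thesis equation itself is not reproduced in this excerpt, so the sketch below assumes the standard NTSC RGB-to-YIQ coefficients; only the Y (gray) value is actually needed by the texture module.

```java
public class RgbToYiq {
    // Standard NTSC RGB-to-YIQ conversion coefficients are assumed here;
    // only the Y (luminance) component is needed by the texture module.
    public static double luminanceY(int r, int g, int b) {
        return 0.299 * r + 0.587 * g + 0.114 * b;
    }

    // Full conversion, for reference.
    public static double[] rgbToYiq(int r, int g, int b) {
        double y = 0.299 * r + 0.587 * g + 0.114 * b;
        double i = 0.596 * r - 0.274 * g - 0.322 * b;
        double q = 0.211 * r - 0.523 * g + 0.312 * b;
        return new double[] { y, i, q };
    }

    public static void main(String[] args) {
        System.out.println(luminanceY(128, 200, 64));    // gray value of one pixel
    }
}
```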
3.2.2. Edge direction histogram
In this paper, texture information is extracted quickly using EHD (Edge Histogram Descriptor), the edge histogram descriptor, which relies on five digital filters, as shown in Figure 3-9.
These five digital filters extract texture edge information of five categories: vertical, horizontal, 45-degree, 135-degree, and non-directional edges. During texture extraction the image is divided into a number of image blocks, and, as in Figure 3-9, each block is further divided into four sub-blocks of equal size. G0(i,j), G1(i,j), G2(i,j), and G3(i,j) denote the average gray values of the four sub-blocks of block (i,j). The filter coefficients aV(k), aH(k), ad-45(k), ad-135(k), and and(k) (for the vertical, horizontal, 45-degree, 135-degree, and non-directional filters respectively) are the values assigned to each sub-block by the corresponding filter, where k ranges over the integers 0 to 3 and indexes the four sub-blocks within a block. nV(i,j), nH(i,j), nd-45(i,j), nd-135(i,j), and nnd(i,j) are the direction values computed for block (i,j). The calculation method is as follows:

nV(i,j) = | G0(i,j)·aV(0) + G1(i,j)·aV(1) + G2(i,j)·aV(2) + G3(i,j)·aV(3) |

and similarly for nH(i,j), nd-45(i,j), nd-135(i,j), and nnd(i,j) with the coefficients of the corresponding filter.
Find the maximum value:

max = max{ nV(i,j), nH(i,j), nd-45(i,j), nd-135(i,j), nnd(i,j) }
Then normalize all n values by dividing them by this maximum:

nx(i,j) ← nx(i,j) / max,  x ∈ {V, H, d-45, d-135, nd}
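The filter coefficients of Figure 3-9 are not reproduced in this excerpt, so the sketch below assumes the standard MPEG-7 EHD coefficients and a sub-block ordering of top-left, top-right, bottom-left, bottom-right; it computes the five direction values, their maximum, and the normalized n values for one block.

```java
public class EdgeFilters {
    // Standard MPEG-7 EHD digital filter coefficients are assumed here
    // (the original Figure 3-9 is not reproduced); sub-blocks are indexed
    // 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    static final double R2 = Math.sqrt(2);
    static final double[] A_V    = { 1, -1,  1, -1 };   // vertical
    static final double[] A_H    = { 1,  1, -1, -1 };   // horizontal
    static final double[] A_D45  = { R2, 0,  0, -R2 };  // 45-degree
    static final double[] A_D135 = { 0, R2, -R2,  0 };  // 135-degree
    static final double[] A_ND   = { 2, -2, -2,  2 };   // non-directional

    // g = average gray values G0..G3 of the four sub-blocks of one block.
    // Returns the normalized values {nV, nH, nd-45, nd-135, nnd} and the
    // un-normalized maximum (needed for the T0 threshold) as element 5.
    public static double[] directionValues(double[] g) {
        double[][] filters = { A_V, A_H, A_D45, A_D135, A_ND };
        double[] n = new double[6];
        double max = 0;
        for (int f = 0; f < 5; f++) {
            double sum = 0;
            for (int k = 0; k < 4; k++) sum += g[k] * filters[f][k];
            n[f] = Math.abs(sum);
            max = Math.max(max, n[f]);
        }
        if (max > 0) for (int f = 0; f < 5; f++) n[f] /= max;  // normalize
        n[5] = max;
        return n;
    }

    public static void main(String[] args) {
        // A block whose left half is dark and right half is bright:
        // a strong vertical edge is expected (nV equals 1 after normalization).
        double[] g = { 30, 200, 30, 200 };
        System.out.println(java.util.Arrays.toString(directionValues(g)));
    }
}
```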
Through the above formulas, the edge information of each image block can be obtained. The texture information in CEDD is a 6-dimensional histogram whose dimensions mean: (0) no edge, (1) non-directional edge, (2) horizontal edge, (3) vertical edge, (4) 45-degree edge, (5) 135-degree edge. The method for determining which histogram bins the texture information of each block belongs to, illustrated in Figure 3-10, is as follows:
First, four thresholds are set: T0 = 14 checks whether a block contains edge information, T1 = 0.68 determines whether the block contains a non-directional edge, and T2 = T3 = 0.98 determine whether the block contains edges in the other four directions. If max is less than T0, the block is considered to contain no texture information and the value of the first dimension (no edge) of the 6-dimensional histogram is increased by 1. If max is greater than or equal to T0, the block contains edge information and the values of the other directions are examined, as shown in Figure 3-10. The schematic is a diverging pentagon: each vertex represents an edge direction category, and for each block the values nnd, nH, nV, nd-45, and nd-135 fall on the lines connecting the five vertices with the center. The value at the center point is 1 and the value on the pentagon boundary is 0. If an n value is greater than the threshold of its corresponding edge direction category, the block is judged to belong to that category; a block can therefore belong to several categories at the same time. This gives the following rules: if nnd is greater than T1, the non-directional bin of the histogram is increased by 1; if nH is greater than T2, the horizontal bin is increased by 1; if nV is greater than T2, the vertical bin is increased by 1; if nd-45 is greater than T3, the 45-degree bin is increased by 1; and if nd-135 is greater than T3, the 135-degree bin is increased by 1.
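The rules above can be summarized in the following sketch, which uses the thresholds quoted in the text and the 6-bin layout described earlier; the interface (a normalized n array plus the un-normalized maximum) matches the sketch in the previous subsection and is an illustrative assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TextureModule {
    // Thresholds from the text: T0 checks for any edge, T1 for the
    // non-directional edge, T2 = T3 for the four directional edges.
    static final double T0 = 14, T1 = 0.68, T2 = 0.98, T3 = 0.98;

    // n = {nV, nH, nd-45, nd-135, nnd}, already normalized to [0, 1];
    // max is the un-normalized maximum of the five direction values.
    // Returns the texture bins (0..5) the block contributes to:
    // 0 no edge, 1 non-directional, 2 horizontal, 3 vertical,
    // 4 45-degree, 5 135-degree. A block may fall into several bins.
    public static List<Integer> textureBins(double[] n, double max) {
        List<Integer> bins = new ArrayList<>();
        if (max < T0) {               // too weak: no edge information
            bins.add(0);
            return bins;
        }
        if (n[4] > T1) bins.add(1);   // nnd    -> non-directional edge
        if (n[1] > T2) bins.add(2);   // nH     -> horizontal edge
        if (n[0] > T2) bins.add(3);   // nV     -> vertical edge
        if (n[2] > T3) bins.add(4);   // nd-45  -> 45-degree edge
        if (n[3] > T3) bins.add(5);   // nd-135 -> 135-degree edge
        return bins;
    }

    public static void main(String[] args) {
        double[] n = { 1.0, 0.0, 0.7, 0.7, 0.0 };   // strong vertical edge
        System.out.println(textureBins(n, 340));     // expected: [3]
    }
}
```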
3.3. CEDD features
The full English name of CEDD is Color and Edge Directivity Descriptor, i.e. the color and edge directionality feature descriptor. It combines the color and texture information of an image to generate a 144-dimensional histogram. The feature extraction method can be divided into two sub-modules: the color module extracts the color information and the texture module extracts the texture information; the specific algorithms of the two modules have been described in detail in sections 3.1 and 3.2. The CEDD histogram consists of six regions, corresponding to the six texture categories of section 3.2; within each texture region the 24-dimensional color information extracted by the color module is accumulated, so that color and texture are effectively combined, giving the final 6*24 = 144-dimensional histogram. Its principle is shown in Figure 3-11.
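The combination step can be sketched as follows: the 144-dimensional histogram is treated as six texture regions of 24 color bins each, so a block that falls into texture bin t and color bin c increments position t*24 + c. This indexing follows the 6*24 layout described above; the exact ordering inside the original implementation may differ, and the class name is illustrative.

```java
public class CeddHistogram {
    // 144-dimensional histogram: six texture regions of 24 color bins each.
    private final double[] hist = new double[6 * 24];

    // Adds one block's contribution for texture bin t (0..5) and color bin c (0..23).
    public void accumulate(int textureBin, int colorBin) {
        hist[textureBin * 24 + colorBin] += 1;
    }

    // Normalizes the histogram so that its bins sum to 1.
    public double[] normalized() {
        double sum = 0;
        for (double v : hist) sum += v;
        double[] out = new double[hist.length];
        if (sum > 0) for (int i = 0; i < hist.length; i++) out[i] = hist[i] / sum;
        return out;
    }

    public static void main(String[] args) {
        CeddHistogram h = new CeddHistogram();
        h.accumulate(3, 7);                               // vertical edge, orange block
        h.accumulate(0, 2);                               // no edge, white block
        System.out.println(h.normalized()[3 * 24 + 7]);   // prints 0.5
    }
}
```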
In the implementation, the image is first divided into a number of image blocks; the number of blocks is decided by considering both the specific image and the capabilities of the computer. Each image block is then processed by the texture module and the color module.
In the texture module, each block is first divided into four sub-blocks. The gray value of each pixel is computed with the YIQ formula, and the average gray value of each sub-block is obtained. After filtering with the five digital filters, the texture information categories to which the block belongs are determined according to the principle of Figure 3-10.
In the color module, each image block is converted to the HSV color space. The system passes the average H, S, and V values of the block through the 10-bins fuzzy filter to obtain a 10-dimensional vector, and then through the 24-bins fuzzy filter. The 10-bins fuzzy filter assigns the block to one of the 10 color categories mainly according to the H value, and the 24-bins fuzzy filter then re-divides the chromatic categories according to the S and V values to output the 24-dimensional histogram.
Every block of the image is processed by the color module in this way; after processing, the 24 color values are added to each texture category to which the block belongs, and the resulting histogram is normalized.
Stopping at normalization would not reflect the advantage of CEDD, because the normalized values contain fractional parts and occupy a relatively large amount of storage space. If they are quantized, the resulting integer values are convenient to store and allow the feature values to be read intuitively. Table 3-1 is the quantization table used in CEDD feature extraction; the quantization range is the integers 0-7. It can be seen that the quantization is not uniform: the quantization range differs between the texture regions of the vector, and the quantization levels within a region do not increase in equal steps; the principle behind this can be found in the literature.
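Since Table 3-1 is not reproduced in this excerpt, the following is only a structural sketch of the 3-bit quantization step: the non-uniform level boundaries below are hypothetical placeholders, not the trained per-region values of the real CEDD quantization table.

```java
public class CeddQuantizer {
    // Hypothetical, non-uniform level boundaries: 7 boundaries map a
    // normalized bin value to an integer code in 0..7. The real CEDD
    // table uses different, per-region values.
    static final double[] LEVELS = {
        0.0001, 0.005, 0.02, 0.05, 0.10, 0.20, 0.40
    };

    // Maps one normalized histogram value to an integer code in 0..7.
    public static int quantize(double value) {
        int code = 0;
        while (code < LEVELS.length && value > LEVELS[code]) code++;
        return code;
    }

    // Quantizes the whole 144-dimensional histogram.
    public static int[] quantize(double[] histogram) {
        int[] out = new int[histogram.length];
        for (int i = 0; i < histogram.length; i++) out[i] = quantize(histogram[i]);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(quantize(0.03));   // falls into code 3 with these placeholder levels
    }
}
```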