Color Feature Extraction

Source: Internet
Author: User

Color features are the most widely used visual features in image search, mainly because color is often closely related to the objects or scenes in an image. In addition, compared with other visual features, color features depend less on the size, orientation, and viewing angle of the image, which makes them highly robust.

Representing color features for image search involves several issues. First, we need to select a suitable color space in which to describe the color features. Second, we need a quantization method to express the color features as vectors. Finally, we need to define a similarity (distance) measure for the color similarity between images. This section mainly discusses the first two issues and introduces color feature representations such as color histograms, color moments, color sets, color aggregation vectors, and color correlograms.

1 Color histogram

The color histogram is a color feature widely used in many image retrieval systems. It describes the proportions of different colors in the whole image and ignores the spatial location of each color, so it cannot describe the objects in the image. Color histograms are especially suitable for describing images that are difficult to segment automatically.

Color histograms can be built on different color spaces and coordinate systems. The most common color space is RGB, because most digital images are stored in it. However, the structure of the RGB space does not match people's subjective judgment of color similarity. For this reason, color histograms based on the HSV, Luv, and Lab spaces have been proposed, since these are closer to people's subjective perception of color. Of these, HSV is the color space most commonly used for histograms; its three components represent hue, saturation, and value, respectively.

To compute a color histogram, the color space is divided into several small color ranges, each of which becomes one bin of the histogram. This process is called color quantization. The histogram is then obtained by counting the number of pixels whose color falls into each range. There are many color quantization methods, such as vector quantization, clustering, and neural networks. The most common approach is to divide each component (dimension) of the color space evenly. In contrast, clustering algorithms take into account how the image's colors are distributed over the whole space, which avoids bins that contain very few pixels and makes the quantization more effective. In addition, if the image is in RGB format and the histogram is in HSV space, we can build a look-up table from the quantized RGB space to the quantized HSV space in advance, which speeds up the histogram computation.
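The even division of each color component described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name hsv_histogram, the 8 x 3 x 3 bin layout, and the input format (a list of 8-bit RGB tuples) are all assumptions made for the example.

```python
import colorsys

def hsv_histogram(pixels, h_bins=8, s_bins=3, v_bins=3):
    """Normalized HSV histogram of an iterable of 8-bit (r, g, b) tuples.

    The bin counts per component are illustrative assumptions.
    """
    hist = [0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Uniform quantization; min() keeps a component equal to 1.0
        # inside the last bin instead of overflowing.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
        count += 1
    # Divide by the pixel count so the bins hold proportions.
    return [c / count for c in hist] if count else hist

# Three pure-red pixels and one pure-blue pixel.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
hist = hsv_histogram(pixels)
```

Normalizing by the pixel count makes histograms of images with different sizes comparable, which matches the idea of describing the proportions of colors in the image.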

The color quantization method described above can cause a problem. Imagine two images whose color histograms are almost identical, except that the bins are offset from each other. If we use the L1 distance or the Euclidean distance (see section 3.1.1) to compute the similarity between the two, the resulting similarity is very low. To overcome this defect, we need to take into account the similarity between colors that are close but not identical. One method is to use the quadratic distance [4] (see section 3.1.3). Another is to smooth the color histogram in advance, so that the pixels in each bin also contribute to the adjacent bins. In this way, the similarity between nearby colors also contributes to the similarity of the histograms.
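The effect of smoothing on offset bins can be illustrated with a small sketch. It assumes a one-dimensional histogram and a hypothetical 3-tap kernel (0.25, 0.5, 0.25); a real color histogram has neighbors along every color component, so this only shows the principle.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def smooth(hist, kernel=(0.25, 0.5, 0.25)):
    """Spread each bin's mass to its neighbors (1-D, clamped at the edges).

    The kernel weights are an illustrative assumption.
    """
    n = len(hist)
    half = len(kernel) // 2
    out = [0.0] * n
    for i, value in enumerate(hist):
        for k, w in enumerate(kernel):
            # Mass that would fall outside the histogram stays in the edge bin.
            j = min(max(i + k - half, 0), n - 1)
            out[j] += w * value
    return out

# Two histograms that differ only by a one-bin offset.
a = [0.0, 1.0, 0.0, 0.0]
b = [0.0, 0.0, 1.0, 0.0]
raw = l1_distance(a, b)
smoothed = l1_distance(smooth(a), smooth(b))
```

Here the L1 distance between the raw histograms is at its maximum, while after smoothing the two histograms overlap and the distance is halved.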

The appropriate number of color cells (i.e. histogram bins) and the color quantization method depend on the performance and efficiency requirements of the specific application. Generally, the more color cells there are, the finer the histogram's color resolution. However, a histogram with a large number of bins not only increases the computational burden but also makes it harder to build indexes for large image libraries. Moreover, for some applications a very fine division of the color space does not necessarily improve retrieval, especially for applications that cannot tolerate missing relevant images. One effective way to reduce the number of bins is to use only the bins with the largest values (i.e. the largest numbers of pixels) to construct the image feature, because these bins represent the main colors and account for most pixels in the image. Experiments show that this does not degrade retrieval with color histograms. In fact, because the bins with small values are ignored, the histogram becomes less sensitive to noise, which sometimes improves retrieval. Two methods for constructing histograms from the main colors can be found in [5, 6].
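Keeping only the largest bins can be sketched as below. This is not one of the methods from [5, 6]; the helper names dominant_bins and sparse_l1 and the sparse dictionary representation are assumptions chosen for the illustration.

```python
def dominant_bins(hist, k):
    """Return {bin_index: value} for the k largest bins of a histogram."""
    top = sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)[:k]
    return {i: hist[i] for i in top}

def sparse_l1(f1, f2):
    """L1 distance between two sparse features; missing bins count as 0."""
    keys = set(f1) | set(f2)
    return sum(abs(f1.get(i, 0.0) - f2.get(i, 0.0)) for i in keys)

hist = [0.02, 0.50, 0.03, 0.40, 0.05]
feature = dominant_bins(hist, 2)   # keeps bins 1 and 3
```

The small bins 0, 2, and 4, which may be caused by noise, no longer take part in the comparison.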

2 Color moment

Another simple and effective color feature is the color moments proposed by Stricker and Orengo [7]. The mathematical basis of this method is that any color distribution in an image can be characterized by its moments. Moreover, because the color distribution information is concentrated mainly in the low-order moments, the first-order moment (mean), second-order moment (variance), and third-order moment (skewness) of each color component are enough to express the color distribution of the image. Compared with color histograms, this method has the further advantage that it does not require quantizing the features. The color moments of an image therefore need only nine components (three color components, each with three low-order moments), which is very compact compared with other color features. In practice, to compensate for the weak discriminating power of low-order moments, color moments are often combined with other features, and they are generally used as a filter to narrow down the candidates before other features are applied.
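The nine-component feature can be sketched as follows. As an assumption of this example, the second and third moments are reported as the standard deviation and the cube root of the third central moment, so that all three values share the unit of the pixel values; the function name color_moments and the input format are likewise hypothetical.

```python
import math

def color_moments(pixels):
    """Nine-component feature: (mean, std, skew) for each of three channels.

    `pixels` is a non-empty list of 3-tuples, e.g. (h, s, v) or (r, g, b).
    """
    n = len(pixels)
    feature = []
    for channel in zip(*pixels):
        mean = sum(channel) / n
        m2 = sum((x - mean) ** 2 for x in channel) / n   # variance
        m3 = sum((x - mean) ** 3 for x in channel) / n   # third central moment
        std = math.sqrt(m2)
        # Signed cube root, because m3 can be negative.
        skew = math.copysign(abs(m3) ** (1.0 / 3.0), m3)
        feature.extend([mean, std, skew])
    return feature

pixels = [(0, 10, 5), (0, 20, 5), (3, 30, 5)]
moments = color_moments(pixels)
```

No bins are involved at any point, which is why this feature needs no quantization step.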

3 Color set

To support fast search in large-scale image libraries, Smith and Chang proposed color sets as an approximation of the color histogram [8]. The RGB color space is first converted into a perceptually uniform color space (such as HSV), which is then quantized into several bins. Next, automatic color segmentation divides the image into several regions, each of which is indexed by one color of the quantized color space, so that the image is expressed as a binary color index set. During image matching, the distance between the color sets of different images and the spatial relationships between color regions (separation, containment, and intersection, each corresponding to a different score) are compared. Because a color set is expressed as a binary feature vector, a binary search tree can be built to speed up retrieval, which is a great advantage for large image collections.
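The binary representation can be illustrated with a heavily simplified sketch. It skips the segmentation and spatial-relationship steps entirely and simply marks a bin as present when its share of pixels reaches a threshold; the threshold value, the bitmask encoding, and the use of the Hamming distance are assumptions of this example, not details taken from [8].

```python
def color_set(hist, threshold=0.1):
    """Bitmask with bit i set when bin i holds at least `threshold` of pixels."""
    mask = 0
    for i, value in enumerate(hist):
        if value >= threshold:
            mask |= 1 << i
    return mask

def hamming(a, b):
    """Number of bins in which two color sets differ."""
    return bin(a ^ b).count("1")

s1 = color_set([0.6, 0.05, 0.35, 0.0])   # bins 0 and 2 present
s2 = color_set([0.5, 0.30, 0.20, 0.0])   # bins 0, 1 and 2 present
difference = hamming(s1, s2)
```

Because each image reduces to a single integer, comparisons become cheap bit operations, which hints at why binary features suit large collections.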

4 Color aggregation vector

Because color histograms and color moments cannot express the spatial location of colors in the image, Pass et al. [9] proposed the color aggregation vector (usually called the color coherence vector in the literature). It is an evolution of the color histogram. Its core idea is to divide the pixels in each bin of the histogram into two parts: if a pixel belongs to a contiguous region of same-bin pixels whose area is greater than a given threshold, it is counted as an aggregated pixel; otherwise it is counted as a non-aggregated pixel. Let α_i and β_i denote the numbers of aggregated and non-aggregated pixels in the i-th bin. The color aggregation vector of the image can then be expressed as <(α_1, β_1), (α_2, β_2), ..., (α_N, β_N)>, and <α_1 + β_1, α_2 + β_2, ..., α_N + β_N> is the color histogram of the image. Because they contain spatial information about the color distribution, color aggregation vectors give better search results than color histograms.
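The split into aggregated and non-aggregated pixels can be sketched with a flood fill over connected regions. The sketch assumes the image has already been quantized into a 2-D grid of bin indices and uses 8-connectivity; the function name and the tiny test grid are hypothetical.

```python
from collections import deque

def color_aggregation_vector(grid, n_bins, tau):
    """Return [(alpha_i, beta_i)] for a 2-D grid of quantized bin indices.

    A pixel is aggregated when its 8-connected same-bin region has more
    than `tau` pixels (connectivity and threshold rule are assumptions).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    vec = [[0, 0] for _ in range(n_bins)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            color = grid[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            # Breadth-first flood fill of one connected region.
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and grid[ny][nx] == color):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # The whole region goes to alpha or to beta.
            vec[color][0 if size > tau else 1] += size
    return [tuple(pair) for pair in vec]

grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
]
ccv = color_aggregation_vector(grid, n_bins=2, tau=3)
```

Adding the two numbers of each pair gives back the plain histogram counts, consistent with the relation stated above.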

5 Color correlogram

The color correlogram is another way to express the color distribution of an image [16]. This feature not only captures the proportion of pixels of a given color in the whole image, but also reflects the spatial correlation between pairs of colors. Experiments show that color correlograms give better retrieval results than color histograms and color aggregation vectors, especially when querying for images with consistent spatial relationships.

If we consider the correlation between every pair of colors, the color correlogram becomes very complex and large (its space complexity is O(N^2 D), where N is the number of quantized colors and D is the number of distances considered). A simplified variant is the color auto-correlogram, which only examines the spatial relationship between pixels of the same color, reducing the space complexity to O(ND).
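A brute-force sketch of the auto-correlogram is given below. It assumes a pre-quantized 2-D grid of bin indices, measures distance with the L-infinity norm, and ignores neighbors that fall outside the image; these choices and the function name are assumptions of the example, and the nested loops are meant for clarity rather than speed.

```python
def auto_correlogram(grid, n_bins, distances):
    """table[c][j]: share of in-image neighbors at L-infinity distance
    distances[j] from a pixel of color c that also have color c."""
    rows, cols = len(grid), len(grid[0])
    same = [[0] * len(distances) for _ in range(n_bins)]
    total = [[0] * len(distances) for _ in range(n_bins)]
    for y in range(rows):
        for x in range(cols):
            c = grid[y][x]
            for j, d in enumerate(distances):
                for dy in range(-d, d + 1):
                    for dx in range(-d, d + 1):
                        if max(abs(dy), abs(dx)) != d:
                            continue  # keep only the ring at distance d
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols:
                            total[c][j] += 1
                            same[c][j] += grid[ny][nx] == c
    return [[s / t if t else 0.0 for s, t in zip(srow, trow)]
            for srow, trow in zip(same, total)]

grid = [
    [0, 0, 1],
    [0, 0, 1],
]
table = auto_correlogram(grid, n_bins=2, distances=[1, 2])
```

The result has one row per color and one column per distance, so its size is N x D, in line with the O(ND) space complexity.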

 
