Similar Image Search Principles, Part Three (Color Histogram, C++ Implementation)

Source: Internet
Author: User

An image's color histogram can be used for image retrieval. It works well for images that share the same colors, and it is tolerant to translation, scaling, and rotation, although these three invariances are less stable than those of SIFT or SURF. Its biggest limitation is that images with the same shape and content but different colors will not be found. Even so, it achieves good results in some cases.

Two ways to compute a color histogram:

For a color image, the histogram can be computed in two ways; the results should be similar.

The first is to quantize each channel of the pixel. The maximum value of each channel is 255, so the channel can be divided into a number of equal bins, for example 8 or 16. With 16 bins, the bin index of each channel ranges over 0~15. (16 is used as the example here. The more bins there are, the finer the distinction between pixel values and the more accurate the histogram, but the higher the histogram dimension and the greater the time cost.) The three channels together give a histogram of 16*16*16 = 4096 dimensions (from [0,0,0] to [15,15,15]). In the code, the subscript is computed as i + (j<<4) + (k<<8), which equals i + j*16 + k*16*16, where i, j, and k are the bin indices of the three channels. For example, a pixel [4,1,20] falls into bins [0,0,1] (each value divided by 16, rounded down), so the update is hist[0 + 0*16 + 1*16*16]++.
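The index arithmetic for the quantized histogram can be sketched as follows. This is a minimal illustration, assuming 16 bins per channel and quantization by integer division by 16; the helper names toBin and flatIndex are hypothetical and do not appear in the original source.

```cpp
// Hypothetical helper: map an 8-bit channel value (0..255) to one of 16 bins.
// Shifting right by 4 is the same as integer division by 16.
inline int toBin(int value) { return value >> 4; }

// Hypothetical helper: flatten three bin indices (each 0..15) into a single
// index in [0, 4095], i.e. i + j*16 + k*16*16.
inline int flatIndex(int i, int j, int k) { return i + (j << 4) + (k << 8); }
```

For the pixel [4,1,20], flatIndex(toBin(4), toBin(1), toBin(20)) evaluates to 256, and the largest possible index, flatIndex(15, 15, 15), is 4095.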

The second method counts the pixel values of each channel separately. For a pixel value of [4,1,20], for example, we do bhist[4]++; ghist[1]++; rhist[20]++;. This yields three one-dimensional vectors of 256 dimensions each, which can then be stacked together.
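The per-channel counting can be sketched without OpenCV as shown here. This is only an illustration under assumptions: the Pixel struct, the ChannelHist alias, and the countChannels function are hypothetical names, and the channel order B, G, R is assumed to match the bhist/ghist/rhist example above.

```cpp
#include <array>
#include <vector>

// Hypothetical pixel type: three 8-bit channels in B, G, R order.
struct Pixel { unsigned char b, g, r; };

// One 256-bin histogram per channel: hist[0] = B, hist[1] = G, hist[2] = R.
using ChannelHist = std::array<std::array<int, 256>, 3>;

// Count how often each value occurs in each channel.
ChannelHist countChannels(const std::vector<Pixel>& pixels) {
    ChannelHist hist{};  // value-initialized, so every count starts at zero
    for (const Pixel& p : pixels) {
        ++hist[0][p.b];
        ++hist[1][p.g];
        ++hist[2][p.r];
    }
    return hist;
}
```

Returning the histograms by value avoids the need for the caller to zero the arrays first, which the pointer-based version would otherwise require.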

Distance measures

Distances are usually measured with the Euclidean distance, the Pearson correlation coefficient, or the cosine distance. Baidu Encyclopedia, however, says that for histogram similarity the Bhattacharyya distance works best. I ran a simple test and found that the Euclidean distance really does perform poorly. A likely reason is that two vectors with the same proportions but different magnitudes, such as [5,5] and a scaled version of it, should count as similar, yet their Euclidean distance is large. The cosine distance also tested reasonably well here and can be used too.
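The scale sensitivity of the Euclidean distance can be checked with a small sketch. The functions euclidean and cosine below are hypothetical helpers written for this illustration, and the second vector [1,1] is an assumed example value chosen to have the same proportions as [5,5].

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean distance between two equally sized vectors.
double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Cosine similarity: close to 1 when the vectors point in the same direction,
// regardless of their magnitudes. Assumes neither vector is all zeros.
double cosine(const std::vector<double>& a, const std::vector<double>& b) {
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}
```

For [5,5] and [1,1] the Euclidean distance is sqrt(32), roughly 5.66, while the cosine similarity is 1 up to rounding, which matches the observation that the cosine measure treats the two as similar and the Euclidean distance does not.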

Bhattacharyya distance: also referred to here as the Bhattacharyya coefficient (strictly speaking, the value computed in this article is the coefficient). It measures the similarity of two discrete probability distributions and is often used to measure the separability of classes in classification. The formula is as follows:

$$\rho(p, p') = \sum_{i=1}^{N} \sqrt{p(i)\,p'(i)}$$

Here p and p' are the histogram data of the source image and the candidate image, respectively. For each bin i, the two values are multiplied and the square root of the product is taken; the sum of these terms over all bins is the image similarity value (the Bhattacharyya coefficient), which lies between 0 and 1. Why the range is 0 to 1 is a question of mathematics and is not pursued here. When p(i) == p'(i) for all i, the result is 1. p(i) and p'(i) each lie in 0~1: p(i) is the number of times a pixel value occurs divided by the total number of pixels, which is a probability, as can be seen in the code.
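For readers who do want the reason for the 0 to 1 range, one standard argument uses the Cauchy-Schwarz inequality. The sketch below assumes that p and p' are both normalized, so that their entries are non-negative and each sums to 1; the symbol \rho for the coefficient is a notational choice made here.

```latex
\rho(p, p') = \sum_{i} \sqrt{p(i)\,p'(i)}
  \le \sqrt{\sum_{i} p(i)} \, \sqrt{\sum_{i} p'(i)}
  = \sqrt{1} \cdot \sqrt{1} = 1,
\qquad
\rho(p, p) = \sum_{i} \sqrt{p(i)^{2}} = \sum_{i} p(i) = 1 .
```

Every term under the sum is non-negative, so the coefficient is also at least 0, and it reaches 0 only when the two histograms share no occupied bin.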

Code:

Calculation Method One:

(1) Get the color histogram:

    #include <opencv2/opencv.hpp>
    using namespace cv;

    // Three-dimensional histogram, method one.
    // histvalue must point to an array of 16*16*16 = 4096 ints.
    void getHistogram(Mat &image, int *histvalue) {
        Mat hist;   // MatND in older OpenCV; the C API uses CvHistogram *hist = cvCreateHist
        int dims = 3;
        // The upper bound of a range is exclusive in calcHist,
        // so 256 is needed for pixels with value 255 to be counted.
        float r_hranges[] = {0, 256};
        float g_hranges[] = {0, 256};
        float b_hranges[] = {0, 256};
        const float *ranges[] = {r_hranges, g_hranges, b_hranges};   // must be const
        int size[3] = {16, 16, 16};
        // Channel indices; for an image loaded with imread (BGR order),
        // 0 is the B channel, 1 is G, and 2 is R.
        int channels[] = {0, 1, 2};
        // Compute the image histogram (cvCalcHist in the C API).
        calcHist(&image, 1, channels, Mat(), hist, dims, size, ranges);
        for (int i = 0; i < 16; i++) {
            for (int j = 0; j < 16; j++) {
                for (int k = 0; k < 16; k++) {
                    // Note that histogram values are stored as float
                    // (the C API uses the cvQueryHistValue_* macros).
                    float value = hist.at<float>(i, j, k);
                    int realvalue = saturate_cast<int>(value);
                    int index = i + (j << 4) + (k << 8);
                    histvalue[index] = realvalue;
                }
            }
        }
    }


(2) Code for the three distance measures

    #include <cmath>

    // Assumed here to be the histogram length, 16*16*16; it is defined
    // elsewhere in the full source.
    const int MaxHistValue = 4096;

    // Euclidean distance
    float getDistance(int *sur, int *dst) {
        float sum = 0;
        for (int i = 0; i < MaxHistValue; i++) {
            sum += pow(sur[i] - dst[i] + 0.0, 2);
        }
        return sqrt(sum);
    }

    // Cosine distance
    float getCosDistance(int *sur, int *dst) {
        float sursum = 0, dstsum = 0, sum = 0;
        for (int i = 0; i < MaxHistValue; i++) {
            sursum += pow(sur[i] + 0.0, 2);
            dstsum += pow(dst[i] + 0.0, 2);
            // Convert before multiplying so large counts do not overflow int.
            sum += (float)sur[i] * dst[i];
        }
        sursum = sqrt(sursum);
        dstsum = sqrt(dstsum);
        return sum / (sursum * dstsum);
    }

    // Bhattacharyya coefficient; each count must be divided by the total
    // number of pixels of its image (stotal, dtotal).
    // Note: for comparing color histograms, the Bhattacharyya distance works best.
    float getPSDistance(int *sur, int *dst, const float stotal, const float dtotal) {
        float sum = 0;
        for (int i = 0; i < MaxHistValue; i++) {
            sum += sqrt((sur[i] / stotal) * (dst[i] / dtotal));
        }
        return sum;
    }

Test Picture:


Cosine Result:


Bhattacharyya distance results:


In the labels "i-j", i stands for person i and j stands for the j-th image of person i.

The results show that for person6 the images are very similar, yet the cosine result is poor while the Bhattacharyya distance performs very well. In addition, the Bhattacharyya value of an image compared with itself is not exactly 1 because of the loss of precision during the calculation.

Calculation Method Two:

(1) Get the color histogram

    // Histogram, method two: one 256-bin histogram per channel.
    // histvalue[0..2] must each point to 256 ints that the caller has set to zero.
    void getHistogram2(Mat &image, int **histvalue) {
        for (int i = 0; i < image.rows; i++) {
            for (int j = 0; j < image.cols; j++) {
                histvalue[0][image.at<Vec3b>(i, j)[0]]++;
                histvalue[1][image.at<Vec3b>(i, j)[1]]++;
                histvalue[2][image.at<Vec3b>(i, j)[2]]++;
            }
        }
    }

(2) Code for the three distance measures

    #include <cmath>

    // Assumed here to be the number of bins per channel; it is defined
    // elsewhere in the full source.
    const int N = 256;

    // Euclidean distance
    float getDistance(int **sur, int **dst) {
        float sum = 0;
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < N; j++) {
                sum += pow(sur[i][j] - dst[i][j] + 0.0, 2);
            }
        }
        return sqrt(sum);
    }

    // Cosine distance
    float getCosDistance(int **sur, int **dst) {
        float sursum = 0, dstsum = 0, sum = 0;
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < N; j++) {
                sursum += pow(sur[i][j] + 0.0, 2);
                dstsum += pow(dst[i][j] + 0.0, 2);
                // Convert before multiplying so large counts do not overflow int.
                sum += (float)sur[i][j] * dst[i][j];
            }
        }
        sursum = sqrt(sursum);
        dstsum = sqrt(dstsum);
        return sum / (sursum * dstsum);
    }

    // Bhattacharyya coefficient; each count must be divided by the total
    // number of pixels of its image (stotal, dtotal).
    // Note: for comparing color histograms, the Bhattacharyya distance works best.
    float getPSDistance(int **sur, int **dst, const float stotal, const float dtotal) {
        float sum = 0;
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < N; j++) {
                sum += sqrt((sur[i][j] / stotal) * (dst[i][j] / dtotal));
            }
        }
        return sum / 3;    // averaged because there are three channels
    }

Cosine Result:


Bhattacharyya distance results:


In the labels "i-j", i stands for person i and j stands for the j-th image of person i.

Full source (average hash, perceptual hash, color histogram):

http://download.csdn.net/detail/lu597203933/8710535

References:

1: http://blog.csdn.net/jia20003/article/details/7771651

2: http://blog.csdn.net/luoweifu/article/details/8690835

3: http://baike.baidu.com/view/10343198.htm (Bhattacharyya distance)
