Fast Hamming distance calculation

Source: Internet
Author: User
Hamming distance is a way of measuring the distance between features and is used in many settings. Its main idea is to measure how much two features differ, which can also be read as a measure of their similarity.

I used it in an image processing project that needed the gradient direction of the image. I chose four directions, so each direction can be represented with two bits: 0, 1, 2, 3, that is, 00, 01, 10, 11. This way, the gradient directions of, for example, four adjacent pixels can be combined into one feature vector.


Only a single byte is needed to represent such a feature. Suppose I use this feature to describe two images, A and B, giving two feature sets, featureA and featureB. Assuming each image is a 100*100 8-bit grayscale image and each feature covers four horizontally adjacent pixels, I get 100*25 single-byte features per image. To measure whether the two images are similar, I use the Hamming distance (other distances would also work; Hamming distance is simply the one used here). If the two images are similar in this feature, the corresponding bits of their features should agree as much as possible; that is, the number of 1 bits in featureA ^ featureB (the XOR of the features) should be as small as possible. This raises the question of how to count the 1 bits, which is a common written-test and interview question in LeetCode, "Jian Zhi Offer" (Coding Interviews), and similar books.
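The packing and comparison described above can be sketched as follows. This is a minimal illustration rather than the original project's code: the names pack_directions and feature_distance are hypothetical, and the bit order inside the byte (first pixel in the lowest two bits) is an assumption.

```c
#include <stddef.h>

/* Pack four 2-bit gradient directions (each 0..3) into one byte.
   Assumption: dirs[0] goes into the lowest two bits. */
unsigned char pack_directions(const unsigned char dirs[4])
{
    unsigned char feature = 0;
    for (int i = 0; i < 4; ++i)
        feature |= (unsigned char)((dirs[i] & 3u) << (2 * i));
    return feature;
}

/* Hamming distance between two feature arrays of n bytes:
   XOR corresponding bytes and count the 1 bits in each result. */
unsigned int feature_distance(const unsigned char *fa,
                              const unsigned char *fb, size_t n)
{
    unsigned int total = 0;
    for (size_t i = 0; i < n; ++i)
    {
        unsigned char diff = (unsigned char)(fa[i] ^ fb[i]);
        while (diff)                 /* simple bit-by-bit count */
        {
            total += diff & 1u;
            diff = (unsigned char)(diff >> 1);
        }
    }
    return total;
}
```

For the 100*100 image in the text, n would be 100*25 = 2500 bytes per image, and a smaller return value means the two images agree on more gradient directions.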

The conventional solution shifts the value right one bit at a time and increments a counter whenever the lowest bit is 1. Note that the sign of the input (tmp here) has to be handled: right-shifting a negative int typically fills the high bits with 1s, so the loop may never terminate.

int countone (int tmp)
{
    /* Work on the unsigned bit pattern so that a negative input
       cannot keep shifting in 1s and loop forever. */
    unsigned int bits = (unsigned int)tmp;
    int count = 0;
    while (bits)
    {
        if (bits & 1u)
            ++count;
        bits = bits >> 1;
    }
    return count;
}

This is still inconvenient. Instead of shifting tmp, we can shift a one-bit mask across it:

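int CountOneVersion2 (int tmp)
{
    unsigned int flag = 1;   /* unsigned, so shifting the bit out of the top is well defined */
    int count = 0;
    while (flag)
    {
        if (flag & (unsigned int)tmp)
            ++count;
        flag = flag << 1;    /* becomes 0 after the highest bit, which ends the loop */
    }
    return count;
}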


This way there is no need to check the sign of the input.

Another solution relies on the fact that subtracting one from n flips its lowest 1 bit (and the zeros below it), so n & (n-1) clears that lowest 1 and keeps the remaining bits. Repeating this until n becomes zero, the number of iterations is the number of 1 bits.

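int CountOneVersion3 (int n)
{
    unsigned int bits = (unsigned int)n;   /* avoids signed overflow in bits - 1 */
    int count = 0;
    while (bits)
    {
        ++count;
        bits = bits & (bits - 1);   /* clears the lowest 1 bit */
    }
    return count;
}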
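To see why this loop runs once per 1 bit, it helps to look at the intermediate values. For n = 44 (101100) the value goes 101100 -> 101000 -> 100000 -> 000000, so three iterations for three 1 bits. The helper below is a hypothetical tracing variant (the name clear_lowest_trace and its parameters are not from the original) that records those values:

```c
/* Record each intermediate value of n while clearing its lowest 1 bit.
   Returns the number of 1 bits; steps[] receives the successive values
   (at most max_steps of them are stored). */
int clear_lowest_trace(unsigned int n, unsigned int steps[], int max_steps)
{
    int count = 0;
    while (n)
    {
        n &= n - 1;               /* clears the lowest 1 bit */
        if (count < max_steps)
            steps[count] = n;
        ++count;
    }
    return count;
}
```

Unlike the shifting versions, the number of iterations depends on how many bits are set, not on the width of the type.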


The methods above examine the bits one at a time in a loop. We can also count them without a loop, which is the fast Hamming distance calculation. First take three constants; here we consider unsigned char, that is, the single-byte case. They are aa = 85 (01010101), bb = 51 (00110011), and cc = 15 (00001111). The main idea is to sum the 1 bits in parallel: (1) one line sums each pair of adjacent bits, where each result needs at most two bits; (2) one line sums each adjacent pair of those two-bit results, where each result fits in four bits; (3) add the high four bits and the low four bits to get the number of 1 bits.


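unsigned char a, b, ch, d;
const unsigned char aa = 85;   /* 01010101 */
const unsigned char bb = 51;   /* 00110011 */
const unsigned char cc = 15;   /* 00001111 */

/* tmp is the input byte; ss accumulates the count of 1 bits */
a = tmp; b = a & aa; ch = (a >> 1) & aa;      /* (1) */
d = b + ch; b = d & bb; ch = (d >> 2) & bb;   /* (2) */
d = b + ch; b = d & cc; ch = (d >> 4) & cc;   /* (3) */
ss += b + ch;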
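The three steps can be wrapped into a self-contained function, shown here as a sketch. The names popcount8 and hamming8 are hypothetical; the masks are the aa, bb, cc constants from the text. The comments trace the byte 0xB6 (10110110), which has five 1 bits.

```c
/* Count the 1 bits of one byte with three mask-and-add steps, no loop. */
unsigned int popcount8(unsigned char x)
{
    const unsigned int aa = 0x55;    /* 01010101 */
    const unsigned int bb = 0x33;    /* 00110011 */
    const unsigned int cc = 0x0F;    /* 00001111 */
    unsigned int d = x;
    d = (d & aa) + ((d >> 1) & aa);  /* 0xB6 -> 0x65: pair sums 1,2,1,1 */
    d = (d & bb) + ((d >> 2) & bb);  /* 0x65 -> 0x32: nibble sums 3,2   */
    d = (d & cc) + ((d >> 4) & cc);  /* 0x32 -> 5                       */
    return d;
}

/* Hamming distance of two single-byte features: popcount of their XOR. */
unsigned int hamming8(unsigned char fa, unsigned char fb)
{
    return popcount8((unsigned char)(fa ^ fb));
}
```

No carry can cross a field boundary here: each pair sum is at most 2 (fits in two bits) and each nibble sum is at most 4 (fits in four bits), which is why plain addition on the whole byte is safe.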


This is meant to be implemented in hardware, where the computation has to run every clock cycle, so performance optimization matters. Bit operations like these can improve speed significantly, and because this unit is repeated many times, the overall gain is large.

I thought of this today and wrote it up as a summary; if I don't write things down myself, I never remember them.

Please correct me if there is something wrong with it.
