Perceptual hash algorithm: finding similar images

Source: Internet
Author: User
Article Directory
    • Step 1: Reduce the image size
    • Step 2: Convert to a grayscale image
    • Step 3: Calculate the average gray value
    • Step 4: Compare each pixel's gray value
    • Step 5: Calculate the hash value
    • Step 6: Compare image fingerprints
Google Image Search

In Google Image Search, users can upload an image, and Google then displays identical or similar images found on the Internet.

For example, I uploaded a photo to try it out.

Principles

According to an article by Dr. Neal Krawetz, the key technology behind this feature is the "perceptual hash algorithm", which generates a "fingerprint" (in string format) for each image and then compares the fingerprints of different images. The more similar the fingerprints are, the more similar the two images are. The key question is how to calculate the fingerprint from the image. The following describes the principle in the simplest steps:

Step 1: Reduce the image size

Reduce the image to 8x8, for a total of 64 pixels. This step discards detail and removes differences in image size and aspect ratio, keeping only basic information such as structure and brightness.

Step 2: Convert to a grayscale image

Convert the reduced image to a 64-level grayscale image.
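
As a rough illustration of this step, the sketch below converts one RGB pixel to a gray value and then quantizes it to 64 levels. The 30/59/11 integer weights are the ones used in the C# implementation later in this article. The class and method names are hypothetical, and quantizing by dropping the two low bits is an assumption, since the article does not say how the 64 levels are obtained.

```csharp
using System;

static class GrayscaleSketch
{
    // Weighted luminance in the range 0..255, using 30/59/11 integer weights.
    public static byte ToGray(byte r, byte g, byte b)
    {
        return (byte)((r * 30 + g * 59 + b * 11) / 100);
    }

    // Assumption: map a 0..255 gray value to one of 64 levels (0..63)
    // by discarding the two least significant bits.
    public static byte To64Levels(byte gray)
    {
        return (byte)(gray >> 2);
    }
}
```

Because every pixel is later compared only against the average, the exact number of gray levels matters less than applying the same conversion to every image.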

Step 3: Calculate the average gray value

Calculate the average gray value of all 64 pixels.

Step 4: Compare each pixel's gray value

Compare the gray value of each pixel with the average. If it is greater than or equal to the average, record a 1; if it is less than the average, record a 0.

Step 5: Calculate the hash value

Combining the comparison results from the previous step yields a 64-bit integer, which is the fingerprint of the image. The order in which the bits are combined does not matter, as long as every image uses the same order.
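
Steps 3 to 5 can be sketched together as follows. This is a minimal illustration, assuming the 64 gray values of the 8x8 image are already available as a byte array; the class and method names are hypothetical. It packs the bits into a `ulong` with the first pixel as the most significant bit, which is only one possible ordering.

```csharp
using System;

static class AverageHashSketch
{
    // Average the gray values, threshold each pixel against the average,
    // and pack the resulting 64 bits into one unsigned integer.
    public static ulong ComputeFingerprint(byte[] grayValues)
    {
        if (grayValues == null || grayValues.Length != 64)
            throw new ArgumentException("Expected exactly 64 gray values (an 8x8 image).");

        int sum = 0;
        foreach (byte value in grayValues)
            sum += value;
        int average = sum / grayValues.Length;   // integer (truncating) average

        ulong fingerprint = 0;
        for (int i = 0; i < grayValues.Length; i++)
        {
            fingerprint <<= 1;                   // make room for the next bit
            if (grayValues[i] >= average)
                fingerprint |= 1UL;              // 1 when pixel >= average, else 0
        }
        return fingerprint;
    }
}
```

A `ulong` keeps the fingerprint compact; the C# implementation later in this article stores the same 64 bits as a string of '0' and '1' characters instead.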

Step 6: Compare image fingerprints

After obtaining the fingerprints, we can compare different images by counting how many of the 64 bits differ (the Hamming distance). If no more than 5 bits differ, the two images are very similar. If more than 10 bits differ, they are two different images.
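
The comparison can be sketched as an XOR followed by a bit count. The names below are hypothetical. The article gives no verdict for distances from 6 to 10, so the "uncertain" label for that range is an assumption made only for this example.

```csharp
using System;

static class FingerprintCompareSketch
{
    // Number of bit positions in which two 64-bit fingerprints differ.
    public static int HammingDistance(ulong a, ulong b)
    {
        ulong diff = a ^ b;          // bits set where a and b disagree
        int count = 0;
        while (diff != 0)
        {
            diff &= diff - 1;        // clear the lowest set bit
            count++;
        }
        return count;
    }

    // Apply the thresholds from the text: <= 5 very similar, > 10 different.
    public static string Classify(ulong a, ulong b)
    {
        int distance = HammingDistance(a, b);
        if (distance <= 5) return "very similar";
        if (distance > 10) return "different";
        return "uncertain";          // assumption: range not covered by the article
    }
}
```

On .NET Core 3.0 and later, `System.Numerics.BitOperations.PopCount` could replace the manual loop.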

Code Implementation (C# Version)

Below I implement the steps described in the previous section in C#.

    using System;
    using System.IO;
    using System.Drawing;

    namespace SimilarPhoto
    {
        class SimilarPhoto
        {
            Image SourceImg;

            public SimilarPhoto(string filePath)
            {
                SourceImg = Image.FromFile(filePath);
            }

            public SimilarPhoto(Stream stream)
            {
                SourceImg = Image.FromStream(stream);
            }

            // Returns the fingerprint as a string of '0' and '1' characters
            public String GetHash()
            {
                Image image = ReduceSize();
                Byte[] grayValues = ReduceColor(image);
                Byte average = CalcAverage(grayValues);
                String result = ComputeBits(grayValues, average);
                return result;
            }

            // Step 1: Reduce size to 8*8
            private Image ReduceSize(int width = 8, int height = 8)
            {
                Image image = SourceImg.GetThumbnailImage(width, height, () => { return false; }, IntPtr.Zero);
                return image;
            }

            // Step 2: Reduce color (weighted gray value in 0..255, not quantized to 64 levels)
            private Byte[] ReduceColor(Image image)
            {
                Bitmap bitMap = new Bitmap(image);
                Byte[] grayValues = new Byte[image.Width * image.Height];

                for (int x = 0; x < image.Width; x++)
                    for (int y = 0; y < image.Height; y++)
                    {
                        Color color = bitMap.GetPixel(x, y);
                        byte grayValue = (byte)((color.R * 30 + color.G * 59 + color.B * 11) / 100);
                        // Row-major index; also stays in range for non-square sizes
                        grayValues[y * image.Width + x] = grayValue;
                    }
                return grayValues;
            }

            // Step 3: Average the colors
            private Byte CalcAverage(byte[] values)
            {
                int sum = 0;
                for (int i = 0; i < values.Length; i++)
                    sum += (int)values[i];
                return Convert.ToByte(sum / values.Length);
            }

            // Step 4: Compute the bits
            private String ComputeBits(byte[] values, byte averageValue)
            {
                char[] result = new char[values.Length];
                for (int i = 0; i < values.Length; i++)
                {
                    if (values[i] < averageValue)
                        result[i] = '0';
                    else
                        result[i] = '1';
                }
                return new String(result);
            }

            // Compare hash: number of positions at which the two fingerprints differ
            public static Int32 CalcSimilarDegree(string a, string b)
            {
                if (a.Length != b.Length)
                    throw new ArgumentException();
                int count = 0;
                for (int i = 0; i < a.Length; i++)
                {
                    if (a[i] != b[i])
                        count++;
                }
                return count;
            }
        }
    }

Google's servers hold tens of billions of images, and the collection on my computer certainly cannot compare. However, I wrote a crawler some time ago and have more than 40,000 pictures of people on my machine, so I used them for comparison. I calculated the fingerprints of these images and stored them in a TXT file.

I then wrote a simple ASP.NET page that lets the user upload an image. The back end calculates the fingerprint of the uploaded image, compares it with the fingerprint of every image in the TXT file, and displays the sorted results on the page.
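
The lookup described above might be sketched as follows. The layout of the fingerprint file is not reproduced in this text, so the `<file name><TAB><fingerprint>` line format is a hypothetical one, as are the class and method names. Fingerprints are the '0'/'1' strings produced by `GetHash()` in the implementation above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FingerprintSearchSketch
{
    // Count differing characters between two equal-length '0'/'1' strings.
    public static int Distance(string a, string b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Fingerprints must have equal length.");
        int count = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) count++;
        return count;
    }

    // Rank stored images by distance to the query fingerprint, closest first.
    // Assumption: each line is "<file name>\t<fingerprint>"; other lines are skipped.
    public static List<KeyValuePair<string, int>> Rank(string queryHash, IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('\t'))
            .Where(parts => parts.Length == 2)
            .Select(parts => new KeyValuePair<string, int>(parts[0], Distance(queryHash, parts[1])))
            .OrderBy(pair => pair.Value)
            .ToList();
    }
}
```

With a file in that format, the lines could be supplied by `System.IO.File.ReadLines` on the fingerprint file. A linear scan like this is probably adequate for tens of thousands of fingerprints; a collection of the size Google handles would need a real index.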

Address: http://www.cnblogs.com/technology/archive/2012/07/12/Perceptual-Hash-Algorithm.html
