From: http://www.coding123.net/article/20120822/csharp-google-similar-image-search-algorithm.aspx
Principles
According to Dr. Neal Krawetz's article, the key technology behind this feature is the perceptual hash algorithm, which generates a "fingerprint" (a string) for each image. The more similar two fingerprints are, the more similar the two images are. The key question is how to calculate the fingerprint from the image. The following describes the principle in the simplest steps:
Step 1: Reduce the image size
Reduce the image to 8x8, for a total of 64 pixels. This step removes the differences caused by image size and aspect ratio and keeps only basic information such as structure and brightness.
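The shrinking step can be sketched without any imaging library. This is a minimal sketch, not the article's method: it assumes the image is already available as a single-channel `byte[rows, columns]` array and uses plain block averaging, and the names `ShrinkSketch` and `Shrink` are hypothetical.

```csharp
using System;

static class ShrinkSketch
{
    // Downsample a single-channel image (rows x columns) to size x size by
    // averaging the source pixels that fall into each target cell.
    public static byte[,] Shrink(byte[,] src, int size = 8)
    {
        int h = src.GetLength(0), w = src.GetLength(1);
        var dst = new byte[size, size];
        for (int ty = 0; ty < size; ty++)
        {
            // Source row range for this target row (at least one row).
            int y0 = ty * h / size;
            int y1 = Math.Max(y0 + 1, (ty + 1) * h / size);
            for (int tx = 0; tx < size; tx++)
            {
                // Source column range for this target column (at least one column).
                int x0 = tx * w / size;
                int x1 = Math.Max(x0 + 1, (tx + 1) * w / size);
                long sum = 0;
                for (int y = y0; y < y1; y++)
                    for (int x = x0; x < x1; x++)
                        sum += src[y, x];
                dst[ty, tx] = (byte)(sum / ((y1 - y0) * (x1 - x0)));
            }
        }
        return dst;
    }
}
```

Calling `ShrinkSketch.Shrink(pixels)` on an image of any size returns an 8x8 array, so images of different dimensions become directly comparable.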
Step 2: Convert to grayscale
Convert the reduced image to a 64-level grayscale image.
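As a rough illustration of this step, the sketch below converts one RGB pixel to a gray value using the 30/59/11 weights that appear in the article's C# code, then reduces it to 64 levels by dropping the two lowest bits. The class and method names are hypothetical, and the bit shift is only one possible way to quantize to 64 levels.

```csharp
static class GrayscaleSketch
{
    // Weighted luminance (30% R, 59% G, 11% B); result is in 0..255.
    public static byte ToGray(byte r, byte g, byte b)
        => (byte)((r * 30 + g * 59 + b * 11) / 100);

    // Quantize the 0..255 gray value to one of 64 levels (0..63).
    public static byte ToGray64(byte r, byte g, byte b)
        => (byte)(ToGray(r, g, b) >> 2);
}
```

Because the later steps only compare each pixel with the average, the hash is fairly insensitive to whether 64 or 256 gray levels are used.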
Step 3: Calculate the average gray value
Calculate the average gray value of all 64 pixels.
Step 4: Compare each pixel's gray value
Compare the gray value of each pixel with the average. If it is greater than or equal to the average, record a 1; if it is less than the average, record a 0.
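Step 5: Calculate the hash value
Combining the comparison results from the previous step yields a 64-bit binary integer, which is the fingerprint of the image. The order in which the bits are combined does not matter, as long as every image uses the same order.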
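Steps 3 to 5 can be sketched together as follows. This is a minimal sketch that assumes the 64 gray values are already in a `byte[]` and packs the result into a `ulong` rather than a string of '0' and '1' characters. `AverageHashSketch` and `Compute` are hypothetical names.

```csharp
using System;

static class AverageHashSketch
{
    // Build a 64-bit fingerprint from 64 gray values:
    // bit = 1 if the pixel is >= the average, otherwise 0.
    public static ulong Compute(byte[] gray)
    {
        if (gray.Length != 64)
            throw new ArgumentException("expected 64 gray values", nameof(gray));

        // Step 3: average gray value (integer division).
        int sum = 0;
        foreach (byte v in gray) sum += v;
        int average = sum / gray.Length;

        // Steps 4 and 5: compare each pixel and pack the bits, first pixel
        // in the most significant position.
        ulong hash = 0;
        for (int i = 0; i < gray.Length; i++)
        {
            hash <<= 1;
            if (gray[i] >= average) hash |= 1UL;
        }
        return hash;
    }
}
```

A `ulong` holds exactly 64 bits, so the whole fingerprint fits in one machine word and can be compared with a single XOR.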
Step 6: Compare image fingerprints
After obtaining the fingerprints, we can compare the fingerprints of different images by counting how many of the 64 bits differ (the Hamming distance). If no more than 5 bits differ, the two images are very similar. If more than 10 bits differ, they are two different images.
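The comparison can be sketched as a Hamming distance over two 64-bit values. This sketch assumes the fingerprints are stored as `ulong` and that `System.Numerics.BitOperations` is available (.NET Core 3.0 or later). The names are hypothetical, and the "uncertain" label for distances of 6 to 10 is an assumption, since the text only names the two outer thresholds.

```csharp
using System.Numerics;

static class HammingSketch
{
    // XOR leaves a 1 wherever the two fingerprints differ;
    // PopCount counts those 1 bits.
    public static int Distance(ulong a, ulong b)
        => BitOperations.PopCount(a ^ b);

    // Thresholds from the text: <= 5 very similar, > 10 different.
    public static string Classify(int distance) =>
        distance <= 5 ? "very similar" :
        distance > 10 ? "different" : "uncertain";
}
```

For example, `HammingSketch.Distance(0b1011, 0b0001)` is 2, because the two values differ in exactly two bit positions.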
Code Implementation (C# Version)
Below I implement the steps described in the previous section in C#.
using System;
using System.Drawing;
using System.IO;

namespace SimilarPhoto
{
    class SimilarPhoto
    {
        Image sourceImg;

        public SimilarPhoto(string filePath)
        {
            sourceImg = Image.FromFile(filePath);
        }

        public SimilarPhoto(Stream stream)
        {
            sourceImg = Image.FromStream(stream);
        }

        public string GetHash()
        {
            Image image = ReduceSize();
            byte[] grayValues = ReduceColor(image);
            byte average = CalcAverage(grayValues);
            string result = ComputeBits(grayValues, average);
            return result;
        }

        // Step 1: Reduce size to 8*8
        private Image ReduceSize(int width = 8, int height = 8)
        {
            Image image = sourceImg.GetThumbnailImage(width, height, () => false, IntPtr.Zero);
            return image;
        }

        // Step 2: Reduce color
        private byte[] ReduceColor(Image image)
        {
            Bitmap bitmap = new Bitmap(image);
            byte[] grayValues = new byte[image.Width * image.Height];
            for (int x = 0; x < image.Width; x++)
                for (int y = 0; y < image.Height; y++)
                {
                    Color color = bitmap.GetPixel(x, y);
                    // Weighted gray value: 30% red, 59% green, 11% blue.
                    byte grayValue = (byte)((color.R * 30 + color.G * 59 + color.B * 11) / 100);
                    // Row-major index, which stays in range for non-square sizes too.
                    grayValues[y * image.Width + x] = grayValue;
                }
            return grayValues;
        }

        // Step 3: Average the colors
        private byte CalcAverage(byte[] values)
        {
            int sum = 0;
            for (int i = 0; i < values.Length; i++)
                sum += (int)values[i];
            return Convert.ToByte(sum / values.Length);
        }

        // Step 4: Compute the bits
        private string ComputeBits(byte[] values, byte averageValue)
        {
            char[] result = new char[values.Length];
            for (int i = 0; i < values.Length; i++)
            {
                if (values[i] < averageValue)
                    result[i] = '0';
                else
                    result[i] = '1';
            }
            return new string(result);
        }

        // Compare hash: count the positions where the two fingerprints differ
        public static Int32 CalcSimilarDegree(string a, string b)
        {
            if (a.Length != b.Length)
                throw new ArgumentException();
            int count = 0;
            for (int i = 0; i < a.Length; i++)
            {
                if (a[i] != b[i])
                    count++;
            }
            return count;
        }
    }
}
Google's servers hold tens of billions of images, and the number of images on my computer is nowhere near that. However, I once wrote a crawler, so I have more than 40,000 pictures of people on my computer, and I used them for comparison. I calculated the "fingerprints" of these images and stored them in a TXT file.
I then wrote a simple ASP.NET page that lets the user upload an image. The backend calculates the fingerprint of the uploaded image, compares it with the fingerprint of every image in the TXT file, and displays the sorted results on the page.
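The lookup performed by such a page can be sketched as a linear scan over the stored fingerprints. This is a sketch under assumptions, not the author's page code: the fingerprints are assumed to be already loaded into a dictionary from file name to bit string (the layout of the TXT file is not shown in the article), and `FingerprintSearchSketch`, `Rank`, and the `maxDistance` cutoff are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FingerprintSearchSketch
{
    // Number of positions at which two equal-length bit strings differ.
    static int Distance(string a, string b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("fingerprints must have equal length");
        return a.Zip(b, (x, y) => x != y ? 1 : 0).Sum();
    }

    // Compare the query with every stored fingerprint, drop entries that
    // differ in more than maxDistance bits, and sort the closest first.
    public static List<KeyValuePair<string, int>> Rank(
        string query, IDictionary<string, string> index, int maxDistance = 10)
    {
        return index
            .Select(e => new KeyValuePair<string, int>(e.Key, Distance(query, e.Value)))
            .Where(r => r.Value <= maxDistance)
            .OrderBy(r => r.Value)
            .ToList();
    }
}
```

A linear scan like this does one string comparison per stored image, which is manageable for tens of thousands of fingerprints.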
Source: http://www.cnblogs.com/technology/archive/2012/07/12/Perceptual-Hash-Algorithm.html