Java Advanced (57): pHash image registration based on a perceptual hashing algorithm

After I submitted my graduation thesis, my advisor gave me a task: image registration, that is, given two images, have the system determine whether they are the same image. As a complete beginner in this area, I first searched online for suitable detection methods. A ready-made API would of course be best, and cost was not a concern.

The underlying technique used here is called a perceptual hash algorithm. It generates a "fingerprint" string for each image and then compares the fingerprints of different images. The closer the fingerprints are, the more similar the images.

Perceptual hashing algorithm

Here is one of the simplest implementations:

The first step is to reduce the size.

Reduce the image to 8x8, 64 pixels in total. This step removes detail and keeps only basic information such as structure and light/dark shading, discarding differences caused by image size and aspect ratio.

The second step is to simplify the color.

Convert the reduced image to 64 levels of gray, so that every pixel takes one of only 64 values.

The third step is to calculate the average.

Calculate the average gray level of all 64 pixels.

The fourth step is to compare each pixel's gray level.

Compare the gray level of each pixel with the average. Record 1 if it is greater than or equal to the average, and 0 if it is less.

The fifth step is to calculate the hash value.

Combining the results of the previous step gives a 64-bit integer, which is the fingerprint of the image. The order of combination does not matter, as long as every image uses the same order.
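The five steps above can be sketched in Java roughly as follows. This is only a minimal illustration, not the code of any particular library: the class name `AverageHashSketch`, the method `averageHash`, the bilinear scaling hint, and the row-by-row bit order are all assumptions made for the example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Minimal average-hash sketch; all names here are illustrative only. */
final class AverageHashSketch {

    /** Returns a 64-bit average hash, scanning the 8x8 image row by row. */
    static long averageHash(BufferedImage src) {
        // Steps 1 and 2: shrink to 8x8 and convert to grayscale in one draw call.
        BufferedImage small = new BufferedImage(8, 8, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, 8, 8, null);
        g.dispose();

        // Step 2 (continued): quantize 256 gray levels down to 64.
        int[] gray = new int[64];
        int sum = 0;
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int level = small.getRaster().getSample(x, y, 0) >> 2;
                gray[y * 8 + x] = level;
                sum += level;
            }
        }

        // Step 3: average gray level of the 64 pixels.
        double mean = sum / 64.0;

        // Steps 4 and 5: one bit per pixel, 1 if the pixel is >= the mean.
        long hash = 0L;
        for (int i = 0; i < 64; i++) {
            hash <<= 1;
            if (gray[i] >= mean) {
                hash |= 1L;
            }
        }
        return hash;
    }
}
```

Calling `AverageHashSketch.averageHash(ImageIO.read(file))` on two images gives two `long` fingerprints that can then be compared bit by bit. Packing the bits into a `long` rather than a string is a design choice of this sketch; any fixed order works as long as it is the same for every image.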

Once you have the fingerprints, you can compare images by counting how many of the 64 bits differ. This count is the Hamming distance. If no more than 5 bits differ, the two images are very similar; if more than 10 bits differ, they are two different images.
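For fingerprints stored as 64-bit integers, the comparison can be sketched as below. The class and method names are hypothetical, and the "uncertain" label for distances from 6 to 10 is an assumption of this example, since the text only gives the two outer thresholds.

```java
/** Hamming-distance sketch for 64-bit fingerprints; names are illustrative only. */
final class HammingSketch {

    /** Counts the bit positions at which the two fingerprints differ. */
    static int hammingDistance(long a, long b) {
        // XOR leaves a 1 exactly where the bits differ; bitCount counts those 1s.
        return Long.bitCount(a ^ b);
    }

    /** Applies the thresholds from the text: at most 5 is similar, more than 10 is different. */
    static String classify(long a, long b) {
        int d = hammingDistance(a, b);
        if (d <= 5) {
            return "similar";
        }
        if (d > 10) {
            return "different";
        }
        return "uncertain"; // 6..10: not covered by the text, labeled here by assumption
    }
}
```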

A concrete implementation is imgHash.py, written in Python by Wote. The code is short, only 53 lines. Its first argument is the reference image and its second is the directory of images to compare against; the output is the number of differing bits (the Hamming distance) between the two images.

The advantages of this algorithm are that it is simple, fast, and unaffected by image scaling. The disadvantage is that the image content must not change: if you add a few words of text to the image, it is no longer recognized. Its best use is therefore finding the original image from a thumbnail.

In practical applications, the more powerful pHash and SIFT algorithms are often used, because they can recognize deformed images. As long as the deformation does not exceed 25%, they can match the original image. pHash is more complex than the simple algorithm above, but the principle is the same: first convert the image into a hash string, then compare. (SIFT works somewhat differently, matching local feature descriptors rather than a single hash.)

The average hash is simple, but it is strongly affected by the mean. For example, gamma correction or histogram equalization changes the mean, which changes the final hash value. A more robust algorithm is pHash, which takes the averaging approach to the extreme: it uses the discrete cosine transform (DCT) to extract the low-frequency components of the image.

The discrete cosine transform (DCT) is a transform widely used in image compression that maps an image from the pixel domain to the frequency domain. Typical images contain a lot of redundancy and correlation, so after the transform only a small fraction of the frequency coefficients are significantly nonzero; most are zero or close to zero.
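The energy-compaction property can be checked on a small one-dimensional example. The sketch below is an assumption-laden illustration, not part of the pHash code: it uses an orthonormal 1-D DCT-II (the image case applies the same transform along rows and columns), and the class and method names are made up for this example.

```java
/** 1-D DCT energy-compaction sketch; names are illustrative only. */
final class DctCompactionSketch {

    /** Orthonormal 1-D DCT-II of the input signal. */
    static double[] dct(double[] x) {
        int n = x.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += x[i] * Math.cos(Math.PI / n * (i + 0.5) * k);
            }
            // Scaling that makes the transform orthonormal (it preserves total energy).
            double scale = (k == 0) ? Math.sqrt(1.0 / n) : Math.sqrt(2.0 / n);
            out[k] = scale * sum;
        }
        return out;
    }

    /** Fraction of the total squared magnitude held by the first `keep` coefficients. */
    static double energyFraction(double[] coeffs, int keep) {
        double kept = 0.0;
        double total = 0.0;
        for (int k = 0; k < coeffs.length; k++) {
            double e = coeffs[k] * coeffs[k];
            total += e;
            if (k < keep) {
                kept += e;
            }
        }
        return kept / total;
    }
}
```

For a smooth signal such as a 32-sample linear ramp (0, 1, ..., 31), the first four coefficients returned by `dct` hold more than 99% of the energy according to `energyFraction`, which is why keeping only the low-frequency corner of the DCT loses little structural information.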

pHash

pHash works as follows:

- (1) Reduce the size: pHash starts with a small image, but one larger than 8*8; 32*32 works well. The purpose is to simplify the DCT computation, not to reduce the high frequencies.
- (2) Simplify the color: convert the image to grayscale to further reduce the amount of computation.
- (3) Calculate the DCT: compute the DCT of the image, which gives a 32*32 matrix of DCT coefficients.
- (4) Reduce the DCT: although the DCT result is a 32*32 matrix, keep only the 8*8 block in the upper-left corner, which holds the lowest frequencies in the image.
- (5) Calculate the average: as with the average hash, compute the mean, this time of the DCT values in the 8*8 block. The demo code in this article excludes the first (DC) coefficient from the mean, because it can differ greatly from the other values and would skew the average.
- (6) Calculate the hash value: this is the most important step. Set each bit of the 64-bit hash from the 8*8 DCT block: "1" if the value is greater than or equal to the DCT mean, "0" if it is less. Combined, the bits form a 64-bit integer, which is the fingerprint of the image.

The result does not tell us the actual low frequencies; it only tells us, very roughly, how each frequency compares with the mean. As long as the overall structure of the image stays the same, the hash stays the same, so it survives gamma correction and color histogram adjustments.
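A simplified case of this robustness can be demonstrated directly. Because the DCT is linear, multiplying every pixel by a positive constant multiplies every coefficient and the mean by the same constant, and adding a constant to every pixel changes only the DC coefficient. The sketch below assumes an 8*8 low-frequency block is already available; the names `DctHashSketch`, `hashFromDct`, and `scale` are hypothetical, the bits are set with "greater than or equal" as in step (6), and the mean skips the DC term. Note that this shows only uniform scaling and offset, which is a simpler situation than real gamma correction.

```java
/** Sketch of steps (5) and (6) on an 8x8 low-frequency block; names are illustrative only. */
final class DctHashSketch {

    /** Builds a 64-bit hash by comparing each coefficient with the mean (DC excluded from the mean). */
    static long hashFromDct(double[][] low) {
        double total = 0.0;
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                if (u != 0 || v != 0) { // skip the DC coefficient at [0][0]
                    total += low[u][v];
                }
            }
        }
        double mean = total / 63.0;

        long hash = 0L;
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                hash <<= 1;
                if (low[u][v] >= mean) {
                    hash |= 1L;
                }
            }
        }
        return hash;
    }

    /** Returns a copy of the block with every coefficient multiplied by `factor`. */
    static double[][] scale(double[][] m, double factor) {
        double[][] out = new double[8][8];
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                out[u][v] = m[u][v] * factor;
            }
        }
        return out;
    }
}
```

With this sketch, `hashFromDct(m)` and `hashFromDct(scale(m, 2.0))` are equal for any block `m`, and changing only `m[0][0]` can affect at most the first bit of the hash.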

As with the average hash, pHash values are compared using the Hamming distance: compare the two hashes position by position and count the bits that differ.

Let's turn the theory above into a concrete demo implementation in Java:

```java
import java.awt.Graphics2D;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

import javax.imageio.ImageIO;

/**
 * Function: Java implementation of image similarity detection with Hamming distance.
 * pHash-like image hash.
 * Author: sun huaqiang
 * Based on: http://www.hackerfactor.com/blog/index.php?/archives/432-looks-like-it.html
 */
public class ImagePHash {

    private int size = 32;
    private int smallerSize = 8;

    public ImagePHash() {
        initCoefficients();
    }

    private ImagePHash(int size, int smallerSize) {
        this.size = size;
        this.smallerSize = smallerSize;
        initCoefficients();
    }

    /** Hamming distance: number of positions at which the two hash strings differ. */
    private int distance(String s1, String s2) {
        int counter = 0;
        for (int k = 0; k < s1.length(); k++) {
            if (s1.charAt(k) != s2.charAt(k)) {
                counter++;
            }
        }
        return counter;
    }

    // Returns a "binary string" (like 001010111011100010) that is easy to do a Hamming distance on.
    private String getHash(InputStream is) throws Exception {
        BufferedImage img = ImageIO.read(is);

        /* 1. Reduce size. Like the average hash, pHash starts with a small image.
         * However, the image is larger than 8x8; 32x32 is a good size. This is done
         * to simplify the DCT computation, not to reduce the high frequencies. */
        img = resize(img, size, size);

        /* 2. Reduce color. The image is reduced to grayscale just to further
         * simplify the number of computations. */
        img = grayscale(img);

        double[][] vals = new double[size][size];
        for (int x = 0; x < img.getWidth(); x++) {
            for (int y = 0; y < img.getHeight(); y++) {
                vals[x][y] = getBlue(img, x, y);
            }
        }

        /* 3. Compute the DCT. The DCT (discrete cosine transform) separates the image
         * into a collection of frequencies and scalars. While JPEG uses an 8x8 DCT,
         * this algorithm uses a 32x32 DCT. */
        long start = System.currentTimeMillis();
        double[][] dctVals = applyDCT(vals);
        // System.out.println("DCT_COST_TIME: " + (System.currentTimeMillis() - start));

        /* 4. Reduce the DCT. While the DCT is 32x32, just keep the top-left 8x8.
         * Those represent the lowest frequencies in the picture. */

        /* 5. Compute the average value. Like the average hash, compute the mean DCT
         * value, using only the 8x8 low-frequency values and excluding the first term,
         * since the DC coefficient can be significantly different from the other
         * values and would throw off the average. */
        double total = 0;
        for (int x = 0; x < smallerSize; x++) {
            for (int y = 0; y < smallerSize; y++) {
                total += dctVals[x][y];
            }
        }
        total -= dctVals[0][0];
        double avg = total / (double) ((smallerSize * smallerSize) - 1);

        /* 6. Further reduce the DCT. Set each hash bit to 0 or 1 depending on whether
         * the DCT value is above or below the average. The result doesn't tell us the
         * actual low frequencies; it just tells us the very rough relative scale of
         * the frequencies to the mean. The result will not vary as long as the overall
         * structure of the image remains the same; this can survive gamma and color
         * histogram adjustments without a problem. */
        StringBuilder hash = new StringBuilder();
        for (int x = 0; x < smallerSize; x++) {
            for (int y = 0; y < smallerSize; y++) {
                // Note: this condition skips the whole first row and column, so the
                // hash has 49 bits rather than 64. Use (x != 0 || y != 0) to drop
                // only the DC term.
                if (x != 0 && y != 0) {
                    hash.append(dctVals[x][y] > avg ? "1" : "0");
                }
            }
        }
        return hash.toString();
    }

    private BufferedImage resize(BufferedImage image, int width, int height) {
        BufferedImage resizedImage = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = resizedImage.createGraphics();
        g.drawImage(image, 0, 0, width, height, null);
        g.dispose();
        return resizedImage;
    }

    private ColorConvertOp colorConvert =
            new ColorConvertOp(ColorSpace.getInstance(ColorSpace.CS_GRAY), null);

    private BufferedImage grayscale(BufferedImage img) {
        colorConvert.filter(img, img);
        return img;
    }

    // After grayscale conversion all channels are equal, so the blue channel is the gray level.
    private static int getBlue(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y)) & 0xff;
    }

    // DCT function stolen from
    // http://stackoverflow.com/questions/4240490/problems-with-dct-and-idct-algorithm-in-java
    private double[] c;

    private void initCoefficients() {
        c = new double[size];
        for (int i = 1; i < size; i++) {
            c[i] = 1;
        }
        c[0] = 1 / Math.sqrt(2.0);
    }

    private double[][] applyDCT(double[][] f) {
        int N = size;
        double[][] F = new double[N][N];
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double sum = 0.0;
                for (int i = 0; i < N; i++) {
                    for (int j = 0; j < N; j++) {
                        sum += Math.cos(((2 * i + 1) / (2.0 * N)) * u * Math.PI)
                                * Math.cos(((2 * j + 1) / (2.0 * N)) * v * Math.PI)
                                * f[i][j];
                    }
                }
                sum *= (c[u] * c[v]) / 4.0;
                F[u][v] = sum;
            }
        }
        return F;
    }

    /**
     * @param img1 path of the first image
     * @param img2 path of the second image
     * @param tv   largest Hamming distance still treated as a match
     * @return true if the distance between the two hashes is at most tv
     */
    public boolean imgChk(String img1, String img2, int tv) {
        ImagePHash p = new ImagePHash();
        try (InputStream in1 = new FileInputStream(new File(img1));
             InputStream in2 = new FileInputStream(new File(img2))) {
            String image1 = p.getHash(in1);
            String image2 = p.getHash(in2);
            int dt = p.distance(image1, image2);
            System.out.println("[" + img1 + "] : [" + img2 + "] Score is " + dt);
            if (dt <= tv) {
                return true;
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }

    public static void main(String[] args) {
        ImagePHash p = new ImagePHash();
        String imagePath = "c:/users/shq/desktop/image/";
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "2.jpg", 10));
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "3.jpg", 10));
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "4.jpg", 10));
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "5.jpg", 10));
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "6.png", 10));
        System.out.println(p.imgChk(imagePath + "1.jpg", imagePath + "7.jpg", 10));
        System.out.println(p.imgChk(imagePath + "2.jpg", imagePath + "3.jpg", 10));
    }
}
```

Test results

The larger the Hamming distance, the greater the difference between the images. If no more than 5 bits differ, the two images are very similar; if more than 10 bits differ, they are two distinct images. The results show that images 1, 5, 6, and 7 are similar, while the distances among images 1, 2, and 3 are too large, so those are different pictures.

Attached test pictures

Figure 1 1.jpg

Figure 2 2.jpg

Figure 3 3.jpg

Figure 4 4.jpg

Figure 5 5.png

Figure 6 6.jpg (thumbnail of Figure 1)

Figure 7 7.jpg (thumbnail of Figure 1)
