The Java source code can be downloaded for free at: http://download.csdn.net/detail/yjflinchong/4239243 (you can also use the images included there).
The program described below lets you look up pictures that are similar to a given picture.
Topic content excerpt:
Google "search for similar images": You can use an image to search for all similar images on the Internet.
Open the Google image search page:
Click to upload an original Angelababy image:
After you click search, Google finds similar images; the more similar an image is, the higher it ranks in the results.
What is the principle of this technology? How does the computer know that two images are similar?
According to Dr. Neal Krawetz, the key technology behind similar image search is the perceptual hash algorithm. It generates a "fingerprint" string for each image and then compares the fingerprints of different images. The closer the fingerprints are, the more similar the images are.
The simplest implementation, written in Java, works as follows:
Preprocessing: read the image.
Step 1: Reduce the size.
Reduce the image size to 8x8, with a total of 64 pixels. The purpose of this step is to remove the image details, retain only the basic information such as structure and brightness, and discard the image differences caused by different sizes and proportions.
Step 2: simplify the color.
Convert the reduced image to 64-level grayscale. That is to say, every pixel takes one of only 64 possible gray values.
Step 3: calculate the average value.
Calculate the average gray scale of all 64 pixels.
Step 4: Compare the gray scale of pixels.
Compare the gray scale of each pixel with the average value. If the value is greater than or equal to the average value, it is recorded as 1. If the value is smaller than the average value, it is recorded as 0.
Step 5: Calculate the hash value.
The comparison results from the previous step are combined to form a 64-bit integer, which is the fingerprint of the image. The order of the combination is not important, as long as the same order is used for all images.
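Steps 1 to 5 can be sketched in Java as follows. This is a minimal illustration and not the downloadable source mentioned earlier: the class name AverageHash, the method name fingerprint, the bilinear scaling hint, and the 30/59/11 luminance weights are all assumptions made for this example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper class; the name and layout are assumptions for illustration.
final class AverageHash {

    /** Computes a 64-bit fingerprint by following steps 1 to 5 above. */
    static long fingerprint(BufferedImage source) {
        // Step 1: shrink to 8x8, deliberately ignoring the original aspect ratio.
        BufferedImage small = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(source, 0, 0, 8, 8, null);
        g.dispose();

        // Step 2: map every pixel to one of 64 gray levels (0..63).
        int[] gray = new int[64];
        int sum = 0;
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int rgb = small.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int gn = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Assumed luminance weights; 0..255 is reduced to 0..63 by dropping 2 bits.
                int level = ((r * 30 + gn * 59 + b * 11) / 100) >> 2;
                gray[y * 8 + x] = level;
                sum += level;
            }
        }

        // Step 3: average gray level of the 64 pixels.
        double average = sum / 64.0;

        // Steps 4 and 5: one bit per pixel (1 if >= average), packed row by row.
        long hash = 0L;
        for (int i = 0; i < 64; i++) {
            hash <<= 1;
            if (gray[i] >= average) {
                hash |= 1L;
            }
        }
        return hash;
    }
}
```

An image loaded with javax.imageio.ImageIO.read can be passed directly to AverageHash.fingerprint. Bits are packed row by row here, but any fixed order works as long as it is the same for every image.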
After obtaining the fingerprints, you can compare two images by counting how many of the 64 bits differ. This is the Hamming distance between the two fingerprints. If no more than 5 bits differ, the two images are very similar. If more than 10 bits differ, they are different images.
You can take several images, compute their fingerprints, and compare the Hamming distances between them to see which images are similar.
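The comparison can be sketched in a few lines of Java. The class and method names below are assumptions for illustration. The thresholds of 5 and 10 come from the text above; the text does not say how to treat distances from 6 to 10, so the "uncertain" label for that range is also an assumption.

```java
// Hypothetical helper class; names are assumptions for illustration.
final class FingerprintCompare {

    /** Number of bit positions in which the two 64-bit fingerprints differ. */
    static int hammingDistance(long a, long b) {
        // XOR leaves a 1 exactly where the bits differ; bitCount counts those 1s.
        return Long.bitCount(a ^ b);
    }

    /** Applies the thresholds from the text: at most 5 is very similar, more than 10 is different. */
    static String classify(long a, long b) {
        int distance = hammingDistance(a, b);
        if (distance <= 5) {
            return "very similar";
        }
        if (distance > 10) {
            return "different";
        }
        return "uncertain"; // assumed label for the range the text leaves open
    }
}
```

Long.bitCount is part of the standard library, so no manual bit loop is needed.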
The advantage of this algorithm is that it is simple and fast, and it is not affected by scaling of the image. The disadvantage is that it does not tolerate changes to the image content: if you add some text to the image, the match fails. Its best use is therefore finding the source image from a thumbnail.
In practical applications, the more powerful pHash and SIFT algorithms are often used, since they can recognize deformed images. As long as the degree of deformation does not exceed 25%, they can match the source image. Although these algorithms are more complex, they share the same principle as the simple algorithm above: convert the image into a hash string first, then compare.
Use OpenCV to open the image (without OpenCV this seems hard to do):
// Win32TestPure.cpp: defines the entry point of the console application.
#include "stdafx.h"
// #include <atlstr.h> // CString, CEdit
#include "opencv2\opencv.hpp"
#include
This hash table only becomes useful once enough images have been added. The program is a rough model, and the details are not elaborated (this is my first time using hash_map). Comments are welcome.
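For readers following the Java route, the hash table idea could be sketched as below. It is not a port of the C++ program: it swaps hash_map for java.util.HashMap, and the class name FingerprintIndex and its methods are assumptions for illustration. Because similar images usually have close but not identical fingerprints, the lookup scans all stored fingerprints and keeps those within a chosen Hamming distance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index class; names are assumptions for illustration.
final class FingerprintIndex {

    // Maps each 64-bit fingerprint to the names of the images that produced it.
    private final Map<Long, List<String>> byFingerprint = new HashMap<>();

    /** Registers an image name under its fingerprint. */
    void add(long fingerprint, String imageName) {
        byFingerprint.computeIfAbsent(fingerprint, k -> new ArrayList<>()).add(imageName);
    }

    /** Returns the names of all images whose fingerprint is within maxDistance bits of the query. */
    List<String> findSimilar(long query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<Long, List<String>> entry : byFingerprint.entrySet()) {
            if (Long.bitCount(entry.getKey() ^ query) <= maxDistance) {
                matches.addAll(entry.getValue());
            }
        }
        return matches;
    }
}
```

The linear scan is the simplest choice and is adequate for a small collection. The map itself only gives constant-time lookup for exact fingerprint matches.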