Principles of Google Image Search

For this question, I consulted colleagues in the algorithm group, who shared their basic ideas:

This image search algorithm generally involves three steps:

1. Feature extraction. Many algorithms can extract features from the target image and describe it; the most common are SIFT, fingerprint algorithms, bundling features, and hash functions. You can also design different algorithms for different kinds of images, for example extracting features by comparing the images' local n-th-order moments.

2. Feature encoding. Encode the image's feature information, and index the massive image collection as a lookup table. For a target image with a large resolution, downsample it first to reduce the amount of computation before feature extraction and encoding.

3. Similarity matching. Use the target image's encoded value to compute its global or local similarity against the image database behind the search engine; set a threshold according to the robustness required and keep the images with high similarity; finally, filter out the best-matching image, which still calls for a feature-detection algorithm.

Each step involves a great deal of algorithmic research, drawing on mathematics, statistics, image coding, signal processing, and other theory.

The following is a simple explanation by Ruan Yifeng:

If you give Google an image URL or upload an image directly, Google will find similar images. The figure below shows the American actress Alyson Hannigan.

After the upload, Google returns the following results:

What is the principle of this technology? How does the computer know that two images are similar?

According to Dr. Neal Krawetz's explanation, the principle is very simple and easy to understand, and a quick algorithm can achieve basic results.

The key technology here is the "perceptual hash algorithm". Its job is to generate a "fingerprint" string for each image and then compare the fingerprints of different images. The closer the fingerprints, the more similar the images.

The following is a simple implementation:

Step 1: Reduce the size.

Shrink the image to 8x8, 64 pixels in all. This step removes the image's details, keeping only basic information such as structure and brightness, and discards the differences caused by different sizes and aspect ratios.
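
As a concrete sketch of this step (assuming Python with the Pillow library, and a placeholder file name photo.jpg):

```python
from PIL import Image

# Shrink to 8x8 (64 pixels); a high-quality filter such as LANCZOS
# keeps the coarse structure while discarding detail, size and aspect ratio.
img = Image.open("photo.jpg").resize((8, 8), Image.LANCZOS)
```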

Step 2: Simplify the color.

Convert the shrunken image to 64-level grayscale; in other words, every pixel takes one of only 64 possible values.
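
Continuing the sketch: Pillow's convert("L") produces 256 grey levels, so an extra integer division quantizes them down to 64:

```python
from PIL import Image

img = Image.open("photo.jpg").resize((8, 8), Image.LANCZOS)
# convert("L") gives 256 grey levels; integer division by 4 leaves 64 levels (0-63).
pixels = [p // 4 for p in img.convert("L").getdata()]
```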

Step 3: Calculate the average value.

Calculate the average gray level of all 64 pixels.
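
In the sketch, this is just the arithmetic mean of the 64 values:

```python
from PIL import Image

img = Image.open("photo.jpg").resize((8, 8), Image.LANCZOS).convert("L")
pixels = [p // 4 for p in img.getdata()]
avg = sum(pixels) / len(pixels)  # average grey level of all 64 pixels
```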

Step 4: Compare the gray scale of pixels.

Compare the gray level of each pixel with the average: if it is greater than or equal to the average, record a 1; if it is below the average, record a 0.
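
One more line of the sketch turns the 64 grey levels into 64 bits:

```python
from PIL import Image

img = Image.open("photo.jpg").resize((8, 8), Image.LANCZOS).convert("L")
pixels = [p // 4 for p in img.getdata()]
avg = sum(pixels) / len(pixels)
# 1 where the pixel is at least as bright as the average, 0 where it is darker.
bits = [1 if p >= avg else 0 for p in pixels]
```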

Step 5: Calculate the hash value.

Combine the results of the previous step into a 64-bit integer; this is the image's fingerprint. The order in which the bits are combined does not matter, as long as the same order is used for every image.
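
Putting the five steps together, here is a minimal self-contained sketch of the fingerprint function (an illustration in the spirit of the algorithm, not the original imghash.py):

```python
from PIL import Image

def average_hash(path):
    """64-bit perceptual fingerprint of the image at `path`."""
    # Steps 1-2: shrink to 8x8 and reduce to 64 grey levels.
    img = Image.open(path).resize((8, 8), Image.LANCZOS).convert("L")
    pixels = [p // 4 for p in img.getdata()]
    # Step 3: average grey level.
    avg = sum(pixels) / len(pixels)
    # Steps 4-5: one bit per pixel, packed into a 64-bit integer.
    # The bit order is arbitrary, but must be the same for every image.
    fingerprint = 0
    for p in pixels:
        fingerprint = (fingerprint << 1) | (1 if p >= avg else 0)
    return fingerprint
```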

Once you have the fingerprints, you can compare different images by counting how many of the 64 bits differ. In theory, this is equivalent to computing the "Hamming distance". If no more than 5 bits differ, the two images are very similar; if more than 10 differ, they are probably two different images.
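
Comparing two fingerprints then amounts to counting the 1-bits in their XOR. A small sketch, reusing the average_hash function above and placeholder file names:

```python
def hamming_distance(h1, h2):
    """Number of bit positions at which two 64-bit fingerprints differ."""
    return bin(h1 ^ h2).count("1")

d = hamming_distance(average_hash("a.jpg"), average_hash("b.jpg"))
print(d)  # <= 5: very similar; > 10: probably different images
```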

For concrete code, see imghash.py, written in Python by Wote. The code is very short, only 53 lines. When run, the first parameter is the reference image and the second is the directory of images to compare against; it returns the number of differing bits (the Hamming distance) between the two images.

The advantage of this algorithm is that it is simple and fast and is unaffected by scaling of the image. Its disadvantage is that the image content must not change: add a little text to the image and it will no longer be recognized. Its best use is therefore finding the source image from a thumbnail.

In practice, more powerful algorithms such as pHash and SIFT are often used; they can recognize deformed images and, as long as the deformation does not exceed 25%, still match the source image. Although these algorithms are much more complex, the principle is the same as the simple algorithm above: reduce each image to a hash string, then compare.
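
For a flavor of the DCT-based pHash idea, here is a rough sketch (assuming NumPy, SciPy, and Pillow; real implementations differ in details, e.g. whether the DC coefficient is kept):

```python
import numpy as np
from scipy.fft import dct
from PIL import Image

def phash(path):
    """DCT-based perceptual hash: fingerprint the low-frequency structure."""
    img = Image.open(path).convert("L").resize((32, 32), Image.LANCZOS)
    a = np.asarray(img, dtype=np.float64)
    # 2-D DCT via two 1-D passes; the top-left 8x8 block holds the
    # lowest spatial frequencies, i.e. the overall structure.
    low = dct(dct(a, axis=0, norm="ortho"), axis=1, norm="ortho")[:8, :8]
    # One bit per coefficient, relative to the median of the block.
    med = np.median(low)
    fingerprint = 0
    for c in low.flatten():
        fingerprint = (fingerprint << 1) | (1 if c > med else 0)
    return fingerprint
```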

Article source: Lu Songsong's blog
