Recognizing similar images with Python

Source: Internet
Author: User
This article summarizes how to recognize similar images with Python. The code uses the PIL module to compare the similarity of two images. It is short and practical, and it gives good results for such a small amount of code.

Introduction

After reading a number of online articles about image recognition in Python, I am genuinely impressed by how powerful Python is. I am summarizing those articles here to build up my own understanding.
Of course, image recognition is a branch of computer science that cannot be explained in a few words. This article is only a popular-science introduction to a few basic algorithms.

If you find any mistakes, please forgive me and point them out.

The referenced articles and image sources are listed one by one at the bottom.

The code used in this article will be provided at the github address below.

Install related libraries

The main Python libraries for image processing are OpenCV (written in C++, with Python bindings) and PIL. However, PIL has not been updated for a long time and does not support Python 3.x, so the PIL-based fork Pillow is recommended instead. The experiments in this article were run with Python 3.4 and Pillow.

Official website of openCV

As for OpenCV, it is used for face recognition, which this article does not cover. Later articles in this column will discuss face recognition with OpenCV and a Python image crawler. If you are interested, you can follow this column.


How do we, as humans, intuitively decide whether two images are similar? First we sort the images by type, for example landscape photos or portraits. For a landscape, is it a desert or an ocean? For a portrait, does the person have a square face or an oval face? (Haha.)

From the machine's point of view, this is also the case. First, the image features are recognized and then compared.

Obviously, without training (that is, without building a model), it is difficult for a computer to tell what is an ocean and what is a desert. However, a computer can easily read the pixel values of an image.

Therefore, color features are most commonly used in image recognition. (Other common features include texture features, shape features, and spatial relationship features)

Color features are further divided into:

  1. Color histogram
  2. Color set
  3. Color moments
  4. Color coherence vector
  5. Color correlogram

Histogram Calculation

Here we will briefly describe it with a histogram.

First, I will borrow a pair of pictures whose source is credited with the other references.

The two images are similar to each other.

In Python, you can obtain the histogram data with the histogram() method of the Image object, but the result is returned as a list. To visualize the data you need matplotlib. Since this article focuses on the ideas behind the algorithms, matplotlib is not covered here.
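To make the shape of that list concrete, here is a minimal sketch. It assumes Pillow is installed and uses a small synthetic image (not one of the article's pictures) so that it runs without any files.

```python
from PIL import Image

# A synthetic 4x4 pure-red image, used only for illustration.
img = Image.new("RGB", (4, 4), (255, 0, 0))

# For an RGB image, histogram() returns one flat list:
# 256 counts for R, then 256 for G, then 256 for B.
hist = img.histogram()

print(len(hist))   # 768
print(hist[255])   # 16: all 16 pixels have R = 255
print(hist[256])   # 16: all 16 pixels have G = 0
```

Each entry is a pixel count, so the three 256-entry segments each sum to the number of pixels in the image.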

Clearly, the histograms of the two images nearly coincide. Therefore, a histogram can be used to judge whether two images are similar, by computing how closely the two histograms coincide.

The calculation method is as follows:

Here g_i and s_i are the values at the i-th point of the two histogram curves.

The final calculation result is the degree of similarity.
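A minimal sketch of one way to compute the degree of coincidence is given here. The formula is not shown in the text, so this sketch assumes a commonly quoted form, Sim(G, S) = (1/N) * sum over i of (1 - |g_i - s_i| / max(g_i, s_i)), where a term counts as 1 when both values are equal (including both zero). The function name and the default size are illustrative choices, not taken from the article.

```python
from PIL import Image

def histogram_similarity(img_a, img_b, size=(64, 64)):
    """Return a similarity in [0, 1] between the color histograms of two images."""
    # Bring both images to the same size and mode so the counts are comparable.
    hist_a = img_a.resize(size).convert("RGB").histogram()
    hist_b = img_b.resize(size).convert("RGB").histogram()

    total = 0.0
    for g, s in zip(hist_a, hist_b):
        if g == s:
            total += 1.0  # identical bins (including two empty bins) match fully
        else:
            total += 1.0 - abs(g - s) / max(g, s)
    return total / len(hist_a)

red = Image.new("RGB", (64, 64), (255, 0, 0))
blue = Image.new("RGB", (64, 64), (0, 0, 255))
print(histogram_similarity(red, red))    # 1.0
print(histogram_similarity(red, blue))   # below 1.0
```

Note that bins that are empty in both images count as matching under this assumed formula, so two flat-colored images can still score high even when their colors differ.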

However, this method has an obvious weakness: it looks only at the global distribution of colors and cannot describe the local distribution of colors or where each color appears.

For example, if one image is mainly blue because it shows a blue sky, and another is mainly blue because it shows a girl in a blue dress, this algorithm may still consider the two images similar.

One way to alleviate this weakness is to use the crop method of the Image object to divide each image into equal blocks, and then compute the similarity of each pair of blocks separately.
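The block-wise idea can be sketched as follows. This is only an illustration under assumptions: the 4x4 grid, the 64x64 working size, averaging the block scores, and all function names are choices made here, and the per-block score reuses an assumed histogram-overlap formula rather than one confirmed by the article.

```python
from PIL import Image

def hist_similarity(img_a, img_b):
    # Assumed overlap measure: mean of 1 - |g - s| / max(g, s) over all bins.
    ha, hb = img_a.histogram(), img_b.histogram()
    terms = (1.0 if g == s else 1.0 - abs(g - s) / max(g, s)
             for g, s in zip(ha, hb))
    return sum(terms) / len(ha)

def split_image(img, rows=4, cols=4):
    # crop() takes a (left, upper, right, lower) box.
    w, h = img.size
    bw, bh = w // cols, h // rows
    return [img.crop((c * bw, r * bh, (c + 1) * bw, (r + 1) * bh))
            for r in range(rows) for c in range(cols)]

def blockwise_similarity(img_a, img_b, size=(64, 64), rows=4, cols=4):
    a = img_a.resize(size).convert("RGB")
    b = img_b.resize(size).convert("RGB")
    scores = [hist_similarity(x, y)
              for x, y in zip(split_image(a, rows, cols),
                              split_image(b, rows, cols))]
    return sum(scores) / len(scores)

# Two synthetic images with the same colors in swapped positions.
left_red = Image.new("RGB", (64, 64), (255, 0, 0))
left_red.paste((0, 0, 255), (32, 0, 64, 64))
left_blue = Image.new("RGB", (64, 64), (0, 0, 255))
left_blue.paste((255, 0, 0), (32, 0, 64, 64))

print(hist_similarity(left_red, left_blue))       # 1.0, global histograms are identical
print(blockwise_similarity(left_red, left_blue))  # below 1.0, the blocks differ
```

The two test images contain exactly the same amount of red and blue, so the global histogram cannot tell them apart, while the block-wise comparison can.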

Image fingerprints and Hamming distance

Before introducing the remaining methods for judging similarity, we need a few concepts. The first is the image fingerprint.

Like a human fingerprint, an image fingerprint is an identifying mark. Put simply, an image fingerprint is a string of binary digits computed from the image with some hash algorithm.

Here, we can introduce the concept of Hamming distance.

If one group of binary data is 101 and another is 111, changing the second digit of the first group from 0 to 1 turns it into the second group. The Hamming distance between the two groups is therefore 1.

Put simply, the Hamming distance is the number of bits that must be changed to turn one string of binary data into another of the same length. This value measures the difference between two images: the smaller the Hamming distance, the higher the similarity. A Hamming distance of 0 means the two fingerprints are identical, so the two images are treated as the same.

The following three hash algorithms each produce a fingerprint; the fingerprints are then compared using the Hamming distance.
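As a small sketch of the definition, the function here counts the positions at which two fingerprints differ. Representing a fingerprint as a string of "0" and "1" characters is an assumption made for readability; the function name is illustrative.

```python
def hamming_distance(fp_a, fp_b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return sum(a != b for a, b in zip(fp_a, fp_b))

print(hamming_distance("101", "111"))  # 1
print(hamming_distance("101", "101"))  # 0
```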

Average Hash (aHash)

This algorithm compares each pixel of the grayscale image with the average value of all pixels.

General steps

1. Scale the image. You can use the resize(size) method of the Image object to change the image size. Typically the image is shrunk to 8x8, 64 pixels in total.
2. Convert to grayscale. Common grayscale conversion algorithms:
   1. Floating-point method: Gray = R*0.3 + G*0.59 + B*0.11
   2. Integer method: Gray = (R*30 + G*59 + B*11) / 100
   3. Shift method: Gray = (R*76 + G*151 + B*28) >> 8
   4. Average method: Gray = (R + G + B) / 3
   5. Green only: Gray = G

   In Python, the convert('L') method of the Image object converts the image directly to grayscale.

3. Calculate the average value: compute the average of all pixels in the grayscale image.
4. Compare the pixel gray values: traverse each pixel of the grayscale image. If its value is greater than the average, record 1; otherwise, record 0.
5. Obtain the fingerprint: combine the 64 bits. The order does not matter as long as the same order is used for every image.

Finally, compare the fingerprints of the two images to obtain the Hamming distance.
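The steps above can be sketched in a few lines with Pillow. This is a minimal illustration, not the article's own code: the function name, the row-by-row bit order, and the string representation of the fingerprint are assumptions.

```python
from PIL import Image

def average_hash(img, hash_size=8):
    """Return a 64-character bit string for the default 8x8 size."""
    # Steps 1 and 2: shrink, then convert to grayscale (mode "L").
    small = img.resize((hash_size, hash_size)).convert("L")
    # For mode "L", tobytes() yields one byte (0-255) per pixel, row by row.
    pixels = list(small.tobytes())
    # Step 3: average gray value.
    avg = sum(pixels) / len(pixels)
    # Steps 4 and 5: one bit per pixel, always in the same order.
    return "".join("1" if p > avg else "0" for p in pixels)

# Synthetic 8x8 test image: left half black, right half white.
demo = Image.new("L", (8, 8), 0)
demo.paste(255, (4, 0, 8, 8))
print(average_hash(demo))  # "00001111" repeated for each of the 8 rows
```

Two fingerprints produced this way can then be compared bit by bit to obtain the Hamming distance.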

Perceptual hash algorithm (pHash)

The average hash algorithm is too strict and not accurate enough, so it is better suited to searching for thumbnails. To obtain more accurate results, you can use the perceptual hash algorithm, which uses the DCT (discrete cosine transform) to keep only the low-frequency components of the image.

General steps:

  1. Reduce the image size: 32x32 is a good size, which keeps the DCT computation manageable.
  2. Convert to grayscale: convert the scaled image to a 256-level grayscale image. (For specific algorithms, see the average hash algorithm steps.)
  3. Compute the DCT: the DCT separates the image into a set of frequency components.
  4. Reduce the DCT: the DCT result is 32x32, but only the 8x8 block in the upper-left corner is kept. These values represent the lowest frequencies of the image.
  5. Calculate the average value: compute the average of all values in the reduced 8x8 DCT block.
  6. Further reduce the DCT: for each value in the 8x8 block, record 1 if it is greater than the average; otherwise, record 0.
  7. Obtain the fingerprint: combine the 64 bits. The order does not matter as long as the same order is used for every image.

Finally, compare the fingerprints of the two images to obtain the Hamming distance.
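The steps above can be sketched as follows. To stay self-contained, this sketch computes an orthonormal 2D DCT-II directly in pure Python and only for the 8x8 low-frequency corner. That choice, the function names, and the row-by-row bit order are assumptions made here; real code would more likely call a library DCT routine.

```python
import math
from PIL import Image

def dct_low_freq(matrix, keep=8):
    """Return the top-left keep x keep block of the orthonormal 2D DCT-II."""
    n = len(matrix)
    # cos_table[u][x] = cos((2x + 1) * u * pi / (2n))
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n)] for u in range(keep)]
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (keep - 1)
    block = []
    for u in range(keep):
        row = []
        for v in range(keep):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += matrix[x][y] * cos_table[u][x] * cos_table[v][y]
            row.append(scale[u] * scale[v] * total)
        block.append(row)
    return block

def perceptual_hash(img, size=32, keep=8):
    # Steps 1 and 2: shrink to 32x32 and convert to grayscale.
    small = img.resize((size, size)).convert("L")
    data = list(small.tobytes())
    matrix = [data[r * size:(r + 1) * size] for r in range(size)]
    # Steps 3 and 4: DCT, keeping only the 8x8 low-frequency corner.
    values = [v for row in dct_low_freq(matrix, keep) for v in row]
    # Steps 5 to 7: threshold against the average and join the bits.
    avg = sum(values) / len(values)
    return "".join("1" if v > avg else "0" for v in values)

flat_gray = Image.new("L", (32, 32), 128)
print(perceptual_hash(flat_gray))  # "1" followed by 63 zeros
```

For a flat gray image only the first (constant) component is non-zero, which is why the fingerprint in the demo has a single 1 bit.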

For an introduction to the DCT and how it is computed, see the referenced article on the discrete cosine transform.

Difference hash algorithm (dHash)

Compared with pHash, dHash is much faster. Compared with aHash, dHash gives better results at almost the same speed. It is based on gradients, that is, on differences between neighboring pixels.

General steps:

  1. Reduce the image size: shrink the image to 9x8, 72 pixels in total.
  2. Convert to grayscale: convert the scaled image to a 256-level grayscale image. (For specific algorithms, see the average hash algorithm steps.)
  3. Calculate the differences: dHash operates on neighboring pixels, so the nine pixels in each row yield eight differences. The eight rows yield 64 difference values in total.
  4. Obtain the fingerprint: if the left pixel is brighter than the right pixel, record 1; otherwise, record 0.

Finally, compare the fingerprints of the two images to obtain the Hamming distance.
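The steps above can be sketched with Pillow as follows. As with the other sketches, the function name, the row-by-row bit order, and the string fingerprint are assumptions made for illustration.

```python
from PIL import Image

def difference_hash(img, hash_size=8):
    """Return a 64-character bit string built from horizontal gradients."""
    width = hash_size + 1
    # Steps 1 and 2: shrink to 9x8 (width x height) and convert to grayscale.
    small = img.resize((width, hash_size)).convert("L")
    data = list(small.tobytes())  # one byte per pixel, row by row
    bits = []
    for r in range(hash_size):
        row = data[r * width:(r + 1) * width]
        # Steps 3 and 4: compare each pixel with its right-hand neighbor.
        for c in range(hash_size):
            bits.append("1" if row[c] > row[c + 1] else "0")
    return "".join(bits)

# Synthetic 9x8 image that gets darker from left to right.
fading = Image.new("L", (9, 8), 0)
for x in range(9):
    for y in range(8):
        fading.putpixel((x, y), 255 - x * 20)
print(difference_hash(fading))  # 64 ones: every left pixel is brighter
```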


These algorithms are the basis for recognizing similar images. Obviously, the similarity of the people in two images is sometimes more important than the overall color similarity. In that case we need to detect the faces first and then hash only the face region, or apply some other preprocessing before hashing. This article does not cover those topics.

The next article will introduce how to use OpenCV and trained models for face recognition.

The implementation of the algorithms in this article is available in the GitHub repository linked here:

Github Repository
