Content-based Image retrieval technology

Source: Internet
Author: User

Image retrieval is, in essence, a combination of image feature extraction and feature-based matching. Image features fall into two groups: text features and visual features. Text features are the textual information associated with an image, such as its name and its annotation text. The more mature image retrieval systems on the web at present, such as Google and Baidu, belong to this category. Visual features are the visual information carried by the image itself, and they divide further into general visual features and domain-specific features. Color, texture, and shape are general features of any image, while spectral characteristics, for example, are specific to remote sensing images in the geosciences.

In terms of historical development, image retrieval systems fall into two categories: text-based image retrieval (TBIR), which searches on the text features of images, and content-based image retrieval (CBIR), which searches on their visual features.

Traditional TBIR was used in early image retrieval, and the research was carried out mainly in the database field: images are first annotated manually, and a text-based database management system is then used to retrieve them. This approach is easy to deploy widely, but it depends on human annotation. When the number of images grows sharply, manual annotation requires too much work. Different people also interpret the same image from different angles, and this subjectivity in the annotations leads to low recall.

Content-based retrieval has therefore become a research hotspot. It retrieves images with similar characteristics from the database directly, based on the physical features contained in the image itself.

Compared with traditional text-based retrieval, content-based image retrieval has the following characteristics:

(1) It breaks through the limitations of keyword search over text features and extracts feature cues directly from the media content.

(2) It supports a variety of search methods, including browsing, query by example, and query by sketch.

(3) It supports interactive retrieval. Through human-computer interaction, a content-based image retrieval system typically uses parameter adjustment, cluster analysis, probabilistic learning, or neural network methods to capture and establish the correlation between low-level image features and high-level semantics. This is known as relevance feedback.

(4) It searches by similarity matching. A matching algorithm compares the features of the input image with the feature metadata in the feature library, and the initial results that satisfy a given similarity threshold are ranked by similarity and returned to the user.
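The relevance feedback described in item (3) can be illustrated with a small sketch. The source names parameter adjustment as one option but gives no formula, so the code below swaps in Rocchio's update rule, a classic from text retrieval, purely as an assumed stand-in. The function names and the default weights `alpha`, `beta`, and `gamma` are illustrative choices, not values from the source.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]], dim: int) -> Vector:
    """Component-wise mean of the vectors; a zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(
    query: Sequence[float],
    relevant: Sequence[Sequence[float]],
    non_relevant: Sequence[Sequence[float]],
    alpha: float = 1.0,   # weight of the original query (assumed default)
    beta: float = 0.75,   # pull toward images the user marked relevant
    gamma: float = 0.25,  # push away from images marked non-relevant
) -> Vector:
    """Move the query feature vector toward relevant examples and away
    from non-relevant ones (Rocchio-style update)."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]
```

After each round of user feedback, the updated vector would replace the query vector in the next similarity search, so the ranking gradually reflects what the user means rather than only what the first example image looked like.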

Although CBIR is a great advance over TBIR, machine understanding of image features cannot replace text descriptions in some scenarios. For example, an image may carry a moral or depict an event that cannot be recovered from its visual content alone. If image annotations can be incorporated into content-based retrieval, retrieval precision should improve considerably.

Content-based image retrieval

The content of an image includes physical features, such as its visual information, as well as the high-level semantic features that the visual features convey. Physical features are low-level visual information, mainly color, texture, and shape. Semantic information is high-level visual information, covering objects, spatial relations, scenes, behavior, emotion, and similar image content.

Content-based image retrieval generally serves one of three query types: (1) exact query, which finds identical copies; (2) range query, which finds the images whose features lie within a given distance of the input image's features; and (3) k-nearest-neighbor query, which ranks the candidate images by their similarity to the input image and returns the k most similar.
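The three query types can be sketched over an in-memory index as follows. This is a minimal illustration under stated assumptions: the index is a plain dictionary from image name to feature vector, similarity is measured by Euclidean distance, and the function names and the tolerance `tol` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Index = Dict[str, Sequence[float]]  # image name -> feature vector


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_query(index: Index, f: Sequence[float], tol: float = 1e-9) -> List[str]:
    """Names of images whose features match f up to a tiny tolerance."""
    return sorted(name for name, g in index.items() if euclidean(f, g) <= tol)


def range_query(index: Index, f: Sequence[float], radius: float) -> List[str]:
    """Names of images whose features lie within `radius` of f."""
    return sorted(name for name, g in index.items() if euclidean(f, g) <= radius)


def knn_query(index: Index, f: Sequence[float], k: int) -> List[Tuple[str, float]]:
    """The k images closest to f, most similar first, with their distances."""
    ranked = sorted(
        ((name, euclidean(f, g)) for name, g in index.items()),
        key=lambda pair: (pair[1], pair[0]),  # ties broken by name
    )
    return ranked[:k]
```

All three functions scan the whole index, which is only reasonable for small collections. A real system would answer the same queries through an index structure rather than a linear scan.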

The basic principle of CBIR

The basic principle of CBIR can be defined formally as follows. Given a query image P, compute its feature vector f = (f1, f2, f3, ..., fn), where fi is the i-th feature of the image. Then search the image feature index library for the feature vector f' with the smallest distance to f; the image P' corresponding to f' is the search result most similar to P. The typical architecture of a CBIR system is shown below.

The system consists of three main parts: the user interface, the retrieval subsystem, and the storage subsystem. Image feature indexing and similarity matching are the core techniques, and they directly affect the recall and precision of retrieval. Both the retrieval and the storage subsystems must compute feature vectors from original images; the difference is that the index library is generated offline, while query processing must be computed online in real time. The matching computation determines both which results are returned and the order in which they appear, so it is of great importance.
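The split between offline indexing and online querying can be sketched as two functions that share one feature extractor. The feature used here (mean and standard deviation of gray levels) is a deliberately simple placeholder chosen for illustration, and the names `extract_features`, `build_index`, and `search` are hypothetical rather than taken from any particular system.

```python
import math
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of gray levels in the range 0..255


def extract_features(image: Image) -> List[float]:
    """Placeholder feature vector: [mean gray level, standard deviation]."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, math.sqrt(variance)]


def build_index(images: Dict[str, Image]) -> Dict[str, List[float]]:
    """Offline step: compute and store a feature vector for every image."""
    return {name: extract_features(img) for name, img in images.items()}


def search(
    index: Dict[str, List[float]], query: Image, k: int = 3
) -> List[Tuple[str, float]]:
    """Online step: extract features from the query image only, then rank
    the stored vectors by Euclidean distance (smallest first)."""
    f = extract_features(query)
    scored = [(name, math.dist(f, g)) for name, g in index.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]
```

The point of the design is that the expensive pass over the whole collection happens once in `build_index`, while `search` extracts features for a single image and compares vectors, which is what makes real-time response feasible.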

Index representations of image content

1 Low-level image features

Color Features

Color is the most prominent feature of an image. It is stable and largely independent of rotation, translation, and scale changes, and color features are simple to compute, which makes them robust. Color-based retrieval has therefore become the most basic method in existing image retrieval systems.

The similarity matching algorithm for a color feature index varies with the content of the index and the algorithm that produced it. Common choices include histogram intersection, the Manhattan (absolute, L1) distance, the quadratic distance, and the Euclidean (L2) distance.
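Three of these measures can be shown on a quantized RGB histogram. The sketch below assumes an image is given as a flat list of `(r, g, b)` tuples with 8-bit channels and uses 4 bins per channel; both the representation and the bin count are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with `bins` levels per channel."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def l1_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Manhattan (absolute) distance; 0 means identical."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def l2_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Euclidean distance; 0 means identical."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))
```

Note the direction of each measure: histogram intersection is a similarity (larger is better), while the L1 and L2 values are distances (smaller is better), so a ranking routine has to sort them in opposite orders.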

Texture Features

Texture refers to regular variation in the gray level or color of a set of image pixels; it can be viewed as the pattern formed by gray levels (or colors) arranged in space in some form. The gray-level distribution of a typical textured image shows some periodicity and certain statistical properties, and it is usually closely related to the high-frequency components of the image spectrum. Six basic texture features are commonly used: coarseness, contrast, directionality, line-likeness, regularity, and roughness. The most important of these are coarseness, contrast, and directionality.
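As one concrete instance of these features, contrast is commonly computed in the Tamura formulation as the standard deviation of the gray levels divided by the fourth root of their kurtosis. The sketch below follows that definition for a gray-level image given as a list of rows; returning 0.0 for a perfectly flat image is an assumption made here to avoid division by zero.

```python
import math
from typing import List

Image = List[List[int]]  # rows of gray levels


def tamura_contrast(image: Image) -> float:
    """Tamura contrast: sigma / kurtosis ** 0.25, where
    kurtosis = (fourth central moment) / sigma ** 4."""
    pixels = [float(p) for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    if variance == 0.0:
        return 0.0  # flat image: no contrast (assumed convention)
    mu4 = sum((p - mean) ** 4 for p in pixels) / n
    kurtosis = mu4 / (variance ** 2)
    return math.sqrt(variance) / (kurtosis ** 0.25)
```

The kurtosis term lowers the score when the gray levels are concentrated around the mean with a few outliers, so an image split evenly between two distant gray levels scores higher than one with the same range but mostly mid-gray pixels.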

Shape Features

Shape is a salient feature of an image. A shape is usually considered to be a region enclosed by a closed contour, so describing a shape involves describing both the contour boundary and the region that the boundary encloses. The resulting description is an approximate representation of the boundary of the image region.
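For the region side of the description, image moments are a common choice. The sketch below is an illustration under assumptions, not the method of any system discussed here: it takes a binary mask (1 inside the shape, 0 outside) and returns the area, the centroid, and the first Hu moment invariant, which does not change under translation and changes only slightly under scaling on a pixel grid.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = inside the shape, 0 = background


def region_descriptor(mask: Mask) -> Tuple[float, Tuple[float, float], float]:
    """Return (area, (centroid_x, centroid_y), phi1) for a binary mask.

    phi1 = eta20 + eta02, where eta_pq = mu_pq / mu00 ** 2 for p + q = 2
    and mu_pq are the central moments of the region.
    """
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = float(len(points))
    if area == 0.0:
        raise ValueError("mask contains no shape pixels")
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    phi1 = (mu20 + mu02) / (area ** 2)
    return area, (cx, cy), phi1
```

Two masks containing the same shape at different positions yield the same `phi1`, which is what makes such a value usable as an index entry for shape matching.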

Multi-feature combined retrieval

Retrieval based on color, texture, or shape each has strengths and weaknesses, because each feature reflects the image from a different angle. To describe image content more completely and improve retrieval accuracy, several feature types are often combined so that they complement each other, for example color with texture, color with shape, texture with shape, or color with spatial relationships.
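One simple way to combine features is a weighted average of per-feature distances. The sketch below assumes each distance has already been normalized to the range 0 to 1 so that the features are comparable; the function name and the weights are hypothetical, and the source does not prescribe a particular combination rule.

```python
from typing import Dict


def combined_distance(distances: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-feature distances.

    `distances` maps a feature name (e.g. "color") to a distance that is
    assumed to be normalized to [0, 1]; `weights` maps the same names to
    non-negative importance weights.
    """
    total_weight = sum(weights[name] for name in distances)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * d for name, d in distances.items())
    return weighted / total_weight
```

For example, with a color distance of 0.2 weighted 3 and a texture distance of 0.8 weighted 1, the combined distance is (3 × 0.2 + 1 × 0.8) / 4 = 0.35, so the image counts as fairly close because the heavily weighted color feature matches well.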

2 High-level semantic features

Retrieval based on color, texture, or shape, and multi-feature retrieval that combines them, all represent an image through its low-level visual content. Color-based retrieval treats the image, from the computer's point of view, as a set of discrete, isolated pixels; it can capture only the overall color consistency of the image and cannot distinguish its internal structure. Texture-based retrieval builds on color by considering the relationships between neighboring pixels, using measures such as regularity, coarseness, and directionality to characterize the linear features of the image. Shape-based retrieval divides the image into closed regions and screens out details such as the background, which comes closer to how people perceive an image.

A real image is an indirect expression of how people perceive the world, and it is rich in semantic information beyond color, texture, and shape. An image contains objects, and those objects stand in spatial relationships to each other. A single image or a series of images can represent a specific scene or action, and some images also carry the author's emotions and an underlying message.

Object categories and spatial relationships

Using the spatial relationships between objects in an image has long been an important research direction in image database retrieval. Tanimoto represented the entities in an image by means of primitives and proposed indexing images by their objects. Chang then adopted this idea and proposed the two-dimensional string (2D-string) representation for retrieving images by spatial relationship. The method is simple, and for some images the symbolic picture can be reconstructed from its 2D-string, so it has been adopted and improved by many researchers. Jungert described the spatial relationships between objects using the minimum bounding boxes of the image objects, according to how their projection intervals on the x-axis and the y-axis overlap. Lee, Hsu, and others proposed the 2D C-string method. Nabil combined the 2D-string method with the topological relations between the point sets of objects in the two-dimensional plane and proposed the 2D-PIR retrieval method.

Raising the understanding of image content to the level of objects and their spatial relationships makes up for the lack of spatial constraints in the methods described earlier. The structure of spatial-relation semantic extraction is shown in the following diagram:
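The idea behind the 2D-string can be illustrated with a simplified sketch. It assumes each object has already been recognized and reduced to a label with integer centroid coordinates, and it encodes only two relations per axis: `<` for "comes before" and `=` for "same position". The full notation and its later extensions are richer than this, so the code should be read as an approximation for illustration.

```python
from typing import List, Tuple

Obj = Tuple[str, int, int]  # (label, x, y) of an object's centroid


def _project(objects: List[Obj], axis: int) -> str:
    """Order the labels along one axis, joining neighbors with '<' when
    the coordinate increases and '=' when it stays the same."""
    ordered = sorted(objects, key=lambda o: (o[axis], o[0]))
    parts = [ordered[0][0]]
    for prev, cur in zip(ordered, ordered[1:]):
        parts.append("<" if cur[axis] > prev[axis] else "=")
        parts.append(cur[0])
    return " ".join(parts)


def two_d_string(objects: List[Obj]) -> Tuple[str, str]:
    """Return the (u, v) pair: the projection along x and along y."""
    if not objects:
        return ("", "")
    return _project(objects, 1), _project(objects, 2)


scene = [("house", 2, 1), ("tree", 5, 1), ("sun", 5, 4)]
u, v = two_d_string(scene)
# u == "house < sun = tree"   (left-to-right order)
# v == "house = tree < sun"   (bottom-to-top order, if y grows upward)
```

Because both strings are ordinary sequences of symbols, comparing the spatial layout of two images reduces to a string matching problem, which is the property that made the representation attractive for retrieval.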

Affective semantics

Compared with other features, the emotional semantics expressed by an image are more subjective, because they involve human cognitive models, cultural background, and aesthetic standards.

At present, the emotional semantics of images have been studied to some extent only in the specific domain of art images.

In addition to color, the density of a texture, the slant of its lines, and its smoothness also convey distinct emotional meanings: a smooth texture feels delicate, a rough texture feels aged, and a hard texture feels strong. Shapes matter as well: a square tends to feel solemn, the sharp angles of a triangle tend to feel aggressive and enterprising, and a circle tends to give a relaxed sense of movement.

The framework of an emotion-based image retrieval system is shown in the following diagram:

Emotion-based image retrieval aims to match the user's retrieval request to images through the subjective experience that an image may evoke (that is, its affective semantic features or perceptual features). The retrieval process includes receiving and transforming the perceptual query, retrieval matching, returning the results, and relevance feedback.

Examples of domestic and international systems



QBIC

IBM's QBIC (Query By Image Content) is the first commercial CBIR system. It provides image indexing based on color, texture, shape, and freehand sketches. Color is represented by two methods, mean color and color histogram, and texture is represented by a combination of coarseness, contrast, and directionality. The content-based retrieval technology of QBIC has been applied in the IBM Digital Library, where it supports automatic indexing, merging, comparison, feature extraction, and translation.

VisualSEEk & WebSEEk

Columbia University's VisualSEEk provides indexing based on color and texture. In VisualSEEk, the color distribution of the whole image is represented by a global color histogram, and regional color is indexed using binary color sets. Texture is represented using a wavelet-transform-based method. To speed up retrieval, an index algorithm based on binary trees was also developed. The system has a Java browser that runs on SGI, Sun, and IBM PC platforms.



Photobook

Photobook is an interactive tool for searching and browsing images developed by the MIT Media Lab. It consists of three subsystems, which extract shape, texture, and face features respectively, and each supports retrieval based on its own feature. Because no single feature models images well, in FourEyes, the latest version of Photobook, Picard and others also bring the user into the image annotation and retrieval process. Experimental results show that this approach is effective for image annotation.


MARS

The MARS (Multimedia Analysis and Retrieval System) system was developed at the University of Illinois at Urbana-Champaign (UIUC) in the United States. It differs from other systems in drawing on several fields of knowledge: computer vision, database management systems, and information retrieval. The focus of MARS is not on finding a single best feature representation but on how to organize different visual features into a meaningful retrieval system that adapts dynamically to different users and applications. MARS formally proposed relevance feedback for image retrieval and integrates it into different levels of the retrieval process.

Challenges and problems facing CBIR

Comprehensive Search Methods

Images have many kinds of features, and some features cannot be expressed by a single quantized value and must be represented as multi-dimensional vectors. In combined multi-feature retrieval, the dimensionality of the feature vector can therefore reach the order of 10^2, far beyond what conventional database indexes can handle. New index structures and algorithms are needed to support queries over multiple features, heterogeneous features, feature weights, and primary-key features efficiently.

Computer vision and Pattern recognition technology

In the high-level semantic retrieval described above, which is based on shapes, objects, and their spatial relationships, recognizing each object in the image is the foundation of retrieval. This involves computer vision and artificial intelligence techniques such as pattern recognition, image processing, and image understanding. Because these techniques are still immature, retrieval at this level is stuck and cannot progress further.

General retrieval methods in the Web environment

On the network, automatically acquiring image files is not fundamentally different from crawling ordinary HTML documents. The difference is that the images found there vary widely in format, size, type, and subject area, which complicates the indexing process in a CBIR system. From the perspective of user experience, users on the network are also more demanding about response time. Finding an efficient, general retrieval method and a retrieval process that fits users' interaction habits is a problem that CBIR systems must solve in the Web environment.

Lack of objective evaluation criteria

At present, content-based retrieval results are evaluated using recall and precision, borrowed from traditional information retrieval. However, the means of retrieval available to users are very limited, and the subjectivity of how people perceive image content makes it hard to define an objective criterion, and therefore hard to define a good evaluation method. Evaluating retrieval efficiency also remains a problem for future research.
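For reference, the two measures can be computed as follows. Precision is the fraction of returned images that are relevant, and recall is the fraction of relevant images that were returned. The sketch assumes that the set of relevant images for a query is known in advance, which is exactly the part that the subjectivity discussed above makes difficult in practice; the function name is illustrative.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    An empty denominator yields 0.0 (assumed convention).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

If a query returns four images of which two are relevant, and three relevant images exist in the collection, precision is 2/4 and recall is 2/3.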


Content-based image retrieval is an interdisciplinary subject. Its research draws on computer graphics, image processing, image understanding, pattern recognition, artificial intelligence, neural networks, and database technology, as well as art, cognitive science, and psychology. Retrieval based on low-level visual features has achieved some results, but retrieval based on semantic content is still at the experimental stage. In the network environment, online image retrieval is both highly significant and highly challenging.

When reprinting, please cite the source:

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
