Bag of Features (BOF) Image retrieval algorithm


1. Use the SURF algorithm to extract the feature points and their descriptors for each image in the image library.


2. Run the K-means algorithm on the feature points of the whole image library to generate the cluster centers (the visual words).


3. Generate the BOF for each image. Concretely: for every feature point of an image, find the nearest cluster center; accumulating these assignments yields a frequency histogram, which is the initial, unweighted BOF.
4. Weight the frequency histogram with TF-IDF to produce the final BOF, because each cluster center contributes differently to describing an image. (For example, if the first digit of a supermarket bar code is always 6, it is useless for distinguishing products, so its weight should be reduced.)
5. Apply steps 3-4 to the incoming query image to generate the query's BOF vector.
6. Compute the angle between the query's BOF vector and the BOF vector of each image in the library; the match is the image with the smallest angle.
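Below is a minimal sketch of the six steps above. It assumes opencv-contrib-python (SURF lives in the non-free xfeatures2d module; cv2.SIFT_create() can be substituted if SURF is unavailable), scikit-learn, and NumPy; the image paths and the vocabulary size k are placeholders, not values from the original text.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

# --- Step 1: SURF keypoints + descriptors for every library image ---
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # or cv2.SIFT_create()

library_paths = ["img_001.jpg", "img_002.jpg"]            # placeholder image list
library_descriptors = []
for path in library_paths:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = surf.detectAndCompute(gray, None)
    library_descriptors.append(desc)

# --- Step 2: k-means over all descriptors -> cluster centers (visual words) ---
k = 200                                                   # vocabulary size (arbitrary here)
all_desc = np.vstack(library_descriptors)
kmeans = KMeans(n_clusters=k, n_init=4, random_state=0).fit(all_desc)

# --- Step 3: raw BOF = histogram of nearest-center assignments per image ---
def bof_histogram(desc, kmeans, k):
    words = kmeans.predict(desc)                          # nearest cluster center per feature
    hist, _ = np.histogram(words, bins=np.arange(k + 1))
    return hist.astype(float)

raw_bofs = np.array([bof_histogram(d, kmeans, k) for d in library_descriptors])

# --- Step 4: TF-IDF weighting of the histograms ---
tf = raw_bofs / raw_bofs.sum(axis=1, keepdims=True)       # term frequency per image
df = (raw_bofs > 0).sum(axis=0)                           # images containing each word
idf = np.log(len(library_paths) / (df + 1e-9))            # inverse document frequency
weighted_bofs = tf * idf

# --- Steps 5-6: same treatment for the query, then rank by cosine similarity ---
query_gray = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder query image
_, qdesc = surf.detectAndCompute(query_gray, None)
q_hist = bof_histogram(qdesc, kmeans, k)
q_vec = (q_hist / q_hist.sum()) * idf

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

scores = [cosine_similarity(q_vec, v) for v in weighted_bofs]
best_match = library_paths[int(np.argmax(scores))]        # smallest angle = largest cosine
```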


LSH is applied in image retrieval for high-speed search: with a certain probabilistic guarantee, it overcomes the difficulty of querying high-dimensional features. However, when the author experimented with LSH combined with SIFT features for image retrieval, each image contributed hundreds of features, so querying one image requires a separate lookup for every one of its features; even if the query image's feature points are filtered down to 50%, the number of feature lookups needed for a single image query is still far from small. Is there, then, a way to represent all the feature vectors of an arbitrary image with a single vector of fixed dimension, a dimension that does not vary with the number of feature points in the picture? The method discussed in this article solves exactly that problem, although it was not originally devised for it.
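To make the per-feature cost concrete, here is a generic random-hyperplane LSH sketch over SIFT-like descriptors (this is an illustrative scheme, not necessarily the one used in the author's experiment; the table sizes and descriptor arrays are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_BITS = 128, 16                      # SIFT dimension, hash length (assumed values)
hyperplanes = rng.standard_normal((N_BITS, DIM))

def lsh_key(descriptor):
    """Random-hyperplane hash: one bit per hyperplane side."""
    bits = (hyperplanes @ descriptor) > 0
    return bits.tobytes()

# Index: every descriptor of every library image is hashed separately.
index = {}
library_features = {                        # placeholder: image_id -> (n, 128) descriptors
    "img_001": rng.random((300, DIM)),
    "img_002": rng.random((450, DIM)),
}
for image_id, descriptors in library_features.items():
    for d in descriptors:
        index.setdefault(lsh_key(d), []).append(image_id)

# Querying one image means one bucket lookup *per query descriptor* --
# the multiplicative cost the paragraph above complains about.
query_descriptors = rng.random((200, DIM))
votes = {}
for d in query_descriptors:
    for image_id in index.get(lsh_key(d), []):
        votes[image_id] = votes.get(image_id, 0) + 1
```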

The Bag-of-words model originates from text categorization. It ignores a text's word order, grammar, and syntax and regards the text only as a collection of words: each word's appearance is independent of whether any other word appears, as if the author had chosen every word at random, uninfluenced by the preceding sentences.
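A toy illustration of that order-independence assumption (not from the original text): two sentences with the same words in a different order produce the same bag.

```python
from collections import Counter

a = "the cat sat on the mat".split()
b = "on the mat the cat sat".split()

# Word order and grammar are discarded; only word counts remain.
print(Counter(a))                  # Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})
print(Counter(a) == Counter(b))    # True: both texts have the same bag of words
```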



An image can likewise be regarded as a document, with local regions of the image or their features playing the role of the words that make up the image: similar regions (or similar features) correspond to the same word. In this way, text retrieval and classification methods can be applied to image classification and retrieval.

Accelerating bag-of-features SIFT algorithm for 3D Model retrieval
The Bag-of-features model is modeled on the Bag-of-words method from text retrieval: each image is described as an unordered set of local patch/keypoint features. The local features are clustered with a clustering algorithm such as K-means, and each cluster center is regarded as a visual word in the dictionary, the equivalent of a word in text retrieval. A visual word is represented by the code word formed from its cluster center (this can be seen as a feature quantization step). All visual words together form the visual vocabulary, corresponding to a codebook, i.e. the collection of code words; the number of words in the dictionary reflects the dictionary size. Each feature in an image is mapped to a word in the visual vocabulary (this can be done by computing distances between the feature and the cluster centers), and the number of occurrences of each visual word is counted. The image can then be described as a histogram vector whose dimension equals the number of words, i.e. the bag-of-features.
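The quantization step described above, in isolation: each local feature is assigned to its nearest code word and the counts form the histogram. A minimal NumPy sketch, where `codebook` and `descriptors` are assumed to already exist (the toy shapes below are arbitrary):

```python
import numpy as np

def quantize_to_bof(descriptors, codebook):
    """Map each local feature to its nearest visual word and count occurrences.

    descriptors: (n_features, d) local features of one image
    codebook:    (n_words, d) cluster centers / visual vocabulary
    returns:     (n_words,) bag-of-features histogram
    """
    # Squared Euclidean distance between every feature and every code word.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)                       # nearest code word per feature
    hist = np.bincount(words, minlength=len(codebook))
    return hist.astype(float)

# Toy usage with random data, just to show the shapes involved.
rng = np.random.default_rng(0)
codebook = rng.random((50, 64))        # 50 visual words of dimension 64 (assumed sizes)
descriptors = rng.random((120, 64))    # 120 local features from one image
bof = quantize_to_bof(descriptors, codebook)   # shape (50,), sums to 120
```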


Bag of Features Codebook Generation by self-organisation
More often, bag-of-features is used for image classification or object recognition. In that setting, bag-of-features vectors are extracted from a training set and, under a supervised learning strategy (e.g. an SVM), used to train a classification model for the object or scene. For a test image, the local features are extracted, the distance to each code word in the dictionary is computed, and the nearest code word is chosen to represent each feature; a statistical histogram counting the number of features assigned to each code word is then built, which is the bag-of-features representation of the test image. Finally, the classification model predicts on this feature, classifying the test image.
Classification Process
1. Local feature extraction: obtain different patches of the image by segmentation, dense or random sampling, keypoint or stable-region detection, and other means, and compute a feature descriptor for each patch.



Among these, the SIFT feature is the most popular.

2. Build a visual dictionary:

The visual words represented by the cluster centers form the visual dictionary:

3. Generate the codebook representation, i.e. construct the bag-of-features feature; this is also known as the local-feature projection step:

4. Train an SVM classification model on the BOF features, then predict on the BOF features of the test image:
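A minimal sketch of this last step with scikit-learn's SVC. The training BOF vectors and labels below are random placeholders standing in for the features produced by steps 1-3:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import normalize

# Placeholder training data: one BOF vector per training image.
rng = np.random.default_rng(0)
train_bof = rng.random((100, 200))            # 100 images, 200-word vocabulary (assumed)
train_labels = rng.integers(0, 5, size=100)   # 5 scene/object classes (assumed)

# L2-normalizing the histograms before the SVM is common practice.
X = normalize(train_bof, norm="l2")
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, train_labels)

# At test time: project the test image's local features to a BOF vector
# (step 3 above), then predict its class.
test_bof = normalize(rng.random((1, 200)), norm="l2")
predicted_class = clf.predict(test_bof)
```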

Retrieval Process
Bag-of-words was first applied in computer vision by Andrew Zisserman [6] in order to solve video scene retrieval; the method represents the image information by projecting keypoints onto a bag of words.

Many later researchers refer to this approach as Bag-of-features and have applied it to image classification, object recognition, and image retrieval.

On the basis of the Bag-of-features method, Andrew Zisserman further borrowed the TF-IDF model (Term Frequency-Inverse Document Frequency) from text retrieval to weight the bag-of-features vectors.

The inverted-index technique from text search engines can then be used to index images and perform image retrieval efficiently.
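A toy version of such an inverted index (not the exact structure used in the paper): for each visual word, store which images contain it and with what weight, so a query only touches images that share at least one word with it.

```python
from collections import defaultdict

# inverted_index[word_id] -> list of (image_id, tf_idf_weight)
inverted_index = defaultdict(list)

def add_image(image_id, weighted_bof):
    """weighted_bof: array-like of TF-IDF weights, indexed by visual word id."""
    for word_id, weight in enumerate(weighted_bof):
        if weight > 0:
            inverted_index[word_id].append((image_id, weight))

def query(weighted_query_bof):
    """Accumulate dot-product scores, visiting only images that share a word.

    If the stored vectors and the query are L2-normalized beforehand,
    the accumulated score equals the cosine similarity.
    """
    scores = defaultdict(float)
    for word_id, q_weight in enumerate(weighted_query_bof):
        if q_weight == 0:
            continue
        for image_id, weight in inverted_index[word_id]:
            scores[image_id] += q_weight * weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```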



Hamming embedding and weak geometric consistency for large scale image search
The retrieval process is not fundamentally different from the classification process; the differences lie mostly in the details:
1. Local feature extraction;
2. Build a visual dictionary;
3. Generate the raw BOF feature;
4. Introduce TF-IDF weights:
TF-IDF is a weighting technique used in information retrieval and text retrieval to evaluate how important a word is to one document in a document database. A word's importance increases in proportion to how often it appears in the document, but decreases in inverse proportion to how often it appears across the document database. The idea of TF is that if a keyword appears with high frequency in an article, the word characterizes the content of that article; if the keyword is rare in other articles, it has good category-distinguishing ability and contributes strongly to classification. The idea of IDF is that the fewer documents in the database contain word A, the larger its IDF and the better its ability to distinguish categories.
Term frequency (TF) is the number of occurrences of a given word in the document, normalized by the document length. For example, TF = 0.03 (3/100) means that the word 'A' appears 3 times in a document of 100 words.
Inverse document frequency (IDF) measures the general importance of a term. If a word appears in many documents, it is not very discriminative between documents and is given a smaller weight, and vice versa. For example, IDF = 13.287 (log2(10,000,000/1,000)) means that 1,000 out of 10,000,000 documents contain the word 'A'.
The final TF-IDF weight is the product of the term frequency and the inverse document frequency.
5. Generate the same TF-IDF-weighted BOF feature for the query image;
6. Query: similarity is first measured with the cosine distance; as for the indexing method, I have not studied it yet and would welcome pointers from readers.
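A small sketch tying steps 4-6 together, reproducing the worked numbers above (3 occurrences in a 100-word document; 1,000 of 10,000,000 documents containing the word). The log base here is 2, which matches the 13.287 figure, though natural log or log10 are also common choices:

```python
import numpy as np

# Worked numbers from the text: TF and IDF for one word.
tf_example = 3 / 100                              # 0.03
idf_example = np.log2(10_000_000 / 1_000)         # ~13.287
tf_idf_example = tf_example * idf_example         # ~0.399

# Step 4: the same weighting applied to a matrix of raw BOF histograms
# (raw_bofs has shape n_images x n_words).
def tf_idf_weight(raw_bofs):
    tf = raw_bofs / raw_bofs.sum(axis=1, keepdims=True)
    df = (raw_bofs > 0).sum(axis=0)                       # images containing each word
    idf = np.log2(raw_bofs.shape[0] / np.maximum(df, 1))
    return tf * idf, idf

# Step 6: rank library images by cosine similarity to the query's weighted BOF.
def cosine_rank(query_vec, library_vecs):
    q = query_vec / (np.linalg.norm(query_vec) + 1e-9)
    lib = library_vecs / (np.linalg.norm(library_vecs, axis=1, keepdims=True) + 1e-9)
    similarities = lib @ q
    return np.argsort(-similarities)                      # best match first
```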


Issues
1. K-means clustering: besides the problems of choosing K and the initial cluster centers, for massive data the huge input matrix causes memory overflow and is inefficient. One approach is to take a subset of the massive image set as a training set for clustering and then use a naive Bayes classifier to assign the remaining images in the gallery. In addition, because a crawler keeps updating the background image set, the cost of re-clustering is significant.
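One common mitigation for the memory problem, not mentioned in the original text, is mini-batch k-means, which updates the cluster centers on streamed chunks of descriptors instead of holding everything in memory at once (all sizes below are placeholders):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stream descriptors in chunks instead of building one huge matrix.
kmeans = MiniBatchKMeans(n_clusters=1000, batch_size=10_000, random_state=0)

def descriptor_chunks():
    """Yield (n, d) descriptor arrays, e.g. one array per image or per file."""
    rng = np.random.default_rng(0)
    for _ in range(50):                          # placeholder: 50 chunks of fake data
        yield rng.random((2_000, 128)).astype(np.float32)

for chunk in descriptor_chunks():
    kmeans.partial_fit(chunk)                    # incremental update of the centers

codebook = kmeans.cluster_centers_               # visual vocabulary, shape (1000, 128)
```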
2. The choice of dictionary size is also a problem. If the dictionary is too large, the words lack generality, the representation is sensitive to noise, the computation is heavy, and, crucially, the projected image vector becomes very high dimensional; if the dictionary is too small, the visual words have poor discriminative power and cannot adequately represent the features of similar targets.


3. A similarity measure is used to assign image features to the corresponding words of the codebook, which involves the choice among the linear kernel, the Earth Mover's Distance (EMD) kernel, the histogram intersection kernel, and so on.
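The histogram intersection kernel mentioned above, written as a callable that scikit-learn's SVC accepts (an illustrative sketch with placeholder data; a linear or EMD-based kernel would be plugged in the same way):

```python
import numpy as np
from sklearn.svm import SVC

def histogram_intersection(X, Y):
    """K[i, j] = sum_k min(X[i, k], Y[j, k]) for two sets of histograms."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

# Usage: pass the kernel callable directly to the SVM.
rng = np.random.default_rng(0)
bof_train = rng.random((80, 200))                # placeholder BOF histograms
labels = rng.integers(0, 3, size=80)
clf = SVC(kernel=histogram_intersection).fit(bof_train, labels)
predictions = clf.predict(rng.random((5, 200)))
```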
4. By representing an image as an unordered collection of local features, the bag-of-features method loses all information about the spatial layout of those features, which limits its descriptive power. To address this, Schmid [2] proposed a spatial-pyramid-based bag-of-features.


5. Jégou [7] proposed VLAD (vector of locally aggregated descriptors). Like BOF, the method first builds a codebook containing K visual words. Unlike BOF, each local descriptor is assigned to its nearest visual word by nearest-neighbor search, and VLAD accumulates, for each visual word c_i, the component-wise differences between the assigned local descriptors and c_i; these accumulated differences are concatenated into a new vector that represents the image.
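A minimal sketch of that VLAD encoding (the residual accumulation is the core idea described above; the signed-square-root and L2 normalization at the end are a common post-processing step, added here as an assumption rather than something stated in the text):

```python
import numpy as np

def vlad_encode(descriptors, codebook):
    """Aggregate residuals of local descriptors around their nearest visual word.

    descriptors: (n, d) local descriptors of one image
    codebook:    (k, d) visual words c_i
    returns:     (k * d,) VLAD vector
    """
    k, d = codebook.shape
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)                         # NN assignment to visual words
    vlad = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            vlad[i] = (assigned - codebook[i]).sum(axis=0)  # sum of (x - c_i)
    vlad = vlad.ravel()
    # Optional post-processing: signed square root + L2 normalization.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    return vlad / (np.linalg.norm(vlad) + 1e-9)
```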
Resources
Bag-of-words classifiers (Matlab)
Bag of Words/bag of features MATLAB source code
A BOW / Pyramid BOW + SVM MATLAB demo for image classification
