This article pulls together the previous posts in the series and implements a small image retrieval application.
A small image retrieval application can be divided into two parts:
- Training: build the feature database of the image set.
- Retrieval: given a query image, return the most similar images from the image library.
The process of building the image database is as follows:
- Generate the visual vocabulary of the image set
- Extract the SIFT features of all images in the image set
- Cluster the extracted SIFT features; the cluster centers are the vocabulary
- Re-encode each image in the image set with BoW or VLAD; VLAD is chosen here
- Stack the VLAD representations of all images in the image set into a VLAD table; this table is the database the query image is matched against
Once the database of the image set has been built, the process of finding the most similar image in the database for any query image is as follows:
- Extract the SIFT features of the query image
- Load the vocabulary and compute the VLAD representation of the image
- Find the vector in the image database most similar to this VLAD vector
Building the feature database of an image set is usually done offline, while querying needs to happen in real time. The basic flow therefore consists of two parts: an offline training process and an online search.
Implementation of each functional module
Below, VLAD is used to represent images and a small image retrieval program is implemented. The required functional modules are:
- Feature point extraction
- Building the vocabulary
- Building the database
Step 1: Extracting feature points
Both BoW and VLAD are built on local image features; the local feature chosen here is SIFT, extended to RootSIFT. Extracting stable feature points is especially important. This article uses the OpenCV implementation; the SIFT detector is instantiated as follows:
auto fdetector = xfeatures2d::SIFT::create(0,3,0.2,10);
The declaration of create is as follows:
static Ptr<SIFT> cv::xfeatures2d::SIFT::create(
    int nfeatures = 0,
    int nOctaveLayers = 3,
    double contrastThreshold = 0.04,
    double edgeThreshold = 10,
    double sigma = 1.6
)
- nfeatures sets the number of feature points to retain. Each SIFT feature point is scored by its local contrast; when this value is set, the points are sorted by score and only the top nfeatures are returned.
- nOctaveLayers is the number of layers in each octave; the number of octaves itself is computed from the image resolution. The value is 3 in D. Lowe's paper.
- contrastThreshold filters out unstable, low-contrast feature points; the larger the value, the fewer feature points are extracted.
- edgeThreshold filters out edge-like feature points; the larger the value, the more feature points are kept.
- sigma is the parameter of the Gaussian filter applied to the input image at octave 0.
Some personal notes: in practice it is mostly contrastThreshold and edgeThreshold that need tuning. contrastThreshold filters out unstable feature points in smooth regions, while edgeThreshold filters out unstable keypoints that lie along edges. When setting the parameters, try to keep the number of extracted feature points moderate, neither too many nor too few. In addition, balance contrastThreshold and edgeThreshold according to whether the target to be matched consists mostly of smooth regions or mostly of textured regions.
For some images the parameters set above may be too strict and too few feature points are extracted; in that case, fall back to more relaxed (default) parameters:
auto fdetector = xfeatures2d::SIFT::create(0, 3, 0.2, 10);
fdetector->detectAndCompute(img, noArray(), kpts, feature);
if (kpts.size() < 10) {
    // too few keypoints with the strict settings, retry with the defaults
    fdetector = xfeatures2d::SIFT::create();
    fdetector->detectAndCompute(img, noArray(), kpts, feature);
}
The threshold of 10 can be adjusted to the specific situation.
For more information about SIFT, see these earlier articles:
- Image Retrieval (1): SIFT based on vlfeat, which uses the lightweight vision library VLFeat to extract SIFT features; the extracted features feel more stable, but it is not as convenient to use as OpenCV.
- SIFT features
For RootSIFT and VLAD, refer to the earlier article Image Retrieval (4): TF-IDF, RootSIFT, VLAD.
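As a reminder of what the RootSIFT extension does (this is the standard trick, not the article's RootSiftDetector class): L1-normalize each SIFT descriptor and then take its element-wise square root. A minimal sketch:
#include <opencv2/core.hpp>

// Minimal RootSIFT sketch: descriptors is a CV_32F matrix with one SIFT
// descriptor per row; each row is L1-normalized and square-rooted in place.
void toRootSift(cv::Mat &descriptors)
{
    for (int i = 0; i < descriptors.rows; i++) {
        cv::Mat row = descriptors.row(i);        // header that shares data
        double l1 = cv::norm(row, cv::NORM_L1);
        if (l1 > 0) {
            row /= l1;
            cv::sqrt(row, row);
        }
    }
}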
Step 2: Building the vocabulary
Building the vocabulary is essentially clustering the extracted image feature points: first extract the SIFT features of the images in the image library and extend them to RootSIFT, then cluster the RootSIFT descriptors to obtain the vocabulary.
A class Vocabulary is created here, with the following main methods:
create clusters the extracted feature points to build the visual vocabulary:
void Vocabulary::create(const std::vector<cv::Mat> &features, int k)
{
    Mat f;
    vconcat(features, f);   // stack the per-image descriptors into one matrix
    vector<int> labels;
    kmeans(f, k, labels,
           TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 100, 0.01),
           3, cv::KMEANS_PP_CENTERS, m_voc);   // the cluster centers become the vocabulary
    m_k = k;
}
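A hypothetical usage sketch: the per-image RootSIFT matrices collected during training are passed in, and k = 64 matches the vocabulary size used in the Trainer example later in the article.
std::vector<cv::Mat> features;   // one RootSIFT descriptor Mat per image, filled during training
Vocabulary voc;
voc.create(features, 64);        // 64 visual words, as in the Trainer example below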
load and save: for ease of use, the generated vocabulary needs to be saved to a file on disk (.yml) and loaded back.
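The article does not show the bodies of save and load. A minimal sketch with cv::FileStorage, assuming the members m_voc and m_k from create() above and the node names "k" and "voc" (which are assumptions), could look like this:
// Sketch only: persist the vocabulary size and the cluster-center matrix.
void Vocabulary::save(const std::string &file)
{
    cv::FileStorage fs(file, cv::FileStorage::WRITE);
    fs << "k" << m_k;
    fs << "voc" << m_voc;
    fs.release();
}

void Vocabulary::load(const std::string &file)
{
    cv::FileStorage fs(file, cv::FileStorage::READ);
    fs["k"] >> m_k;
    fs["voc"] >> m_voc;
    fs.release();
}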
transform_vlad converts the features of an input image into its VLAD representation:
void Vocabulary::transform_vlad(const cv::Mat &f, cv::Mat &vlad)
{
    // Find the nearest center
    Ptr<FlannBasedMatcher> matcher = FlannBasedMatcher::create();
    vector<DMatch> matches;
    matcher->match(f, m_voc, matches);

    // Compute VLAD
    Mat responseHist(m_voc.rows, f.cols, CV_32FC1, Scalar::all(0));
    for (size_t i = 0; i < matches.size(); i++) {
        auto queryIdx = matches[i].queryIdx;
        int trainIdx = matches[i].trainIdx;   // cluster index
        Mat residual;
        subtract(f.row(queryIdx), m_voc.row(trainIdx), residual, noArray());
        add(responseHist.row(trainIdx), residual, responseHist.row(trainIdx), noArray(), responseHist.type());
    }

    // L2-norm
    auto l2 = norm(responseHist, NORM_L2);
    responseHist /= l2;
    normalize(responseHist, responseHist, 1, 0, NORM_L2);

    // Reshape the matrix to a 1 x (k*d) vector
    vlad = responseHist.reshape(0, 1);
}
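With transform_vlad in place, the VLAD table mentioned at the beginning is just the row-wise stack of every image's VLAD vector. The helper below is illustrative only (buildVladTable is not part of the original project); it shows how such a table could be assembled:
// Illustrative only: stack the VLAD vector of every image row by row to get
// the Mat that serves as the retrieval database.
cv::Mat buildVladTable(Vocabulary &voc, const std::vector<cv::Mat> &imageFeatures)
{
    cv::Mat table;
    for (const auto &f : imageFeatures) {
        cv::Mat vlad;
        voc.transform_vlad(f, vlad);   // 1 x (k*d) row vector
        table.push_back(vlad);         // append as a new row
    }
    return table;
}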
In summary, class Vocabulary provides the following functionality:
- Build a visual vocabulary (Vocabulary) from a list of images
- Save the built Vocabulary locally, with a matching load method
- Represent an image as a VLAD vector
Step 3: Creating the image database
The image database is the collection of VLAD representations of the images; at retrieval time it returns the images whose VLAD vectors are most similar to that of the query image.
Here a simple database is built with an OpenCV Mat: the Mat holds the VLAD vectors of all images as its rows, and retrieval is performed directly against this Mat.
A class Database is declared with the following functionality (a sketch of a possible declaration follows the list):
- add: add an image to the database
- save and load: save the database to a file (.yml) and load it back
- retrieval: build an index over the Mat of stored VLAD vectors and return the most similar results
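The article does not show the class declaration itself. A sketch that matches the functionality above might look like the following; the member names and the exact retrieval signature are assumptions, chosen to line up with how Database is used later in the Trainer and Searcher code.
// Sketch of a possible Database declaration; members and the retrieval
// signature are assumptions consistent with the usage shown later
// (make_shared<Database>(detector), setVocabulary, load1, retrieval).
class Database {
public:
    explicit Database(std::shared_ptr<RootSiftDetector> detector);

    void setVocabulary(const Vocabulary &voc);

    // Extract RootSIFT from the image, encode it as VLAD and append it
    // as a new row of m_vlads together with its identifier.
    void add(const cv::Mat &image, const std::string &md5);

    // Persist / restore the VLAD table and identifiers as db-identifier.yml
    // (the loading code later in the article calls the load variant load1).
    void save(const std::string &path, const std::string &identifier);
    void load1(const std::string &path, const std::string &identifier);

    // Find the row of m_vlads closest to the query's VLAD vector and return
    // the identifier of that image and a similarity score.
    void retrieval(const cv::Mat &query, const std::string &group,
                   std::string &md5, double &score);

private:
    Vocabulary m_voc;
    cv::Mat m_vlads;                       // one VLAD vector per row
    std::vector<std::string> m_md5s;       // identifier of each row
    std::shared_ptr<RootSiftDetector> m_detector;
};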
Step 4: Trainer
The modules above implement feature point extraction, vocabulary construction, and the database in which images are represented as VLAD vectors. The Trainer class combines them to make training convenient.
class Trainer {
public:
    Trainer();
    ~Trainer();
    Trainer(int k, int pcaDim, const std::string &imageFolder,
            const std::string &path, const std::string &identifier,
            std::shared_ptr<RootSiftDetector> detector);

    void createVocabulary();
    void createDb();
    void save();

private:
    int m_k;             // The size of the vocabulary
    int m_pcaDimension;  // The retained dimensions after PCA

    Vocabulary* m_voc;
    Database*   m_db;

private:
    /* Image folder */
    std::string m_imageFolder;
    /*
        training result identifier, the name suffix of vocabulary and database:
        voc-identifier.yml, db-identifier.yml
    */
    std::string m_identifier;
    /* The location of the training result */
    std::string m_resultPath;
};
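The bodies of createVocabulary and createDb are not shown in the article. As an illustration only, createVocabulary could look roughly like the sketch below; the "*.jpg" glob pattern is an assumption, toRootSift is the sketch from Step 1, and the real project routes the extraction through the injected RootSiftDetector instead.
// Illustrative sketch only: gather RootSIFT descriptors from every image
// under m_imageFolder and cluster them into m_k visual words.
void Trainer::createVocabulary()
{
    std::vector<cv::String> files;
    cv::glob(m_imageFolder + "/*.jpg", files);   // assumed file pattern

    auto fdetector = cv::xfeatures2d::SIFT::create();
    std::vector<cv::Mat> features;
    for (const auto &file : files) {
        cv::Mat img = cv::imread(file, cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;
        std::vector<cv::KeyPoint> kpts;
        cv::Mat f;
        fdetector->detectAndCompute(img, cv::noArray(), kpts, f);
        toRootSift(f);                 // RootSIFT sketch from Step 1
        features.push_back(f);
    }
    m_voc = new Vocabulary();
    m_voc->create(features, m_k);
}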
Using Trainer requires the following configuration:
- The directory where the image set resides
- The size of the visual vocabulary (the number of cluster centers)
- The number of dimensions the VLAD vector keeps after PCA; set it to 0 to skip PCA
- The path where the training results are saved. The results are saved as yml files named voc-m_identifier.yml and db-m_identifier.yml; to make it easy to test different parameter settings, the suffix m_identifier distinguishes training data produced with different parameters.
It is used as follows:
int main(int argc, char *argv[])
{
    const string image_200 = "/home/test/images-1";
    const string image_6k  = "/home/test/images/sync_down_1";

    auto detector = make_shared<RootSiftDetector>(5, 5, 10);
    Trainer trainer(64, 0, image_200,
                    "/home/test/projects/imageRetrievalService/build",
                    "test-200-vl-64", detector);

    trainer.createVocabulary();
    trainer.createDb();
    trainer.save();

    return 0;
}
Out of laziness these are not exposed as command-line arguments, so to use it you need to set the image folder path and the path where the training results are saved directly in the code.
Step 5: Searcher
The retrieval method is already implemented in Database. The reason for wrapping another layer is to better fit business needs, for example preprocessing of the image, splitting it into blocks, multithreading, filtering of the query results, and so on. Since Searcher is tightly coupled to the specific application, only a retrieval method and the configuration of the query parameters are implemented here.
class Searcher {
public:
    Searcher();
    ~Searcher();

    void init(int keyPointThreshold);
    void setDatabase(std::shared_ptr<Database> db);

    void retrieval(cv::Mat &query, const std::string &group,
                   std::string &md5, double &score);
    void retrieval(std::vector<char> bins, const std::string &group,
                   std::string &md5, double &score);

private:
    int m_keyPointThreshold;
    std::shared_ptr<Database> m_db;
};
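As an illustration of what that thin wrapper can look like, here is a rough sketch of the cv::Mat overload of retrieval; the Database::retrieval signature it calls is the assumed one from the sketch in Step 3, and the pre-checks are placeholders for whatever business logic is needed.
// Rough sketch only: guard the query, then delegate to the Database.
void Searcher::retrieval(cv::Mat &query, const std::string &group,
                         std::string &md5, double &score)
{
    if (query.empty()) {   // trivial business-level pre-check
        score = 0.0;
        return;
    }
    // Preprocessing, blocking, multithreading, keypoint-count checks against
    // m_keyPointThreshold and result filtering would be added here as needed.
    m_db->retrieval(query, group, md5, score);   // assumed Database API (see Step 3)
}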
It is also very simple to use: load the Vocabulary and the Database from file, then configure the Searcher.
Vocabulary voc;

stringstream ss;
ss << path << "/voc-" << identifier << ".yml";
cout << "Load vocabulary from " << ss.str() << endl;
voc.load(ss.str());
cout << "Load vocabulary successful." << endl;

auto detector = make_shared<RootSiftDetector>(5, 0.2, 10);
auto db = make_shared<Database>(detector);

cout << "Load database from " << path << "/db-" << identifier << ".yml" << endl;
db->load1(path, identifier);
db->setVocabulary(voc);
cout << "Load database successful." << endl;

Searcher s;
s.init(10);
s.setDatabase(db);
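A query then looks like the snippet below; the image path and the group value are placeholders, and the md5/score outputs follow the retrieval signature declared above.
// Placeholder usage: run one query against the loaded database.
cv::Mat query = cv::imread("/home/test/query.jpg");   // illustrative path
std::string md5;
double score = 0.0;
s.retrieval(query, "default", md5, score);            // "default" group is a placeholder
cout << "Most similar image: " << md5 << ", score: " << score << endl;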
Summary
To summarize the whole process:
- Create the Vocabulary
- Create the Database
- Search the database for the list of the most similar images