Content-based image retrieval, the problem of searching large image repositories according to their content, has been the subject of a significant amount of computer vision research in the recent past. While early retrieval architectures were based on the query-by-example paradigm, which formulates image retrieval as the search for the best database match to a user-provided query image, it was quickly realized that the design of fully functional retrieval systems would require support for semantic queries. These are systems where the database images are annotated with semantic keywords, enabling the user to specify the query through a natural language description of the visual concepts of interest. This realization, combined with the cost of manual image labeling, generated significant interest in the problem of automatically extracting semantic descriptors from images.

The earliest efforts in the area were directed at the reliable extraction of specific semantics, e.g. differentiating indoor from outdoor scenes, cities from landscapes, and detecting trees, horses, or buildings, among others. These efforts posed the problem of semantics extraction as one of supervised learning: a set of training images with and without the concept of interest is collected, and a binary classifier is trained to detect that concept. The classifier is then applied to all images in the database, which are, in this way, annotated with respect to the presence or absence of the concept.

More recently, there has been an effort to solve the problem in its full generality by resorting to unsupervised learning. The basic idea is to introduce a set of latent variables that encode hidden states of the world, where each state defines a joint distribution on the space of semantic keywords and image appearance descriptors (in the form of local features computed over image neighborhoods).
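The latent-variable idea can be sketched numerically: each hidden state defines a distribution over keywords and over appearance features, and the joint likelihood of a keyword with an image's features is obtained by marginalizing over states. The following is a toy sketch with discretized features and random probability tables, not the actual appearance model used in this work:

```python
import numpy as np

# Toy latent-variable annotation model: each hidden state s defines a
# distribution over keywords, P(w|s), and over discretized appearance
# features, P(x|s). All tables here are illustrative.
rng = np.random.default_rng(0)

n_states, n_keywords, n_features = 3, 5, 8
p_state = np.full(n_states, 1.0 / n_states)                        # P(s)
p_kw_given_state = rng.dirichlet(np.ones(n_keywords), n_states)    # P(w|s)
p_feat_given_state = rng.dirichlet(np.ones(n_features), n_states)  # P(x|s)

def keyword_scores(feature_ids):
    """Joint likelihood P(w, x_1..x_n), marginalized over hidden states."""
    # Likelihood of the observed features under each state
    # (features treated as i.i.d. given the state).
    feat_lik = np.prod(p_feat_given_state[:, feature_ids], axis=1)  # (n_states,)
    # P(w, x) = sum_s P(s) * prod_i P(x_i|s) * P(w|s)
    return (p_state * feat_lik) @ p_kw_given_state                  # (n_keywords,)

scores = keyword_scores([0, 3, 3, 7])
ranking = np.argsort(scores)[::-1]  # keywords ordered by joint likelihood
print(ranking)
```

This produces, for each image, the ranking of keywords by joint likelihood that the unsupervised formulation relies on.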
After the annotation model is learned, an image is annotated by finding the most likely keywords given the features of that image.

Both formulations of the semantic labeling problem have strong advantages and disadvantages. In generic terms, unsupervised labeling leads to significantly more scalable training procedures (in database size and number of concepts of interest), places much weaker demands on the quality of the manual annotations required to bootstrap learning, and produces a natural ranking of keywords for each new image to annotate. On the other hand, it does not explicitly treat semantics as image classes and, therefore, provides few guarantees that the semantic annotations are optimal in a recognition or retrieval sense. That is, instead of annotations that achieve the smallest probability of retrieval error, it simply produces the ones of largest joint likelihood under the assumed mixture model.

In this work we show that it is possible to combine the advantages of the two formulations through a slight reformulation of the supervised one. This consists of defining an M-ary classification problem where each of the semantic concepts of interest defines an image class. At annotation time, these classes all directly compete for the image to annotate, which no longer faces a sequence of independent binary tests. This supervised multiclass labeling (SML) formulation retains the classification and retrieval optimality of the supervised formulation, but 1) produces a natural ordering of keywords at annotation time, and 2) eliminates the need to compute a "non-class" model for each of the semantic concepts of interest. As a result, it has learning complexity equivalent to that of the unsupervised formulation and, like the latter, places much weaker requirements on the quality of manual labels than supervised one-vs-all (OVA) approaches.
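The M-ary competition at annotation time can be sketched as follows: each concept defines a class-conditional density, the posterior of every class is computed for the image to annotate, and because the posteriors share a single normalization, all classes compete directly; sorting the posteriors yields the keyword ordering. This is a minimal sketch with single Gaussians standing in for the richer class-conditional feature models; the keywords, features, and parameters are illustrative:

```python
import numpy as np

# Minimal SML sketch: one class-conditional Gaussian per semantic concept.
# (The actual system models distributions of local features; this toy
# uses a single 2-D feature vector per image for clarity.)
keywords = ["sky", "grass", "water"]
means = np.array([[0.2, 0.9], [0.8, 0.3], [0.4, 0.5]])  # illustrative
var = 0.05

def log_gauss(x, mu):
    # Log-density up to a constant shared by all classes (cancels below).
    return -np.sum((x - mu) ** 2) / (2 * var)

def annotate(x, prior=None):
    """Posterior P(class | x): all concepts compete via one normalization."""
    prior = np.full(len(keywords), 1.0 / len(keywords)) if prior is None else prior
    log_post = np.log(prior) + np.array([log_gauss(x, m) for m in means])
    post = np.exp(log_post - log_post.max())  # stable normalization
    post /= post.sum()
    # Sorting the posteriors gives the natural keyword ordering.
    return sorted(zip(keywords, post), key=lambda kp: -kp[1])

print(annotate(np.array([0.25, 0.85])))  # "sky" ranks first
```

Note the contrast with OVA: no "non-class" (background) model is fitted per concept, since each class density is trained only on images labeled with that concept and competition happens entirely in the shared posterior normalization.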
Publications:
Supervised Learning of Semantic Classes for Image Annotation and Retrieval, G. Carneiro, A. B. Chan, P. J. Moreno, and N. Vasconcelos, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 3, pp. 394-410, March 2007. IEEE, [pdf]
Formulating Semantic Image Annotation as a Supervised Learning Problem, G. Carneiro and N. Vasconcelos, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Diego, 2005. IEEE, [ps] [pdf]
A Database Centric View of Semantic Image Annotation and Retrieval, G. Carneiro and N. Vasconcelos, Proceedings of ACM Conference on Research and Development in Information Retrieval (ACM SIGIR), Salvador, Brazil, 2005. [ps] [pdf]
Using Statistics to Search and Annotate Pictures: an Evaluation of Semantic Image Annotation and Retrieval on Large Databases, A. B. Chan, P. J. Moreno, and N. Vasconcelos, Proceedings of Joint Statistical Meetings (JSM), Seattle, 2006. [ps] [pdf]
Formulating Semantic Image Annotation as a Supervised Learning Problem, G. Carneiro and N. Vasconcelos, Technical Report SVCL-TR-2004-03, December 2004. [ps] [pdf]