http://rogerioferis.com/VisualRecognitionAndSearch/Resources.html
Source code
Non-exhaustive list of state-of-the-art implementations related to visual recognition and search. There is no warranty for the source code links below; use them at your own risk!
Feature detection and description
General libraries:
- VLFeat - implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris-Laplace, Hessian-Laplace, multiscale Hessian, and multiscale Harris). Easy-to-use MATLAB interface. See Modern Features: Software, slides providing a demonstration of VLFeat and links to other software. Check also the VLFeat hands-on session training.
- OpenCV - various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)
Fast keypoint detectors for real-time applications:
- FAST - high-speed corner detector implementation for a wide variety of platforms
- AGAST - even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).
Binary descriptors for real-time applications:
- BRIEF - C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
- ORB - OpenCV implementation of the Oriented BRIEF (ORB) descriptor (invariant to rotations, but not scale)
- BRISK - efficient binary descriptor invariant to rotations and scale. It includes a MATLAB MEX interface. (ICCV 2011)
- FREAK - faster than BRISK (invariant to rotations and scale) (CVPR 2012)
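Binary descriptors such as the ones listed above are typically compared with the Hamming distance (the number of differing bits), which is what makes them attractive for real-time matching. The following is a minimal sketch of brute-force matching, not code from any of the linked packages. The 16-bit descriptors, the function names, and the `max_dist` threshold are hypothetical and chosen only for illustration; real descriptors are usually a few hundred bits long.

```python
# Hypothetical 16-bit descriptors stored as Python ints. The bit length,
# the function names, and the threshold are illustrative assumptions.

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary descriptors differ."""
    return bin(a ^ b).count("1")

def match(query, train, max_dist):
    """Brute-force nearest-neighbor matching under the Hamming distance.

    Returns (query_index, train_index, distance) for every query descriptor
    whose nearest train descriptor lies within max_dist bits.
    """
    matches = []
    for qi, q in enumerate(query):
        ti, dist = min(
            ((i, hamming(q, t)) for i, t in enumerate(train)),
            key=lambda pair: pair[1],
        )
        if dist <= max_dist:
            matches.append((qi, ti, dist))
    return matches

query = [0b1010101010101010, 0b1111000011110000]
train = [0b1111000011110001, 0b0000000000000000, 0b1010101010101000]
matches = match(query, train, max_dist=2)
print(matches)  # [(0, 2, 1), (1, 0, 1)]
```

The XOR-and-count step maps to a handful of machine instructions on most CPUs, which is one reason binary descriptors can be matched much faster than floating-point ones.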
SIFT and SURF implementations:
- SIFT: VLFeat, OpenCV, original code by David Lowe, GPU implementation, OpenSIFT
- SURF: Herbert Bay's code, OpenCV, GPU-SURF
Other local feature detectors and descriptors:
- VGG affine covariant features - Oxford code for various affine covariant feature detectors and descriptors.
- LIOP descriptor - source code for the Local Intensity Order Pattern (LIOP) descriptor (ICCV 2011).
- Local symmetry features - source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).
Global image descriptors:
- GIST - MATLAB code for the GIST descriptor
- CENTRIST - global visual descriptor for scene categorization and object detection (PAMI 2011)
Feature coding and pooling
- VGG feature encoding toolkit - source code for various state-of-the-art feature encoding methods, including standard hard encoding, kernel codebook encoding, locality-constrained linear encoding, and Fisher kernel encoding.
- Spatial pyramid matching - source code for feature pooling based on spatial pyramid matching (widely used for image classification)
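To make "standard hard encoding" concrete: each local descriptor is assigned to its nearest codebook entry (visual word), and the image is represented by the histogram of assignments. The sketch below illustrates only that idea and is not taken from the toolkits listed above. The 2-D descriptors, the three-word codebook, and the function names are hypothetical; in practice the codebook would be learned (for example with k-means) from much higher-dimensional descriptors.

```python
# Minimal sketch of hard-assignment (bag-of-visual-words) encoding.
# Codebook, descriptors, and names are illustrative assumptions.

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def hard_encode(descriptors, codebook):
    """Return the L1-normalized histogram of nearest-codeword assignments."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: sq_dist(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]
histogram = hard_encode(descriptors, codebook)
print(histogram)  # [0.5, 0.25, 0.25]
```

Spatial pyramid pooling would apply the same encoding separately to descriptors falling in each cell of a coarse-to-fine grid and concatenate the resulting histograms.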
Convolutional nets and deep learning
- EBLearn - C++ library for energy-based learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
- Torch7 - provides a MATLAB-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
- Deep Learning - various links for deep learning software.
Part-based models
- Deformable part-based detector - library provided by the authors of the original paper (state of the art in the PASCAL VOC detection task)
- Efficient deformable part-based detector - branch-and-bound implementation for a deformable part-based detector.
- Accelerated deformable part model - efficient implementation of a method that achieves exactly the same performance as deformable part-based detectors but with significant acceleration (ECCV 2012).
- Coarse-to-fine deformable part model - fast approach for deformable object detection (CVPR 2011).
- Poselets - C++ and MATLAB versions for object detection based on poselets.
- Part-based face detector and pose estimation - implementation of a unified approach for face detection, pose estimation, and landmark localization (CVPR 2012).
Attributes and semantic features
- Relative attributes - modified implementation of RankSVM to train relative attributes (ICCV 2011).
- Object Bank - implementation of Object Bank semantic features (NIPS 2010). See also ActionBank.
- Classemes, PiCoDes, and meta-class features - software for extracting high-level image descriptors (ECCV 2010, NIPS 2011, CVPR 2012).
Large-scale learning
- Additive kernels - source code for fast additive kernel SVM classifiers (PAMI 2013).
- LIBLINEAR - library for large-scale linear SVM classification.
- VLFeat - implementation of Pegasos SVM and the homogeneous kernel map.
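For readers unfamiliar with Pegasos, it trains a linear SVM by stochastic sub-gradient descent with a step size of 1/(lambda * t). The sketch below is a simplified illustration under stated assumptions, not VLFeat's implementation: it omits the bias term and the optional projection step, and the function names, the toy data, and the default values of `lam` and `n_iters` are all hypothetical.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def pegasos(samples, labels, lam=0.1, n_iters=200, seed=0):
    """Simplified Pegasos: stochastic sub-gradient descent for a linear SVM.

    Assumptions of this sketch: labels are +1/-1, there is no bias term,
    and the optional projection step of the algorithm is left out.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(samples))          # pick one training example
        x, y = samples[i], labels[i]
        eta = 1.0 / (lam * t)                    # decreasing step size
        violated = y * dot(w, x) < 1.0           # hinge loss active for current w?
        w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization term)
        if violated:
            w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

# Toy, linearly separable data (hypothetical).
X = [(2.0, 0.5), (1.5, -0.5), (-2.0, 0.3), (-1.0, -0.4)]
y = [1, 1, -1, -1]
w = pegasos(X, y)
predictions = [1 if dot(w, xi) >= 0 else -1 for xi in X]
print(predictions)
```

Each iteration touches a single example, so the cost per step is independent of the dataset size, which is what makes this family of solvers suitable for large-scale learning.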
Fast indexing and image retrieval
- FLANN - library for performing fast approximate nearest neighbor search.
- Kernelized LSH - source code for kernelized locality-sensitive hashing (ICCV 2009).
- ITQ binary codes - code for generation of small binary codes using iterative quantization and other baselines such as locality-sensitive hashing (CVPR 2011).
- INRIA image retrieval - efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).
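As a rough illustration of the plain locality-sensitive hashing baseline mentioned above, the sketch below builds binary codes from the signs of random projections and groups vectors by code. It is not the kernelized LSH or ITQ method, and it is not taken from any of the linked packages; the function names, the 8-bit code length, and the toy vectors are assumptions made for this example.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_code(vec, hyperplanes):
    """Binary code: one bit per hyperplane, set when the projection is >= 0."""
    code = 0
    for h in hyperplanes:
        projection = sum(a * b for a, b in zip(h, vec))
        code = (code << 1) | (1 if projection >= 0 else 0)
    return code

def build_index(vectors, hyperplanes):
    """Hash table mapping each code to the indices of vectors that share it."""
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(lsh_code(v, hyperplanes), []).append(i)
    return index

planes = make_hyperplanes(n_bits=8, dim=3)
vectors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-3.0, 0.5, -1.0]]
index = build_index(vectors, planes)

# Candidates for a query are the vectors stored in the query's bucket.
query_code = lsh_code([1.0, 2.0, 3.0], planes)
print(index[query_code])
```

Vectors separated by a small angle agree on most bits, so a lookup only needs to compare the query against the few vectors in its bucket (or in buckets with nearby codes) instead of the whole database.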
Object detection
- See Part-based models and Convolutional nets above.
- Pedestrian detection at 100 fps - very fast and accurate pedestrian detector (CVPR 2012).
- Caltech Pedestrian Detection Benchmark - excellent resource for pedestrian detection, with various links to state-of-the-art implementations.
- OpenCV - enhanced implementation of the Viola & Jones real-time object detector, with trained models for face detection.
- Efficient subwindow search - source code for branch-and-bound optimization for efficient object localization (CVPR 2008).
3D recognition
- Point Cloud Library - library for 3D image and point cloud processing.
Action recognition
- ActionBank - source code for action recognition based on the ActionBank representation (CVPR 2012).
- STIP features - software for computing space-time interest point descriptors
- Independent subspace analysis - look for Stacked ISA for Videos (CVPR 2011)
- Velocity histories of tracked keypoints - C++ code for activity recognition using the velocity histories of tracked keypoints (ICCV 2009)
Datasets
Attributes
- Animals with Attributes - 30,475 images of 50 animal classes with 6 pre-extracted feature representations for each image.
- aYahoo and aPascal - attribute annotations for images collected from Yahoo and PASCAL VOC 2008.
- FaceTracer - 15,000 faces annotated with 10 attributes and fiducial points.
- PubFig - 58,797 face images of 200 people with 73 attribute classifier outputs.
- LFW - 13,233 face images of 5,749 people with 73 attribute classifier outputs.
- Human attributes - 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
- SUN Attribute Database - large-scale scene attribute database with a taxonomy of 102 attributes.
- ImageNet attributes - variety of attribute labels for the ImageNet dataset.
- Relative attributes - data for the OSR dataset and a subset of the PubFig dataset. Check also this link for the WhittleSearch data.
- Attribute discovery dataset - images of shopping categories associated with textual descriptions.
Fine-grained visual categorization
- Caltech-UCSD Birds dataset - hundreds of bird categories with annotated parts and attributes.
- Stanford Dogs dataset - 20,000 images of 120 breeds of dogs from around the world.
- Oxford-IIIT Pet dataset - 37-category pet dataset with roughly 200 images for each class. Pixel-level trimap segmentation is provided.
- Leeds Butterfly dataset - 832 images of 10 species of butterflies.
- Oxford Flower dataset - hundreds of flower categories.
Face detection
- FDDB - UMass face detection dataset and benchmark (5,000+ faces)
- CMU/MIT - classical face detection dataset.
Face recognition
- Face recognition homepage - large collection of face recognition datasets.
- LFW - UMass unconstrained face recognition dataset (13,000+ face images).
- NIST face homepage - includes the Face Recognition Grand Challenge (FRGC), Face Recognition Vendor Tests (FRVT), and others.
- CMU Multi-PIE - contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
- FERET - classical face recognition dataset.
- Deng Cai's face datasets in MATLAB format - easy to use if you want to play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
- SCface - low-resolution face dataset captured from surveillance cameras.
Handwritten digits
- MNIST - large dataset containing a training set of 60,000 examples and a test set of 10,000 examples.
Pedestrian detection
- Caltech Pedestrian Detection Benchmark - 10 hours of video taken from a vehicle, 350K bounding boxes for about 2.3K unique pedestrians.
- INRIA person dataset - currently one of the most popular pedestrian detection datasets.
- ETH pedestrian dataset - urban dataset captured from a stereo rig mounted on a stroller.
- TUD-Brussels pedestrian dataset - dataset with image pairs recorded in a crowded urban setting with an onboard camera.
- PASCAL human detection - one of 20 categories in the PASCAL VOC detection challenges.
- USC pedestrian dataset - small dataset captured from surveillance cameras.
Generic object recognition
- ImageNet - currently the largest visual recognition dataset in terms of number of categories and images.
- Tiny Images - 80 million 32x32 low-resolution images.
- PASCAL VOC - one of the most influential visual recognition datasets.
- Caltech 101/Caltech 256 - popular image datasets containing 101 and 256 object categories, respectively.
- MIT LabelMe - online annotation tool for building computer vision databases.
Scene recognition
- MIT SUN dataset - MIT scene understanding dataset.
- UIUC fifteen scene categories - dataset of 15 natural scene categories.
Feature detection and description
- VGG affine dataset - widely used dataset for measuring performance of feature detection and description. Check VLBenchmarks for an evaluation framework.
Action recognition
- Benchmarking activity recognition - CVPR 2012 tutorial covering various datasets for action recognition.
RGBD recognition
- RGB-D object dataset - dataset containing 300 common household objects
Related courses
- Visual recognition - Kristen Grauman, U. Texas, fall 2012.
- The cutting edge of computer vision - Fei-Fei Li, Stanford, spring 2011.
- Learning-based methods in vision - Alyosha Efros and Leonid Sigal, CMU, spring 2012.
- Grounding object recognition and scene understanding - Antonio Torralba, MIT, fall 2011.