Original article: http://blog.csdn.net/mysniper11/article/details/8726649
Video introduction URL: http://www.cvchina.info/2011/04/05/tracking-learning-detection/
TLD (Tracking-Learning-Detection) is a long-term tracking algorithm proposed by Zdenek Kalal, a Czech PhD student, during his doctoral studies at the University of Surrey. It differs significantly from traditional tracking algorithms in that it combines a traditional tracking algorithm with a traditional detection algorithm to cope with deformation and partial occlusion of the target during tracking. At the same time, an improved online learning mechanism continuously updates the "salient feature points" of the tracking module and the target model and related parameters of the detection module, which makes tracking more stable, robust, and reliable.
For long-term tracking, a key problem is that when the target reappears in the camera's field of view, the system should be able to detect it again and resume tracking. Over a long tracking run, changes in the target's shape, illumination, and scale, as well as occlusion, are inevitable. In the traditional approach, a detection module serves as the front end: once it has detected the target, the tracking module takes over, and the detection module plays no further part in the process. This approach has a fatal flaw: when the target changes shape or is occluded, tracking easily fails. For long-term tracking, or when the target changes shape, many people therefore use detection in place of tracking. Although this can improve results in some cases, it requires offline learning: before detection, a large number of samples of the target must be collected for training, and those samples must cover the deformations, scales, poses, and illumination changes the target may undergo. In other words, when detection is used for long-term tracking, the choice of training samples is crucial; otherwise robustness is hard to guarantee.
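To make the idea concrete, here is a minimal toy sketch of how a tracker, a detector, and online learning can cooperate frame by frame. It is not Kalal's implementation: the "frames" are 1-D lists of integers, the class name ToyTLD, the helpers distance and best_match, and the parameters radius and max_dist are all made up for illustration, and the rule "trust the tracker first, fall back to the detector" is a deliberate simplification.

```python
from typing import List, Optional, Sequence, Tuple

Patch = Tuple[int, ...]


def distance(a: Patch, b: Patch) -> int:
    """Sum of absolute differences between two equal-length patches."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_match(frame: Sequence[int], templates: List[Patch],
               positions: Sequence[int], max_dist: int) -> Optional[Tuple[int, int]]:
    """Return (position, distance) of the closest match, or None if nothing is close enough."""
    width = len(templates[0])
    best = None
    for pos in positions:
        if pos < 0 or pos + width > len(frame):
            continue  # candidate window falls outside the frame
        patch = tuple(frame[pos:pos + width])
        d = min(distance(patch, t) for t in templates)
        if d <= max_dist and (best is None or d < best[1]):
            best = (pos, d)
    return best


class ToyTLD:
    """Hypothetical 1-D illustration of a tracking-learning-detection loop."""

    def __init__(self, frame: Sequence[int], pos: int, width: int,
                 radius: int = 2, max_dist: int = 3) -> None:
        self.width, self.radius, self.max_dist = width, radius, max_dist
        self.pos: Optional[int] = pos
        self.last: Patch = tuple(frame[pos:pos + width])  # appearance in the previous frame
        self.model: List[Patch] = [self.last]             # appearances learned so far

    def process(self, frame: Sequence[int]) -> Optional[int]:
        # Tracking: search only near the previous position, using the last appearance.
        tracked = None
        if self.pos is not None:
            window = range(self.pos - self.radius, self.pos + self.radius + 1)
            tracked = best_match(frame, [self.last], window, self.max_dist)
        # Detection: scan the whole frame against every learned appearance.
        everywhere = range(len(frame) - self.width + 1)
        detected = best_match(frame, self.model, everywhere, self.max_dist)
        # Integration (simplified assumption): prefer the tracker, else re-detect.
        result = tracked if tracked is not None else detected
        if result is None:
            self.pos = None  # target lost; only the detector can recover it
            return None
        self.pos = result[0]
        self.last = tuple(frame[self.pos:self.pos + self.width])
        # Learning: remember new appearances so the detector can find them later.
        if self.last not in self.model:
            self.model.append(self.last)
        return self.pos


frames = [
    [0, 0, 5, 9, 5, 0, 0, 0, 0, 0],  # target initialised at position 2
    [0, 0, 0, 5, 8, 5, 0, 0, 0, 0],  # target moves and changes slightly
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # target leaves the field of view
    [0, 0, 0, 0, 0, 0, 5, 9, 5, 0],  # target reappears far away
]
toy = ToyTLD(frames[0], pos=2, width=3)
print([toy.process(f) for f in frames[1:]])  # prints [3, None, 6]
```

In the last frame the tracker has nothing to follow, but the detector finds the target again from the learned model, which is exactly the re-detection behaviour that pure tracking lacks.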
We recommend visiting the author's home page: http://info.ee.surrey.ac.uk/Personal/Z.Kalal/
From the author's website you can download the source code and several of his useful papers.
The original source code is a mix of MATLAB and C; a C++ version has also been shared: http://gnebehay.github.com/OpenTLD/
The source code on GitHub can also be used: https://github.com/arthurv/OpenTLD
Useful blogs on CSDN include:
(1) The "Pao Ding Jie Niu TLD" (dissecting TLD) series:
http://blog.csdn.net/yang_xian521/article/details/7091587
(2) More on P-N learning:
http://blog.csdn.net/carson2005/article/details/7647519
(3) A video tracking algorithm better than Microsoft's Kinect: an introduction to the TLD tracking algorithm
http://blog.csdn.net/carson2005/article/details/7647500
(4) Analysis of TLD visual tracking technology
http://www.asmag.com.cn/number/n-50168.shtml
(5) TLD (Tracking-Learning-Detection) study notes and source code walkthrough (I)
http://blog.csdn.net/zouxy09/article/details/7893011
I hope these materials will be helpful to those who want to learn the TLD algorithm.
From: http://blog.csdn.net/windtalkersm/article/details/8018980
TLD is the short name of an algorithm that the original author calls Tracking-Learning-Detection. After watching the demo video and then seeing the name, you cannot help being struck by how ambitious it is. The work dates from 2009: not very old, but not brand new either. There are plenty of resources about it online, largely thanks to the author open-sourcing the code.
The first problem I ran into while studying it was that there are too many resources. That is relative to this field, of course, where finding source code that faithfully reproduces an algorithm is already good luck. So I am listing what I found to save others the time of searching; I hope it helps. I will not go into the details here, because many of the resources listed below contain excellent analyses, such as the source code comments written by zouxy09, which could hardly be more detailed. If I had to find fault, it is that large blocks of unformatted text make the reader dizzy. I would like to add a few simple diagrams, but I do not know which drawing program to use. Any recommendations (LaTeX or gnuplot? I have never used either)?
Source code resources:
1. Original author: Zdenek Kalal
Author home page: http://info.ee.surrey.ac.uk/Personal/Z.Kalal/
Source code: https://github.com/zk00006/OpenTLD
Implementation language: MATLAB + C
2. Alan Torres
Source code: https://github.com/alantrrs/OpenTLD
Implementation language: C++
3. arthurv
Source code: https://github.com/arthurv/OpenTLD
Implementation language: C++
Note: No different from the version above.
4. jmfs
Source code: https://github.com/jmfs/OpenTLD
Implementation language: C++
Note: No different from the two versions above, except that a VS2010 project file has been added, so in theory it can be compiled directly on Windows. However, OpenCV could not detect the author's webcam (!!!), so he used a separate videoInput class to handle camera input.
This is an adaptation of arthurv's fork of OpenTLD (https://github.com/arthurv/OpenTLD) to be immediately runnable in Visual Studio 2010.
5. Georg Nebehay version (finally, one that is different...)
Source code: http://gnebehay.github.com/OpenTLD/
Note 1: Prebuilt executables are available for download (Ubuntu 10.04 and Windows). But, as you would expect, they basically will not run on your machine, so build it yourself.
Note 2: This version requires Qt to be installed. However, the author seems to have disabled the Qt option (the relevant code is still in progress), so it compiles but cannot display the results.
Note 3: CSDN downloads also has an "OpenTLD Qt" version, which only adds a VS project file. It is not plug-and-play either; at least, it did not work on my machine.
http://download.csdn.net/download/muzi198783/4111915
6. Paul Nader version (another Qt version!)
QOpenTLD: http://qopentld.sourceforge.net/
Source code: http://sourceforge.net/projects/qopentld/
Note 1: OpenCV and Qt are required. The stated requirements are Qt 4.3.7 and OpenCV 2.2.
Note 2: Project files or makefiles are provided for both Windows and Linux. It is probably the only TLD port to the Android platform!
7. Ben Pryke (another student project!)
Source code: https://github.com/Ninjakannon/BPTLD
Note: This is still a hybrid MATLAB + C/C++ implementation. The highlight is its detailed documentation (8 pages), which explains the author's understanding of the algorithm and the implementation details, and helps in understanding the original algorithm.
Some thoughts:
1. Sharing: Some time ago I read as far as TLD::init(...), got too frustrated to continue, and put it aside for other things. Still, I am fairly familiar with detection and tracking, and the learning part already appears in init, so the rest should be easy to understand. Now that I have picked it up again, I happened to find zouxy09's comments, which saved me a lot of effort: I finished reading the code in half a day. Many details no longer had to be worked out on my own. We often complain that there is too little documentation for resources like this, and we envy how quickly people abroad work and how willing they are to share. Yet I often bookmark a good article only to find it deleted a few days later!
It is understandable that company work must be kept confidential, but if you are afraid that others will understand your ideas, you should not be working in this field. Algorithms are just ideas, and nobody can monopolize them. Algorithms also have to be updated constantly, and one that is never put into practice will not survive more than a few years. The original author started a company based on this technology, and I have not seen them impose such restrictions. SIFT and SURF are patented, yet we have not heard of anyone making a fortune from them; and with Kinect, even if you are told the algorithm, you still could not implement it yourself. What needs to be kept confidential is the implementation details.
2. Comparison: Having finally read through the implementation, my overall impression is that this algorithm is more of an engineering achievement than a theoretical breakthrough (and there cannot be many of those anyway). The combination is not necessarily better than a tracking or detection module on its own; after all, it still does not solve the problems of appearance change and scale change. The framework should nevertheless be very useful in practice, because there are so many tunable parameters!
Many people have presumably tried TLD, and many complain about its real-time performance. In addition, some parameters have to be tuned for each video.
By comparison, I prefer Kaihua Zhang's Compressive Tracking from this year's ECCV: the theory is profound (just kidding), and the source code is frighteningly simple. It also gives the best results of any off-the-shelf tracker I have tried so far. No parameter tuning is needed, and it is definitely real-time: with so little code, it would be hard for it not to be (incidentally, I came across it through the blog of the author mentioned above). This is how research should be done: with strong theoretical support, the implementation can be very simple without hurting the results. If mathematicians were willing to do applied work, many people would be out of a job.
http://www4.comp.polyu.edu.hk/~cslzhang/CT/CT.htm
Another one is PWP (pixel-wise posteriors), published at about the same time as TLD. Its performance looks quite good, but the author's promise to open-source it has never been fulfilled, which is a pity. I personally think the level set approach should handle partial occlusion well, and real-time performance should not be hard to achieve.
http://www.robots.ox.ac.uk/~cbibby/research_pwp.shtml
3. Conclusion: TLD is actually a very good algorithm for both beginners and more advanced study:
A. It has theory behind it and high-quality papers (BMVC, CVPR, ICPR, and PAMI).
B. Source code is available! MATLAB, C++, Windows, Linux, ... what else do you want?
C. Various experts have shared detailed introductions and code comments (almost every line has been explained)!
D. It covers a wide range of topics, including detection, tracking, and classification. These are the three main categories of traditional vision techniques, so after studying TLD you will have learned a little about each.
The end
A Target Tracking Algorithm (TLD)