A Better Video Tracking Algorithm than Microsoft Kinect: An Introduction to the TLD Tracking Algorithm


Reposted from: http://blog.csdn.net/carson2005/article/details/7647500

TLD (Tracking-Learning-Detection) is a new algorithm for long-term tracking of a single target. It differs significantly from traditional tracking algorithms in that it combines a traditional tracker with a traditional detector to cope with deformation and partial occlusion of the tracked object. At the same time, an improved online learning mechanism continually updates the "significant feature points" of the tracking module and the target model and related parameters of the detection module, which makes tracking more stable, robust, and reliable.

For long-term tracking, a key problem is that when the target reappears in the camera's field of view, the system should be able to detect it again and resume tracking. Over a long tracking run, however, changes in the target's shape, illumination, and scale, as well as occlusion, are inevitable. Traditional tracking algorithms rely on a detection module at the front end: once the detector has found the target, the tracking module takes over, and the detector plays no further part in the process. This approach has a fatal flaw: when the tracked target changes shape or is occluded, tracking fails easily. For this reason, for long-term tracking, or when the target changes shape, many people use detection instead of tracking. Although this can improve results in some cases, it requires an offline learning process. That is, before detection, you need to collect a large number of samples of the target for training. The training samples must therefore cover the deformations, scales, poses, and illumination changes the target may undergo. In other words, when detection is used for long-term tracking, the choice of training samples is crucial; otherwise, robustness is hard to guarantee.

Because neither tracking alone nor detection alone achieves the desired result in long-term tracking, TLD combines the two and adds an improved online learning mechanism to make overall tracking more stable and effective.

In short, the TLD algorithm consists of three parts: the tracking module, the detection module, and the learning module (see the figure in the original post).


Its operating mechanism is as follows: the detection module and the tracking module run in parallel and complement each other. The tracking module assumes that the motion of an object between adjacent video frames is limited and that the target is visible, and estimates the target's motion on that basis. If the target disappears from the camera's view, tracking fails. The detection module assumes that each frame is independent of the others and scans the whole of each frame, using the target model detected and learned so far, to locate the regions where the target may appear. Like other object detection methods, the detection module in TLD also makes errors, and these are of two kinds: false negatives and false positives. The learning module evaluates both kinds of detector error against the results of the tracking module, generates training samples from that evaluation to update the detection module's target model, and at the same time updates the "key feature points" of the tracking module so that similar errors are avoided in the future. The detailed flow of the TLD modules is shown in a diagram in the original post.
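As a rough illustration of the learning step described above (this is not the detailed TLD flow, and not the original authors' P-N learning rules), the sketch below shows one way detector outputs could be compared against the tracker's result to produce new training samples. The function name `label_detections`, the `Box` layout, and the caller-supplied `same_target` predicate are all hypothetical names introduced for this example.

```python
from typing import Callable, List, Optional, Tuple

# (x, y, width, height); this box layout is an assumption for the example.
Box = Tuple[float, float, float, float]


def label_detections(
    tracker_box: Optional[Box],
    detections: List[Box],
    same_target: Callable[[Box, Box], bool],
) -> Tuple[List[Box], List[Box]]:
    """Return (new_positives, new_negatives) for updating the detector's model.

    Simplified assumption: the tracker's box is trusted as the target location
    for this frame whenever the tracker reports one.
    """
    if tracker_box is None:
        # Tracker failed (for example, the target left the view):
        # there is nothing to evaluate the detector against.
        return [], []

    positives: List[Box] = []
    negatives: List[Box] = []
    confirmed = False
    for det in detections:
        if same_target(tracker_box, det):
            confirmed = True
        else:
            # Detector fired away from the tracked location: false positive,
            # so the region becomes a negative training sample.
            negatives.append(det)
    if not confirmed:
        # Detector missed the tracked location: false negative,
        # so the tracked region becomes a positive training sample.
        positives.append(tracker_box)
    return positives, negatives
```

The `same_target` predicate is left to the caller so that the sketch does not commit to a particular similarity measure or threshold; in practice it would typically be based on how much the two boxes overlap.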


Some background and concepts need to be clarified before the TLD process is described in detail:

Background:

P-N learning: http://blog.csdn.net/carson2005/article/details/7483027

The target being tracked can be described at any moment by its state. The state can be a bounding box giving the target's location and scale, or a flag indicating whether the target is visible. The spatial similarity of two bounding boxes is measured by their overlap, computed as the area of their intersection divided by the area of their union. The target's appearance is represented by an image patch p (which I personally think of as a sliding window). Each patch is sampled from inside the bounding box and normalized to 15*15 pixels. Two image patches,
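The overlap measure between two tracking boxes described in this section (intersection area divided by union area) can be sketched as follows. This is a minimal sketch, assuming axis-aligned boxes stored as `(x, y, width, height)`; the function name `overlap` and the box layout are choices made for this example, not taken from the original post.

```python
from typing import Tuple

# (x, y, width, height) of an axis-aligned box; the layout is an assumption.
Box = Tuple[float, float, float, float]


def overlap(a: Box, b: Box) -> float:
    """Intersection area of two boxes divided by the area of their union."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b

    # Width and height of the intersection rectangle (non-positive if disjoint).
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0

    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union
```

Identical boxes give an overlap of 1.0, disjoint boxes give 0.0, and two 10*10 boxes shifted horizontally by 5 pixels share 50 of 150 pixels of combined area, for an overlap of 1/3.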




