Analysis of TLD Visual Tracking Technology

Last Update: 2018-12-04 Source: Internet

Author: User

Tags tld


In urban rail transit monitoring, intelligent video analysis has faced great challenges. The monitoring environment is complicated: it covers a large area with a long perimeter, multiple platforms, multiple entrances and exits, numerous fences, and other related equipment. This complexity creates many difficulties for intelligent analysis. TLD (Tracking-Learning-Detection), a newer visual tracking technology, can help solve these problems.

The most distinctive feature of a TLD tracking system is that it continuously learns the locked target to obtain the target's latest appearance features, and uses them to keep improving the tracking. At the beginning, only a single static image of the target is provided. As the target moves, the system keeps detecting it, learns how its appearance changes with angle, distance, and depth of field, and identifies the target in real time. After a period of learning, the target becomes very hard to lose.

TLD technology consists of three parts: the tracker, the learning process, and the detector. It is an adaptive and reliable tracking technology that combines tracking and detection. The tracker and the detector run in parallel, and the results of both feed into the learning process. The learned model in turn acts on the tracker and the detector and is updated in real time, which ensures that the target can be tracked continuously even when its appearance changes.

Tracker

The TLD tracker uses an overlapping-block tracking strategy, and each block is tracked with the Lucas-Kanade optical flow method. Before tracking starts, the target to be tracked must be specified by marking it with a rectangle. The motion of the whole target is then taken as the mean of the motions of all the blocks. This local tracking strategy can cope with partial occlusion.
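As a minimal sketch of the aggregation step only (not of the optical flow itself), the following assumes the per-block displacements have already been estimated by some Lucas-Kanade routine. The function names `aggregate_block_motion` and `shift_box` are hypothetical. The mean follows the description above; the median option is an assumption added here to illustrate a more outlier-resistant alternative.

```python
from statistics import mean, median


def aggregate_block_motion(displacements, use_median=False):
    """Combine per-block (dx, dy) displacements into one target motion.

    displacements: list of (dx, dy) pairs, one per tracked block.
    use_median: assumption for illustration; the text above describes the mean.
    """
    if not displacements:
        raise ValueError("at least one block displacement is required")
    combine = median if use_median else mean
    dx = combine(d[0] for d in displacements)
    dy = combine(d[1] for d in displacements)
    return dx, dy


def shift_box(box, motion):
    """Move an (x, y, w, h) rectangle by the aggregated motion."""
    x, y, w, h = box
    dx, dy = motion
    return (x + dx, y + dy, w, h)


# Three blocks agree; the fourth is disturbed (for example, by occlusion).
blocks = [(2.0, 1.0), (2.0, 1.0), (2.0, 1.0), (10.0, 1.0)]
mean_motion = aggregate_block_motion(blocks)
median_motion = aggregate_block_motion(blocks, use_median=True)
new_box = shift_box((10, 20, 30, 40), mean_motion)
```

With these inputs the single disturbed block pulls the mean away from the motion shared by the other three blocks, while the median ignores it. This is one reason a robust statistic may be preferred when some blocks are occluded.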

Learning Process

The learning process of TLD is based on an online model. The online model is a collection of 15x15 image blocks derived from the tracker and the detector. The initial online model is the target image specified when tracking is initialized.

The online model is dynamic: it grows or shrinks as the video sequence progresses. Its development is driven by two kinds of events, growth events and pruning events. In practice, the appearance of the target changes constantly under the influence of many factors, such as the environment and the target itself, so the target images predicted by the tracker contain more and more appearance variations. If all target images on the tracking trajectory are regarded as a feature space, the feature space produced by the tracker expands with the video sequence. This is a growth event. Growth events can also introduce impurities (non-target images) that would degrade tracking, so pruning events are used to remove these impurities. The interaction between the two kinds of events keeps the online model consistent with the current tracking target.

The feature space expansion brought about by growth events comes from the tracker: appropriate samples are selected from the target images on the tracking trajectory and used to update the online model. Three selection strategies are available, as listed below.

· Add image blocks that are similar to the initial target image to the online model;

· If the target image of the current frame is similar to that of the previous frame, add the current tracking result image to the online model;

· Calculate the distance between each target image on the tracking trajectory and the online model, and select the target images that follow a specific pattern: the distance is small at first, increases gradually, and then returns to a small value. Repeatedly check whether this pattern occurs, and add the target images within the pattern to the online model.
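The third strategy can be sketched as a search for "loops" in the sequence of distances between the tracked image and the online model. This is a minimal illustration under stated assumptions: the distances are already computed per frame, a single hypothetical `threshold` separates "small" from "large", and the name `find_loops` is invented here.

```python
def find_loops(distances, threshold):
    """Find runs of frames that leave the model and then return to it.

    distances: per-frame distance between the tracked image and the online model.
    threshold: hypothetical boundary between a "small" and a "large" distance.
    Returns (start, end) frame index pairs, inclusive, for each run of large
    distances that is preceded and followed by a small distance.
    """
    loops = []
    start = None        # first far frame of the excursion in progress
    seen_close = False  # an excursion only counts if it starts near the model
    for i, d in enumerate(distances):
        if d <= threshold:
            if start is not None:
                loops.append((start, i - 1))  # distance returned: close the loop
                start = None
            seen_close = True
        elif seen_close and start is None:
            start = i
    return loops


distances = [0.1, 0.2, 0.6, 0.7, 0.2, 0.8, 0.9]
loops = find_loops(distances, threshold=0.5)
```

In this example only frames 2 to 3 form a completed loop. The excursion that starts at frame 5 has not returned to a small distance yet, so its frames are not added to the model.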

The sample selection used by growth events ensures that the online model always follows the latest state of the target and avoids losing the target because the model was not updated in time. The last selection strategy is one of the distinguishing characteristics of TLD and reflects its adaptive nature. It relies on the assumption that when tracking drifts, the tracker gradually adapts to the background instead of suddenly jumping back to the target, so a distance that returns to a small value suggests that the tracker stayed on the target throughout.

The pruning event assumes that each frame contains only one target. When both the tracker and the detector agree on the target location, the remaining detections are considered false samples and are deleted from the online model.
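A rough sketch of this rule follows. It assumes that rectangles are `(x, y, w, h)` tuples, that "agreement" means an intersection-over-union of at least a hypothetical `agree` value, and that each detection can be traced back to the model sample that produced it through a `patch_id`. None of these details come from the article.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def prune(model, tracker_box, detections, agree=0.5):
    """Drop model samples responsible for detections away from the target.

    model: dict mapping patch_id -> image patch (contents are not inspected).
    detections: list of (patch_id, box) pairs reported by the detector.
    """
    agreeing = {pid for pid, box in detections if iou(box, tracker_box) >= agree}
    if not agreeing:
        return dict(model)  # tracker and detector disagree: prune nothing
    rejected = {pid for pid, _ in detections} - agreeing
    return {pid: patch for pid, patch in model.items() if pid not in rejected}


model = {"a": "patch-a", "b": "patch-b", "c": "patch-c"}
tracker_box = (10, 10, 20, 20)
detections = [("a", (12, 10, 20, 20)), ("b", (100, 100, 20, 20))]
pruned = prune(model, tracker_box, detections)
```

Here detection `a` overlaps the tracker result, so the one-target assumption marks detection `b` as a false sample and its entry is removed. Sample `c` produced no detection in this frame and is left alone.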

The samples in the online model provide the training material for TLD learning. In addition, TLD applies two constraints when training and generating the classifier (a random forest): the P constraint and the N constraint. The P constraint specifies that the image block closest to the target image on the tracking trajectory is a positive sample; under the N constraint, all other image blocks are negative samples. Together, the P-N constraints reduce the classifier's error rate, and within a certain range the error rate approaches zero.
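The labeling rule can be illustrated with a short sketch. It assumes that "closest" is measured by the distance between rectangle centers, which is a simplification chosen here for brevity; the function name `label_patches` is hypothetical.

```python
from math import hypot


def center(box):
    """Center point of an (x, y, w, h) rectangle."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def label_patches(trajectory_box, candidate_boxes):
    """Label candidates: +1 for the one closest to the trajectory, -1 otherwise."""
    if not candidate_boxes:
        return []
    tx, ty = center(trajectory_box)

    def dist(box):
        cx, cy = center(box)
        return hypot(cx - tx, cy - ty)

    closest = min(range(len(candidate_boxes)),
                  key=lambda i: dist(candidate_boxes[i]))
    return [(box, 1 if i == closest else -1)
            for i, box in enumerate(candidate_boxes)]


candidates = [(50, 50, 20, 20), (12, 11, 20, 20), (0, 40, 20, 20)]
labels = [label for _, label in label_patches((10, 10, 20, 20), candidates)]
```

The positive label corresponds to the P constraint and the negative labels to the N constraint. A real system would more likely compare appearance or overlap rather than center distance alone.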

Detector

TLD includes a fast and reliable detector that provides the necessary support for the tracker. When the tracker fails, the detector's result is used to correct the output and reinitialize the tracker. The procedure is as follows.

· The tracker and the detector run at the same time on each frame. The tracker predicts one target location, while the detector may detect multiple candidate images;

· When determining the final position of the target, the tracker's result is given priority. If the similarity between the tracked image and the initial target image is greater than a certain threshold, the tracking result is accepted. Otherwise, the detection with the highest similarity to the initial target is selected as the tracking result;

· In the latter case, update the tracker's initial target model: replace the original target model with the selected result, delete the samples in the previous model, and start again with the new sample.
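The procedure above can be sketched as a small decision function. The similarity measure is passed in as a callable because the article does not define it, and the default threshold of 0.7 is a placeholder assumption, not a value from the source.

```python
def fuse(tracked_box, detected_boxes, similarity, threshold=0.7):
    """Choose the final target box for one frame.

    similarity: callable mapping a box to its similarity with the initial target.
    Returns (box, reinit), where reinit is True when the tracker should be
    reinitialized from the chosen detection.
    """
    # The tracker's result has priority when it is similar enough.
    if tracked_box is not None and similarity(tracked_box) > threshold:
        return tracked_box, False
    if not detected_boxes:
        return None, False  # nothing reliable in this frame
    # Otherwise fall back to the most similar detection and reinitialize.
    best = max(detected_boxes, key=similarity)
    return best, True


# Hypothetical similarity scores, looked up from a table for the example.
scores = {(0, 0, 10, 10): 0.9, (5, 5, 10, 10): 0.4, (7, 7, 10, 10): 0.6}
accepted = fuse((0, 0, 10, 10), [(5, 5, 10, 10)], scores.get)
recovered = fuse((5, 5, 10, 10), [(7, 7, 10, 10), (0, 0, 10, 10)], scores.get)
lost = fuse((5, 5, 10, 10), [], scores.get)
```

When `reinit` is true, the caller would replace the tracker's target model with the returned box and discard the earlier samples, as the last step describes.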

The detector is a random forest classifier trained on the samples in the online model. The features it uses, called 2bitBP features, encode the edge direction within a region and are robust to illumination changes. Each feature is quantized into one of four possible codes, and the code is unique for a given region. Features can be computed efficiently at multiple scales by using integral images.
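One plausible way to obtain a four-valued, illumination-tolerant code from a region is to compare the brightness of its left and right halves and of its top and bottom halves, using an integral image for the sums. The sketch below takes that approach. The exact bit layout, and the assumption that region width and height are even, are choices made here rather than details from the article.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def region_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle (x, y, w, h) using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_bit_bp(ii, x, y, w, h):
    """Return a code in 0..3: horizontal comparison bit, then vertical bit."""
    half_w, half_h = w // 2, h // 2  # assumes w and h are even
    left = region_sum(ii, x, y, half_w, h)
    right = region_sum(ii, x + half_w, y, half_w, h)
    top = region_sum(ii, x, y, w, half_h)
    bottom = region_sum(ii, x, y + half_h, w, half_h)
    return (int(left > right) << 1) | int(top > bottom)


img = [[9, 9, 1, 1],
       [9, 9, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]
brighter = [[p + 50 for p in row] for row in img]   # uniform lighting change
mirrored = [row[::-1] for row in img]               # bright corner moves right

code = two_bit_bp(integral_image(img), 0, 0, 4, 4)
code_brighter = two_bit_bp(integral_image(brighter), 0, 0, 4, 4)
code_mirrored = two_bit_bp(integral_image(mirrored), 0, 0, 4, 4)
```

Because only the sign of each comparison is kept, adding the same brightness offset to every pixel leaves the code unchanged, while mirroring the image flips the horizontal bit.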

Each image block is represented by a large number of 2bitBP features, which are divided into groups of equal size. Each group gives a different representation of the image block's appearance. The classifier used for detection is a random forest: the forest is composed of trees, each tree is built from one feature group, and each feature in the tree serves as a decision node.

The random forest is updated and evolves online through growth events and pruning events. At the beginning, each tree is built from the corresponding feature group of the initial target template and has only one "branch". As growth events select positive samples, new "branches" are added to the random forest. Pruning events remove the "branches" that are no longer used. This real-time detector uses a scanning-window strategy: it scans the input frame over positions and scales and applies the classifier to each sub-window to determine whether it contains the target.
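The "branch" idea can be sketched with a deliberately simplified data structure: each tree stores the set of feature-code sequences it has seen for positive samples, and a window is accepted when a majority of trees recognize its sequence. The class names, the set-based representation, and the majority vote are all assumptions made for illustration.

```python
class BranchTree:
    """One tree: a feature group plus the code sequences ("branches") it accepts."""

    def __init__(self, feature_ids):
        self.feature_ids = tuple(feature_ids)
        self.branches = set()

    def path(self, codes):
        # The sequence of codes for this tree's features identifies a branch.
        return tuple(codes[i] for i in self.feature_ids)

    def grow(self, codes):
        self.branches.add(self.path(codes))

    def prune(self, codes):
        self.branches.discard(self.path(codes))

    def accepts(self, codes):
        return self.path(codes) in self.branches


class Forest:
    def __init__(self, feature_groups):
        self.trees = [BranchTree(group) for group in feature_groups]

    def grow(self, codes):
        for tree in self.trees:
            tree.grow(codes)

    def prune(self, codes):
        for tree in self.trees:
            tree.prune(codes)

    def is_target(self, codes):
        votes = sum(tree.accepts(codes) for tree in self.trees)
        return 2 * votes > len(self.trees)  # majority vote (an assumption)


forest = Forest([(0, 1), (2, 3), (1, 2)])
target_codes = [3, 1, 0, 2]   # hypothetical 2bitBP codes of the initial template
other_codes = [3, 1, 2, 2]    # shares only the first group's codes

forest.grow(target_codes)
hit = forest.is_target(target_codes)
miss = forest.is_target(other_codes)
forest.prune(target_codes)
after_prune = forest.is_target(target_codes)
```

In a scanning-window detector, `is_target` would be called with the feature codes of every sub-window at each position and scale.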

TLD technology integrates the tracker, the detector, and the learning process to achieve robust target tracking.

Article from: http://networking.asmag.com.cn/n-50168.shtml (please indicate the source when reprinting)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
