Learning Multi-Domain Convolutional Neural Networks for Visual Tracking: Notes

Last Update:2018-07-24 Source: Internet

Author: User

This paper applies deep learning to visual tracking and, at the time of writing, achieves state-of-the-art results on the Object Tracking Benchmark (OTB) and VOT2015. The method is fairly intuitive.

The authors call it a multi-domain network. In essence, they first take a large number of positive and negative samples from the first frame to train the shared conv1-conv2-conv3-fc4-fc5 layers, while fc6 is trained separately for each input image sequence (for example, Basketball or Bolt in OTB). The K in the paper's network diagram is the number of domains, that is, the number of different categories in OTB. Note that the number of videos in OTB is not equal to the number of categories, because some videos belong to the same class, such as Car4, CarDark and CarScale. The authors simply treat such videos as one domain, so they share the same fc6 layer during training. This is what "domain" means here.
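The shared-layers-plus-K-heads layout described above can be sketched in a few lines of plain Python. This is only a toy stand-in, not the authors' implementation: the class name `MultiDomainNet`, the layer sizes, the single linear map that stands in for conv1 to fc5, and the two-score output order (target, background) are all assumptions made for illustration.

```python
import random


class MultiDomainNet:
    """Toy model: one shared feature extractor, K domain-specific fc6 heads."""

    def __init__(self, num_domains, in_dim=8, feat_dim=4, seed=0):
        rng = random.Random(seed)
        # Shared part (stands in for conv1-conv3, fc4, fc5): one weight matrix.
        self.shared = [
            [rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(feat_dim)
        ]
        # One fc6 head per domain; each head produces 2 scores.
        # Assumed order of the scores: (target, background).
        self.fc6 = [
            [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(2)]
            for _ in range(num_domains)
        ]

    def features(self, x):
        # Linear map followed by ReLU; identical for every domain.
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def forward(self, x, domain):
        # Only the fc6 head of the requested domain is used.
        feat = self.features(x)
        return [sum(w * f for w, f in zip(row, feat)) for row in self.fc6[domain]]


net = MultiDomainNet(num_domains=3)
x = [0.5] * 8
scores_domain0 = net.forward(x, domain=0)
scores_domain1 = net.forward(x, domain=1)
```

The same input goes through the same shared weights for every domain, and only the choice of fc6 head changes the output.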

Now for the specifics of what the authors do.

First, the authors collect a large number of positive and negative samples from the first frame of every image sequence in the dataset to train the network described above. This is an iterative process, because some "good" negative samples have to be selected:

This selection is called hard negative mining. Intuitively, it tries to fill each mini-batch with negatives that lie near the positive samples (which, after all, are the most useful for learning the distinction). Each batch is resized to 107×107 and fed into the network, and training iterates until the network converges. Note that each domain has its own fc6 layer, while the earlier layers are trained jointly across domains. This is where the name multi-domain network comes from.
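As a rough illustration of hard negative mining, the sketch below ranks negative samples by how target-like the current model finds them and keeps the top few for the next mini-batch. It assumes that "hard" means "highest target score under the current model"; the function name `mine_hard_negatives`, the `target_score` field and the toy data are hypothetical.

```python
def mine_hard_negatives(negatives, score_fn, batch_size):
    """Return the batch_size negatives that the current model scores as most target-like."""
    ranked = sorted(negatives, key=score_fn, reverse=True)
    return ranked[:batch_size]


def toy_score(sample):
    # Hypothetical scorer: in a real tracker this would be the network's target score.
    return sample["target_score"]


negatives = [
    {"id": 0, "target_score": 0.05},  # easy negative, clearly background
    {"id": 1, "target_score": 0.80},  # hard negative, looks like the target
    {"id": 2, "target_score": 0.40},
    {"id": 3, "target_score": 0.65},
]

hard = mine_hard_negatives(negatives, toy_score, batch_size=2)
hard_ids = [s["id"] for s in hard]
```

In training, this selection would be repeated at every iteration, since the set of hard negatives changes as the network improves.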

The second step is to use the trained model for tracking. Because the network itself is not very complex, the authors also fine-tune the network on the first frame during tracking, rather than treating it as a fixed generic network and updating only the classifier. During tracking, the authors also use two update strategies for different situations: long-term updates and short-term updates. The model is updated immediately when the target is classified as background. In addition, the positive samples taken at the predicted position in a frame may differ considerably from the real ground truth, that is, they may not match it well. To address this, the authors borrow the bounding box regression technique from object detection to bring the predicted bounding box closer to the ground truth.
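The two update strategies can be pictured as a small decision rule applied at every frame. The sketch below is an assumption-laden illustration, not the authors' code: the function name `choose_update`, the fixed interval of 10 frames for long-term updates and the 0.5 score threshold for "classified as background" are placeholder choices.

```python
def choose_update(frame_idx, target_score, long_interval=10, score_threshold=0.5):
    """Decide which model update, if any, to run after a frame.

    target_score is the model's target score for the best candidate.
    long_interval and score_threshold are illustrative defaults.
    """
    if target_score < score_threshold:
        # The best candidate looks like background: update right away.
        return "short_term"
    if frame_idx % long_interval == 0:
        # Regular, scheduled update.
        return "long_term"
    return "none"


decisions = [
    choose_update(frame_idx=7, target_score=0.2),
    choose_update(frame_idx=20, target_score=0.9),
    choose_update(frame_idx=7, target_score=0.9),
]
```

Checking the failure case first means a tracking failure triggers an immediate update even on a frame where a scheduled update would also be due.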

The third step is prediction.

For each frame, the authors generate 256 candidates at different positions and scales, then pick the best one as the target's position in that frame.
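A minimal sketch of this prediction step, under stated assumptions: candidates are drawn around the previous box with Gaussian noise on position and scale, and the highest-scoring candidate wins. The `(cx, cy, w, h)` box format, the noise parameters `trans_std` and `scale_std`, and the distance-based `toy_score` are all hypothetical; in a real tracker the score would come from the network.

```python
import math
import random


def sample_candidates(prev_box, n=256, trans_std=0.3, scale_std=0.1, seed=0):
    """Draw n candidate boxes (cx, cy, w, h) around the previous target box."""
    rng = random.Random(seed)
    cx, cy, w, h = prev_box
    mean_size = (w + h) / 2.0
    candidates = []
    for _ in range(n):
        # exp() of a Gaussian keeps the scale factor strictly positive.
        scale = math.exp(rng.gauss(0.0, scale_std))
        candidates.append((
            cx + rng.gauss(0.0, trans_std * mean_size),
            cy + rng.gauss(0.0, trans_std * mean_size),
            w * scale,
            h * scale,
        ))
    return candidates


def pick_best(candidates, score_fn):
    """Return the candidate with the highest score."""
    return max(candidates, key=score_fn)


def toy_score(box, true_center=(52.0, 48.0)):
    # Hypothetical scorer: the closer the box center is to true_center, the higher the score.
    return -math.hypot(box[0] - true_center[0], box[1] - true_center[1])


candidates = sample_candidates((50.0, 50.0, 20.0, 40.0))
best = pick_best(candidates, toy_score)
```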

Fourth, the experiments.

The first experiment is on OTB, which has 100 videos. The authors trained on videos from VOT2013, VOT2014 and VOT2015, so the method does need a lot of training data. The results on OTB50 and OTB100 are as follows:

It is very high indeed.

It is worth mentioning that the authors use two techniques: hard negative mining and bounding box regression. The former has a large effect on the precision (DP) curve, and the latter has a large effect on the overlap precision (OP). This can be seen in the following comparative experiments:

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
