Minimalist Notes: Cross-Stitch Networks for Multi-Task Learning


Paper: https://arxiv.org/abs/1604.03539

This paper studies how sharing network weights at different depths affects multi-task learning, and on that basis proposes cross-stitch units to automatically learn the optimal sharing structure.

First, building on AlexNet, the paper branches into task-specific sub-networks at different depths and measures the performance of each task. Two pairs of visually related tasks are used for multi-task learning: <attribute classification, detection> and <surface-normal prediction, semantic segmentation>. The results reported in the paper show that for the <attribute classification, detection> pair, no choice of branch point improves both tasks at the same time, suggesting that these tasks are inherently in conflict and unsuitable for joint training. The <surface-normal prediction, semantic segmentation> pair, by contrast, shows simultaneous gains when branching at intermediate layers, indicating that the two tasks are strongly correlated, and also that the choice of branch point has a large effect on final performance. A sketch of this branching setup is given below.
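
As a concrete illustration of the branching experiment, here is a minimal PyTorch sketch (not the paper's code; module and argument names are illustrative) of splitting a backbone into two task-specific copies at a chosen depth:

```python
import copy
import torch.nn as nn

class SplitNet(nn.Module):
    """Share the first `split_at` backbone blocks, then branch into
    task-specific copies of the remaining blocks plus a head per task."""

    def __init__(self, blocks, split_at, head_a, head_b):
        super().__init__()
        # `blocks` is a list of nn.Module stages (e.g. AlexNet conv/fc blocks).
        self.shared = nn.Sequential(*blocks[:split_at])
        # Each task gets its own copy of the layers above the split point.
        self.branch_a = nn.Sequential(*copy.deepcopy(blocks[split_at:]), head_a)
        self.branch_b = nn.Sequential(*copy.deepcopy(blocks[split_at:]), head_b)

    def forward(self, x):
        z = self.shared(x)
        return self.branch_a(z), self.branch_b(z)
```

Sweeping `split_at` over the backbone's stages reproduces the paper's comparison of sharing depths.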

The paper then proposes the cross-stitch unit. The idea is simple: run two networks of identical structure, one per task, and at each corresponding layer form learnable linear combinations of the two feature maps, which are then fed into the next layer. (However, this linear combination makes training unstable for the <attribute classification, detection> pair.)
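
A minimal sketch of such a unit, assuming PyTorch and a single 2×2 mixing matrix per unit applied elementwise (per-channel weights are a straightforward extension):

```python
import torch
import torch.nn as nn

class CrossStitchUnit(nn.Module):
    """Linearly mixes same-shape activations from two task networks."""

    def __init__(self, alpha_same=0.9, alpha_diff=0.1):
        super().__init__()
        # Initialized as a convex combination: each row is positive
        # and sums to 1 (the default values here are illustrative).
        self.alpha = nn.Parameter(torch.tensor([[alpha_same, alpha_diff],
                                                [alpha_diff, alpha_same]]))

    def forward(self, x_a, x_b):
        # x_a, x_b: activations of the same shape from networks A and B.
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b
```

In the paper, such units are inserted at several depths of the two AlexNet copies (e.g. after pooling layers), so the amount of sharing at each depth is learned rather than hand-picked.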

Next comes the ablation analysis. The paper points out that the cross-stitch weights are initialized to be positive and sum to 1 (keeping the combination convex), although training itself does not enforce this constraint; the units are given a higher learning rate than the backbone network, which helps accelerate convergence. In addition, training two task-specific networks first and then fine-tuning with cross-stitch units added outperforms direct multi-task training. The paper also experiments with different initializations of the cross-stitch weights and reports the final learned values; the results show that the initialization has a large effect, and within the tested range, the larger the $\alpha_S : \alpha_D$ ratio (same-task weight to different-task weight), the better the performance.
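
For the learning-rate detail, a sketch of per-parameter-group rates in PyTorch (the values are illustrative, not the paper's exact settings):

```python
import torch
import torch.nn as nn

# Placeholders standing in for the real modules.
backbone = nn.Conv2d(3, 96, 11)  # stands in for the shared trunk
stitch_weights = nn.Parameter(torch.tensor([[0.9, 0.1],
                                            [0.1, 0.9]]))

optimizer = torch.optim.SGD(
    [
        {"params": backbone.parameters(), "lr": 1e-3},
        # The cross-stitch weights get a larger learning rate than the
        # backbone, which the paper reports speeds up convergence
        # (the exact multiplier is a hyperparameter).
        {"params": [stitch_weights], "lr": 1e-1},
    ],
    momentum=0.9,
)
```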

The paper also uses this structure to address the problem of scarce training data, and the experiments show that the multi-task setup improves classification performance in the low-data regime.
