arXiv:2103.01542 [cs.CV]

TransTailor: Pruning the Pre-trained Model for Improved Transfer Learning

Bingyan Liu, Yifeng Cai, Yao Guo, Xiangqun Chen

Published 2021-03-02, Version 1

The growing number of pre-trained models has significantly improved performance on limited-data tasks through transfer learning. However, progress in transfer learning has mainly focused on optimizing the weights of pre-trained models, ignoring the structural mismatch between the model and the target task. This paper aims to improve transfer performance from another angle: in addition to tuning the weights, we tune the structure of pre-trained models to better match the target task. To this end, we propose TransTailor, which prunes the pre-trained model for improved transfer learning. Unlike traditional pruning pipelines, we prune and fine-tune the pre-trained model according to target-aware weight importance, generating an optimal sub-model tailored to a specific target task. In this way, we transfer a more suitable sub-structure that can be applied during fine-tuning to benefit the final performance. Extensive experiments on multiple pre-trained models and datasets demonstrate that TransTailor outperforms traditional pruning methods and achieves competitive or even better performance than other state-of-the-art transfer learning methods while using a smaller model. Notably, on the Stanford Dogs dataset, TransTailor achieves a 2.7% accuracy improvement over other transfer methods with 20% fewer FLOPs.

Comments: This paper has been accepted by AAAI 2021

Categories: cs.CV, cs.LG

Keywords: pre-trained model, TransTailor, performance, target-aware weight importance, specific target task

Related articles:

arXiv:1803.05268 [cs.CV] (Published 2018-03-14)

Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

David Mascharka, Philip Tran, Ryan Soklaski, Arjun Majumdar

arXiv:1906.10886 [cs.CV] (Published 2019-06-26)

Joint Multi-frame Detection and Segmentation for Multi-cell Tracking

Zibin Zhou, Fei Wang, Wenjuan Xi, Huaying Chen, Peng Gao, Chengkang He

arXiv:1912.01237 [cs.CV] (Published 2019-12-03)

EDAS: Efficient and Differentiable Architecture Search