arXiv:2105.03343 [cs.LG]

Adapting by Pruning: A Case Study on BERT

Yang Gao, Nicolo Colombo, Wei Wang

Published 2021-05-07 (Version 1)

Adapting pre-trained neural models to downstream tasks has become the standard practice for obtaining high-quality models. In this work, we propose a novel model adaptation paradigm, adapting by pruning, which prunes neural connections in the pre-trained model to optimise the performance on the target task; all remaining connections keep their pre-trained weights intact. We formulate adapting-by-pruning as an optimisation problem with a differentiable loss and propose an efficient algorithm to prune the model. We prove that the algorithm is near-optimal under standard assumptions and apply the algorithm to adapt BERT to some GLUE tasks. Results suggest that our method can prune up to 50% of the weights in BERT while yielding performance comparable to that of the fully fine-tuned model. We also compare our method with other state-of-the-art pruning methods and study the topological differences of their obtained sub-networks.
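The abstract does not spell out the authors' algorithm, but the general idea (train a differentiable pruning mask while the pre-trained weights stay frozen) can be illustrated with a minimal PyTorch sketch. Everything here is an assumption for illustration: the class name PrunedLinear, the per-weight score parameters, and the straight-through sigmoid relaxation are one common way to make a binary mask trainable, not necessarily the paper's method.

```python
import torch
import torch.nn as nn

class PrunedLinear(nn.Module):
    """Linear layer whose pre-trained weights are frozen; only a
    real-valued score per weight is trained. The sign of each score
    decides whether the corresponding connection is kept or pruned.
    (Illustrative sketch, not the paper's exact algorithm.)"""

    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        # Freeze the pre-trained weights: they are never updated.
        self.weight = nn.Parameter(pretrained.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = (nn.Parameter(pretrained.bias.detach().clone(),
                                  requires_grad=False)
                     if pretrained.bias is not None else None)
        # Trainable scores; score > 0 means keep, score <= 0 means prune.
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)

    def forward(self, x):
        # Hard binary mask in the forward pass ...
        hard = (self.scores > 0).float()
        soft = torch.sigmoid(self.scores)
        # ... with a straight-through estimator: the forward value
        # equals `hard`, but gradients flow through the sigmoid.
        mask = hard + soft - soft.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)
```

Training such a module on the downstream task updates only the scores (and thus the sub-network topology), which matches the abstract's description that all remaining connections keep their original weights.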
