arXiv Analytics

arXiv:1902.05967 [cs.LG]

Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

Hesham Mostafa, Xin Wang

Published 2019-02-15, Version 1

Deep neural networks are typically highly over-parameterized: pruning techniques can remove a significant fraction of network parameters with little loss in accuracy. Recently, techniques based on dynamic re-allocation of non-zero parameters have emerged for training sparse networks directly, without having to train a large dense model beforehand. We present a parameter re-allocation scheme that addresses the limitations of previous methods, such as their high computational cost and the fixed number of parameters they allocate to each layer. We investigate the performance of these dynamic re-allocation methods in deep convolutional networks and show that our method outperforms previous static and dynamic parameterization methods, yielding the best accuracy for a given number of training parameters and performing on par with networks obtained by iteratively pruning a trained dense model. We further investigate the mechanisms underlying the superior performance of the resulting sparse networks. We find that neither the structure nor the initialization of the sparse networks discovered by our parameter re-allocation scheme is sufficient to explain their superior generalization performance. Rather, it is the continuous exploration of different sparse network structures during training that is critical to effective learning. We show that it is more fruitful to explore these structural degrees of freedom than to add extra parameters to the network.
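The re-allocation idea described in the abstract can be sketched as a periodic prune-then-regrow step applied to each layer's weight mask. The function below is a hypothetical simplification for illustration only, not the paper's exact algorithm: it prunes active weights whose magnitude falls below a global threshold, then redistributes the freed parameter budget across layers in proportion to each layer's surviving non-zero count, growing new connections at randomly chosen inactive positions.

```python
import numpy as np

def sparse_step(weights, masks, threshold, rng):
    """One hypothetical prune-and-regrow step over a list of flat
    weight arrays and matching 0/1 masks. Returns the pruned count."""
    # 1) Prune: deactivate weights whose magnitude fell below the
    #    global threshold (the paper tunes this threshold adaptively).
    pruned = 0
    survivors = []
    for w, m in zip(weights, masks):
        small = (np.abs(w) < threshold) & (m == 1)
        pruned += int(small.sum())
        m[small] = 0
        w[small] = 0.0
        survivors.append(int(m.sum()))

    # 2) Regrow: hand the freed budget back, giving each layer a share
    #    proportional to its surviving non-zero count (a hypothetical
    #    stand-in for the paper's growth heuristic).
    total = sum(survivors) or 1
    for w, m, s in zip(weights, masks, survivors):
        grow = int(round(pruned * s / total))
        free = np.flatnonzero(m == 0)
        pick = rng.choice(free, size=min(grow, free.size), replace=False)
        np.put(m, pick, 1)    # activate new positions...
        np.put(w, pick, 0.0)  # ...initialized to zero
    return pruned
```

In a training loop this step would run every few hundred iterations between ordinary gradient updates, with gradients masked so that only active (mask = 1) positions are updated.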

Related articles:
arXiv:1901.08624 [cs.LG] (Published 2019-01-24)
AutoShuffleNet: Learning Permutation Matrices via an Exact Lipschitz Continuous Penalty in Deep Convolutional Neural Networks
arXiv:1809.09399 [cs.LG] (Published 2018-09-25)
Non-Iterative Knowledge Fusion in Deep Convolutional Neural Networks
arXiv:1809.05606 [cs.LG] (Published 2018-09-14)
Non-iterative recomputation of dense layers for performance improvement of DCNN