arXiv Analytics

arXiv:1904.03392 [cs.LG]

Effective and Efficient Dropout for Deep Convolutional Neural Networks

Shaofeng Cai, Jinyang Gao, Meihui Zhang, Wei Wang, Gang Chen, Beng Chin Ooi

Published 2019-04-06, Version 1

Machine-learning-based, data-driven applications have become ubiquitous, for example in health-care analysis and database system optimization. Big training data and large (deep) models are crucial for good performance. Dropout has been widely used as an efficient regularization technique to prevent large models from overfitting. However, many recent works show that dropout does not bring much performance improvement for deep convolutional neural networks (CNNs), a popular deep learning model for data-driven applications. In this paper, we formulate existing dropout methods for CNNs under the same analysis framework to investigate the failures. We attribute the failure to the conflicts between the dropout and the batch normalization operation after it. Consequently, we propose to change the order of the operations, which results in new building blocks of CNNs. Extensive experiments on the benchmark datasets CIFAR, SVHN and ImageNet have been conducted to compare the existing building blocks and our new building blocks with different dropout methods. The results confirm the superiority of our proposed building blocks, which we attribute to the regularization and implicit model ensemble effect of dropout. In particular, we improve over state-of-the-art CNNs, achieving error rates of 3.17%, 16.15%, 1.44% and 21.46% on CIFAR-10, CIFAR-100, SVHN and ImageNet respectively.
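The abstract does not spell out how dropout and a following batch normalization layer conflict. One commonly described source of such a conflict is a variance mismatch: batch normalization collects its statistics while dropout is active, but at inference dropout is switched off. The sketch below illustrates only that assumed mechanism and is not the paper's analysis or code. The function names, the unit-Gaussian activations and the drop rate of 0.5 are all hypothetical choices made for illustration.

```python
import random
import statistics


def inverted_dropout(xs, drop_rate, rng):
    """Zero each activation with probability drop_rate and rescale survivors by 1/keep."""
    keep = 1.0 - drop_rate
    return [x / keep if rng.random() < keep else 0.0 for x in xs]


def variance_train_vs_test(drop_rate=0.5, n=100_000, seed=0):
    """Compare the activation variance a following normalization layer would see."""
    rng = random.Random(seed)
    # Hypothetical activations feeding the dropout layer: zero mean, unit variance.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Training: dropout is active, so the normalization layer sees the masked values.
    train_var = statistics.pvariance(inverted_dropout(xs, drop_rate, rng))
    # Inference: dropout is the identity, so the layer sees the raw values.
    test_var = statistics.pvariance(xs)
    return train_var, test_var


train_var, test_var = variance_train_vs_test()
print(f"variance with dropout active (training): {train_var:.2f}")
print(f"variance with dropout off (inference):   {test_var:.2f}")
```

For zero-mean inputs, inverted dropout inflates the variance by a factor of about 1/(1 - drop_rate), so statistics accumulated during training no longer describe the inputs seen at inference. Under this assumption, moving dropout so that no normalization layer directly consumes its output avoids the mismatch.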

Related articles:
arXiv:1908.11694 [cs.LG] (Published 2019-08-29)
Estimation of Body Mass Index from Photographs using Deep Convolutional Neural Networks
arXiv:1812.07390 [cs.LG] (Published 2018-12-16)
Distill-Net: Application-Specific Distillation of Deep Convolutional Neural Networks for Resource-Constrained IoT Platforms
arXiv:1603.04833 [cs.LG] (Published 2016-03-15)
Ensemble of Deep Convolutional Neural Networks for Learning to Detect Retinal Vessels in Fundus Images