arXiv:2010.07359 [cs.LG]

Effects of the Nonlinearity in Activation Functions on the Performance of Deep Learning Models

Nalinda Kulathunga, Nishath Rajiv Ranasinghe, Daniel Vrinceanu, Zackary Kinsman, Lei Huang, Yunjiao Wang

Published 2020-10-14 (Version 1)

The nonlinearity of the activation functions used in deep learning models is crucial to the success of predictive models. Several simple nonlinear functions are in common use, including the Rectified Linear Unit (ReLU) and the Leaky-ReLU (L-ReLU). In practice, these functions markedly improve model accuracy. However, there is limited insight into why models with certain nonlinear activation functions perform better than others. Here, we investigate model performance when ReLU or L-ReLU is used as the activation function in different model architectures and data domains. Interestingly, we found that L-ReLU is effective mostly when the number of trainable parameters in a model is relatively small. Furthermore, we found that image classification models seem to perform well with L-ReLU in the fully connected layers, especially when pre-trained models such as VGG-16 are used for transfer learning.
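
To make the ReLU/L-ReLU comparison and the VGG-16 transfer-learning setup concrete, the following is a minimal sketch assuming TensorFlow/Keras; the leak slope (0.01), layer widths, and 10-class output head are illustrative assumptions, not values reported in the paper.

import numpy as np
import tensorflow as tf

def relu(x):
    # ReLU: identity for positive inputs, zero for negative inputs.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # L-ReLU: small linear slope alpha on the negative side instead of zero.
    return np.where(x > 0.0, x, alpha * x)

# Transfer learning: frozen VGG-16 convolutional base with L-ReLU in the
# fully connected head (head sizes below are illustrative assumptions).
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256),
    tf.keras.layers.LeakyReLU(0.01),  # L-ReLU activation in the FC layer
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])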

Related articles:
arXiv:2203.11196 [cs.LG] (Published 2022-03-18)
Performance of Deep Learning models with transfer learning for multiple-step-ahead forecasts in monthly time series
arXiv:2011.06796 [cs.LG] (Published 2020-11-13)
Wisdom of the Ensemble: Improving Consistency of Deep Learning Models
Lijing Wang et al.
arXiv:2202.07201 [cs.LG] (Published 2022-02-15)
Holistic Adversarial Robustness of Deep Learning Models