arXiv:1711.02213 [cs.LG]

Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks

Urs Köster, Tristan Webb, Xin Wang, Marcel Nassar, Arjun Bansal, William Constable, Oguz Elibol, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby Pai, Naveen Rao

Published 2017-11-06 (Version 1)

Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited-precision inference in recent years, training neural networks at low bit-widths remains a challenging problem. Here we present the Flexpoint data format, aimed at a complete replacement of the 32-bit floating point format for training and inference, and designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize the available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network, and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
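The core mechanism described in the abstract, a tensor of integer mantissas that share one exponent chosen so the largest value stays just inside the representable range, can be illustrated with a short NumPy sketch. The function names and the simple max-based exponent choice below are illustrative assumptions, not the paper's implementation: the actual format (flex16+5) is hardware-oriented, and its Autoflex algorithm predicts the shared exponent from the history of per-tensor statistics rather than reading it off the current tensor.

```python
import numpy as np

def flexpoint_quantize(tensor, mantissa_bits=16):
    """Quantize a float32 tensor to a Flexpoint-style representation:
    signed integer mantissas that all share a single per-tensor exponent.

    Illustrative sketch only; the exponent here is derived from the
    current tensor's maximum, whereas the paper's Autoflex scheme
    predicts it ahead of time to avoid overflows during training.
    """
    max_abs = float(np.max(np.abs(tensor)))
    if max_abs == 0.0:
        return np.zeros(tensor.shape, dtype=np.int16), 0

    # Shared exponent: chosen so the largest magnitude fits in a signed
    # (mantissa_bits)-bit integer without overflowing.
    exponent = int(np.ceil(np.log2(max_abs))) - (mantissa_bits - 1)
    scale = 2.0 ** (-exponent)

    mantissas = np.clip(np.round(tensor * scale),
                        -(2 ** (mantissa_bits - 1)),
                        2 ** (mantissa_bits - 1) - 1).astype(np.int16)
    return mantissas, exponent

def flexpoint_dequantize(mantissas, exponent):
    """Recover an approximate float32 tensor from mantissas and the shared exponent."""
    return mantissas.astype(np.float32) * np.float32(2.0 ** exponent)

# Example: round-trip a random weight tensor and measure the error.
w = np.random.randn(4, 4).astype(np.float32)
m, e = flexpoint_quantize(w)
w_hat = flexpoint_dequantize(m, e)
print("shared exponent:", e, "max abs error:", np.max(np.abs(w - w_hat)))
```

In this sketch the exponent is recomputed from the tensor it quantizes, so a value that grows between iterations could still overflow; managing the exponent predictively, as the paper's Autoflex algorithm does, is what makes the shared-exponent format practical for training.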

Comments: 14 pages, 5 figures, accepted at Neural Information Processing Systems (NIPS) 2017
Categories: cs.LG, cs.NA, stat.ML
Related articles:
arXiv:1705.03341 [cs.LG] (Published 2017-05-09)
Stable Architectures for Deep Neural Networks
arXiv:1611.05162 [cs.LG] (Published 2016-11-16)
Net-Trim: A Layer-wise Convex Pruning of Deep Neural Networks
arXiv:1710.10570 [cs.LG] (Published 2017-10-29)
Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics