arXiv:2205.09459 [cs.LG]

Neural Network Architecture Beyond Width and Depth

Zuowei Shen, Haizhao Yang, Shijun Zhang

Published 2022-05-19 (Version 1)

This paper proposes a new neural network architecture by introducing an additional dimension, called height, beyond width and depth. Neural network architectures with height, width, and depth as hyperparameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than those with two-dimensional architectures (i.e., with only width and depth as hyperparameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence a network with the new architecture is called a nested network (NestNet). A NestNet of height $s$ is built with each hidden neuron activated by a NestNet of height $\le s-1$; when $s=1$, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-$s$ ReLU NestNets with $\mathcal{O}(n)$ parameters can approximate Lipschitz continuous functions on $[0,1]^d$ with an error $\mathcal{O}(n^{-(s+1)/d})$, while the optimal approximation error of standard ReLU networks with $\mathcal{O}(n)$ parameters is $\mathcal{O}(n^{-2/d})$. This result is further extended to generic continuous functions on $[0,1]^d$, with the approximation error characterized by the modulus of continuity. Finally, a numerical example is provided to explore the advantages of the super approximation power of ReLU NestNets.
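The recursive construction described in the abstract can be sketched in a few lines of code. Below is a minimal, illustrative PyTorch sketch, not the authors' implementation: the class and parameter names (`NestNet`, `height`, `width`, `depth`) are hypothetical, and it shares one scalar sub-network per hidden layer purely for brevity. A height-1 NestNet is an ordinary ReLU network; for height $s>1$, each hidden layer's activation is itself a scalar NestNet of height $s-1$ applied elementwise.

```python
import torch
import torch.nn as nn

class NestNet(nn.Module):
    """Illustrative sketch of a nested network (NestNet) of a given height.

    Height 1 is a standard ReLU network. For height s > 1, each hidden
    activation is a small NestNet of height s-1 mapping R -> R, applied
    elementwise, following the recursive construction in the abstract.
    """

    def __init__(self, height, in_dim, width, depth, out_dim):
        super().__init__()
        self.height = height
        dims = [in_dim] + [width] * depth + [out_dim]
        self.linears = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)
        )
        if height > 1:
            # One scalar-to-scalar NestNet of height s-1 per hidden layer,
            # shared across neurons here for simplicity; the paper's
            # construction allows per-neuron activation sub-networks.
            self.activations = nn.ModuleList(
                NestNet(height - 1, 1, width, depth, 1) for _ in range(depth)
            )

    def forward(self, x):
        for i, linear in enumerate(self.linears[:-1]):
            x = linear(x)
            if self.height == 1:
                x = torch.relu(x)  # base case: plain ReLU activation
            else:
                # Apply the height-(s-1) sub-network elementwise.
                shape = x.shape
                x = self.activations[i](x.reshape(-1, 1)).reshape(shape)
        return self.linears[-1](x)

# Example: a height-2 NestNet on [0,1]^d with d = 3.
net = NestNet(height=2, in_dim=3, width=8, depth=2, out_dim=1)
y = net(torch.rand(16, 3))  # batch of 16 points in [0,1]^3
```

Under the abstract's rate comparison, a height-2 network of this kind with $\mathcal{O}(n)$ parameters attains approximation error $\mathcal{O}(n^{-3/d})$ for Lipschitz functions on $[0,1]^d$, versus $\mathcal{O}(n^{-2/d})$ for the standard ($s=1$) ReLU case.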

Related articles:
arXiv:1905.08300 [cs.LG] (Published 2019-05-20)
A Neural Network Architecture for Learning Word-Referent Associations in Multiple Contexts
arXiv:1909.03306 [cs.LG] (Published 2019-09-07)
A greedy constructive algorithm for the optimization of neural network architectures
arXiv:2006.02250 [cs.LG] (Published 2020-06-03)
dynoNet: a neural network architecture for learning dynamical systems