Neural Comput. 2019 Mar;31(3):538-554. doi: 10.1162/neco_a_01165. Epub 2019 Jan 15.

State-Space Representations of Deep Neural Networks.

Author information

1. Department of Mechanical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A. mikebenh@gmail.com.
2. Department of Mechanical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A. sug375@psu.edu.
3. Department of Electrical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A. sys5880@psu.edu.
4. Department of Mechanical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A. axr2@psu.edu.

Abstract

This letter deals with neural networks as dynamical systems governed by finite difference equations. It shows that the introduction of k-many skip connections into network architectures, such as residual networks and additive dense networks, defines kth order dynamical equations on the layer-wise transformations. Closed-form solutions for the state-space representations of general kth order additive dense networks (where the concatenation operation is replaced by addition), as well as kth order smooth networks, are found. The developed framework endows deep neural networks with an algebraic structure. Furthermore, it is shown that imposing kth order smoothness on network architectures with d-many nodes per layer increases the state-space dimension by a multiple of k, so that the effective embedding dimension of the data manifold becomes k·d. It follows that network architectures of these types reduce the number of parameters needed to maintain the same embedding dimension by a factor of k² when compared to an equivalent first-order residual network. Numerical simulations and experiments on CIFAR10, SVHN, and MNIST have been conducted to help understand the developed theory and the efficacy of the proposed concepts.
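As a concrete illustration of the dynamical-systems view in the abstract, the following minimal sketch (not the authors' code; the width d, the toy tanh layer map, and the exact additive form of the skip connections are illustrative assumptions) treats the layer index as discrete time: a residual block is a first-order difference equation, k skip connections give a kth-order equation, and stacking the last k activations recovers a first-order state-space form of dimension k·d.

```python
import numpy as np

# Hedged sketch: layer index plays the role of discrete time, so skip
# connections define finite difference equations on the activations.
rng = np.random.default_rng(0)
d = 4                                  # nodes per layer (assumed)
k = 2                                  # order of the skip connections
W = rng.standard_normal((d, d)) * 0.1  # toy shared weight matrix

def f(x):
    """Toy layer transformation (stand-in for an arbitrary layer map)."""
    return np.tanh(W @ x)

def residual_step(x):
    # Residual block as a first-order difference equation:
    # x_{t+1} = x_t + f(x_t).
    return x + f(x)

def kth_order_step(history):
    # One additive variant with k skip connections (an assumption on the
    # exact form): x_{t+1} = f(x_t) + x_t + x_{t-1} + ... + x_{t-k+1},
    # i.e. a kth-order difference equation in the layer activations.
    return f(history[-1]) + sum(history[-k:])

def to_state(history):
    # First-order (state-space) form: stacking the last k activations
    # gives a state vector of dimension k*d, matching the k·d effective
    # embedding dimension in the abstract.
    return np.concatenate(history[-k:])

history = [rng.standard_normal(d) for _ in range(k)]
for _ in range(3):
    history.append(kth_order_step(history))
print(to_state(history).shape)  # (k*d,) = (8,)
```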

PMID: 30645180
DOI: 10.1162/neco_a_01165
[Indexed for MEDLINE]
