Stable tensor neural networks for efficient deep learning

Elizabeth Newman*, Lior Horesh, Haim Avron, Misha E. Kilmer

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Learning from complex, multidimensional data has become central to computational mathematics, and among the most successful high-dimensional function approximators are deep neural networks (DNNs). Training DNNs is posed as an optimization problem to learn network weights or parameters that well-approximate a mapping from input to target data. Multiway data or tensors arise naturally in myriad ways in deep learning, in particular as input data and as high-dimensional weights and features extracted by the network, with the latter often being a bottleneck in terms of speed and memory. In this work, we leverage tensor representations and processing to efficiently parameterize DNNs when learning from high-dimensional data. We propose tensor neural networks (t-NNs), a natural extension of traditional fully-connected networks, that can be trained efficiently in a reduced, yet more powerful parameter space. Our t-NNs are built upon matrix-mimetic tensor-tensor products, which retain algebraic properties of matrix multiplication while capturing high-dimensional correlations. Mimeticity enables t-NNs to inherit desirable properties of modern DNN architectures. We exemplify this by extending recent work on stable neural networks, which interpret DNNs as discretizations of differential equations, to our multidimensional framework. We provide empirical evidence of the parametric advantages of t-NNs on dimensionality reduction using autoencoders and classification using fully-connected and stable variants on benchmark imaging datasets MNIST and CIFAR-10.
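As an illustration of the matrix-mimetic tensor-tensor product mentioned in the abstract, the following is a minimal NumPy sketch of one common instance, the t-product, computed with an FFT along the third axis. It is not the authors' implementation. The function names `t_product` and `t_layer`, the `tanh` activation, the bias shape, and the step size `h` are all assumptions made for this example; the residual branch only mimics the forward-Euler form that stable networks based on differential equations typically use.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (m x p x n) with B (p x q x n), returning an m x q x n tensor.

    Assumption: the tensor-tensor product is the FFT-based t-product, i.e.
    circular convolution of frontal slices along the third axis.
    """
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible shapes for the t-product")
    # Move to the transform domain along the third axis.
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Multiply matching frontal slices as ordinary matrices.
    C_hat = np.einsum("ipk,pjk->ijk", A_hat, B_hat)
    # Real inputs give a real result; discard round-off imaginary parts.
    return np.real(np.fft.ifft(C_hat, axis=2))


def t_layer(W, Y, bias, h=None):
    """One hypothetical tensor layer.

    With h=None this is a plain layer, tanh(W * Y + bias).
    With a step size h it is a residual update, Y + h * tanh(W * Y + bias),
    which requires W to be square in its first two dimensions.
    """
    Z = np.tanh(t_product(W, Y) + bias)
    return Z if h is None else Y + h * Z
```

In this sketch a weight tensor of size m x p x n acts on features stored as p x q x n tensors, so the layer mixes information along the third dimension through the circular convolution rather than through a single large dense matrix.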

Original language: English
Article number: 1363978
Journal: Frontiers in Big Data
Volume: 7
State: Published - 2024

Funding

Funders                                        Funder number
International Business Machines Corporation
National Science Foundation                    DMS-2309751

    Keywords

    • deep learning
    • image classification
    • inverse problems
    • machine learning
    • tensor algebra
