Implicit regularization in deep matrix factorization

Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo

Research output: Contribution to journal, Conference article, peer-review

251 Scopus citations

Abstract

Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low “complexity.” We study the implicit regularization of gradient descent over deep linear neural networks for matrix completion and sensing, a model referred to as deep matrix factorization. Our first finding, supported by theory and experiments, is that adding depth to a matrix factorization enhances an implicit tendency towards low-rank solutions, oftentimes leading to more accurate recovery. Secondly, we present theoretical and empirical arguments questioning a nascent view by which implicit regularization in matrix factorization can be captured using simple mathematical norms. Our results point to the possibility that the language of standard regularizers may not be rich enough to fully encompass the implicit regularization brought forth by gradient-based optimization.
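As a rough illustration of the setting the abstract describes, the following sketch runs gradient descent on a depth-N product of square matrices, fitting only the observed entries of a low-rank target (matrix completion). It is a minimal sketch and not the authors' code: the function names, matrix size, observation fraction, initialization scale, step size, and iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def product(factors):
    """End-to-end matrix W = W_N @ ... @ W_1, where factors[0] is W_1."""
    W = factors[0]
    for F in factors[1:]:
        W = F @ W
    return W

def loss_and_grads(factors, target, mask):
    """Squared loss on observed entries and its gradient w.r.t. each factor."""
    W = product(factors)
    residual = mask * (W - target)        # zero on unobserved entries
    loss = 0.5 * np.sum(residual ** 2)
    n = target.shape[0]
    grads = []
    for j in range(len(factors)):
        left = np.eye(n)                  # product of the factors applied after factor j
        for F in factors[j + 1:]:
            left = F @ left
        right = np.eye(n)                 # product of the factors applied before factor j
        for F in factors[:j]:
            right = F @ right
        grads.append(left.T @ residual @ right.T)
    return loss, grads

def run(depth=3, n=10, rank=2, observed_frac=0.6,
        init_std=0.1, lr=0.05, steps=3000, seed=0):
    """All default values are illustrative assumptions, not the paper's setup."""
    rng = np.random.default_rng(seed)
    target = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))
    target /= np.linalg.norm(target, 2)   # scale the target to spectral norm 1
    mask = (rng.random((n, n)) < observed_frac).astype(float)
    factors = [init_std * rng.standard_normal((n, n)) for _ in range(depth)]
    initial_loss, _ = loss_and_grads(factors, target, mask)
    for _ in range(steps):
        _, grads = loss_and_grads(factors, target, mask)
        factors = [F - lr * g for F, g in zip(factors, grads)]
    final_loss, _ = loss_and_grads(factors, target, mask)
    return product(factors), initial_loss, final_loss, target
```

Calling `run(depth=2)` and `run(depth=3)` and inspecting `np.linalg.svd(W, compute_uv=False)` on the returned matrix is one way to compare how quickly the singular values decay at different depths. Whether a toy run at this scale shows the low-rank tendency reported in the paper is not guaranteed.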

Original language: English
Journal: Advances in Neural Information Processing Systems
Volume: 32
State: Published - 2019
Event: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
Duration: 8 Dec 2019 to 14 Dec 2019

Funding

Funders
Amazon Research
Mozilla Research
Schmidt Foundation
Zuckerman Israeli Postdoctoral Scholars Program
National Science Foundation
Office of Naval Research
Semiconductor Research Corporation
Defense Advanced Research Projects Agency
Simons Foundation
