Lower bounds for non-convex stochastic optimization

Yossi Arjevani, Yair Carmon*, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least ϵ⁻⁴ queries to find an ϵ-stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of ϵ⁻³ queries, establishing the optimality of recently proposed variance reduction techniques.
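To make the oracle model concrete, the sketch below runs plain SGD on a toy smooth non-convex function, querying an unbiased stochastic gradient oracle with bounded variance until the true gradient norm drops below ϵ. This is an illustration of the query model only, not the paper's lower-bound construction; the test function, step size, and noise level are arbitrary choices for the demo.

```python
import numpy as np

def f(x):
    # Toy smooth, non-convex objective (separable per coordinate);
    # chosen for illustration, not taken from the paper.
    return np.sum(x**2 + 0.5 * np.sin(3.0 * x))

def grad_f(x):
    # Exact gradient of f (used only to verify stationarity in this demo).
    return 2.0 * x + 1.5 * np.cos(3.0 * x)

def stochastic_gradient(x, rng, sigma=0.1):
    # Unbiased oracle with bounded variance: true gradient plus
    # zero-mean Gaussian noise of standard deviation sigma.
    return grad_f(x) + sigma * rng.standard_normal(x.shape)

def sgd_until_stationary(x0, eps=1e-1, step=0.02, max_queries=200_000, seed=0):
    # Run SGD, counting oracle queries, until x is eps-stationary
    # (true gradient norm at most eps) or the query budget is exhausted.
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for queries in range(max_queries):
        if np.linalg.norm(grad_f(x)) <= eps:
            return x, queries
        g = stochastic_gradient(x, rng)  # one oracle query
        x -= step * g
    return x, max_queries

x, q = sgd_until_stationary(np.array([2.0, -1.5]), eps=1e-1)
print(f"eps-stationary after {q} queries, gradient norm {np.linalg.norm(grad_f(x)):.4f}")
```

In this model the paper's result says that no algorithm can beat SGD's worst-case O(ϵ⁻⁴) query count; on easy instances like this toy function, SGD of course finishes far sooner.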

Original language: English
Pages (from-to): 165-214
Number of pages: 50
Journal: Mathematical Programming
Volume: 199
Issue number: 1-2
DOIs
State: Published - May 2023

Funding

Funders (funder number):
Simons Institute for the Foundations of Deep Learning program
National Science Foundation (1553086)
Division of Computing and Communication Foundations
Alfred P. Sloan Foundation (ONR-YIP N00014-19-1-2288, 1740751)
Google
