Lower bounds for non-convex stochastic optimization

Yossi Arjevani, Yair Carmon*, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least ϵ⁻⁴ queries to find an ϵ-stationary point. The lower bound is tight and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of ϵ⁻³ queries, establishing the optimality of recently proposed variance reduction techniques.
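To make the oracle model concrete, the following minimal Python sketch (not from the paper) runs stochastic gradient descent against an unbiased gradient oracle whose noise has second moment at most σ². The test function, Gaussian noise model, step size, and all names here are illustrative assumptions; the true-gradient stopping check exists only to certify ϵ-stationarity and is not information the algorithm uses.

```python
import numpy as np

def make_oracle(grad_f, sigma, rng):
    """Unbiased stochastic gradient oracle: returns grad_f(x) plus zero-mean
    Gaussian noise with E[||noise||^2] = sigma^2 (the bounded-variance model)."""
    def oracle(x):
        return grad_f(x) + (sigma / np.sqrt(x.size)) * rng.standard_normal(x.size)
    return oracle

def sgd(oracle, grad_f, x0, eps, lr, max_queries=10**6):
    """Plain SGD; stops once the *true* gradient norm is at most eps.
    The true-gradient check is for demonstration only -- the algorithm
    itself only ever queries the noisy oracle. Returns (x, queries used)."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(max_queries):
        if np.linalg.norm(grad_f(x)) <= eps:
            return x, t
        x = x - lr * oracle(x)
    return x, max_queries

# Illustrative smooth non-convex test function: f(x) = sum_i x_i^2 / (1 + x_i^2),
# with coordinate-wise gradient 2x / (1 + x^2)^2.
grad_f = lambda x: 2 * x / (1 + x**2) ** 2

rng = np.random.default_rng(0)
oracle = make_oracle(grad_f, sigma=0.1, rng=rng)
x, n = sgd(oracle, grad_f, x0=np.full(10, 1.0), eps=0.1, lr=0.1)
print(f"eps-stationary point reached after {n} oracle queries")
```

The sketch only illustrates the interaction protocol the paper studies: the algorithm's sole access to the objective is through noisy gradient queries, and the paper's ϵ⁻⁴ lower bound concerns the worst-case number of such queries over all smooth functions and all first-order methods.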

Original language: English
Pages (from-to): 165-214
Number of pages: 50
Journal: Mathematical Programming
Volume: 199
Issue number: 1-2
DOIs
State: Published - May 2023

