Open Problem: Tight Convergence of SGD in Constant Dimension

Tomer Koren, Shahar Segal

Research output: Contribution to journal › Conference article › peer-review

Abstract

Stochastic Gradient Descent (SGD) is one of the most popular optimization methods in machine learning and has been studied extensively since the early 1950s. However, our understanding of this fundamental algorithm is still lacking in certain aspects. We point out a gap that remains between the known upper and lower bounds for the expected suboptimality of the last SGD iterate whenever the dimension is a constant independent of the number of SGD iterations T, and in particular, that the gap is still unaddressed even in the one-dimensional case. For the latter, we provide evidence that the correct rate is Θ(1/√T) and conjecture that the same applies in any (constant) dimension.
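As a rough illustration of the quantity discussed in the abstract, the sketch below simulates the expected suboptimality of the last SGD iterate on a simple one-dimensional convex problem. The objective f(x) = |x|, the fixed step size 1/√T, and the Gaussian noise model are illustrative assumptions, not the construction from the paper; the printout merely shows how the last-iterate error scales with T in this toy setting.

```python
import numpy as np

# Illustrative simulation (assumptions, not the paper's setup): run SGD on the
# one-dimensional convex objective f(x) = |x| using noisy subgradients, and
# record the suboptimality f(x_T) of the *last* iterate.
rng = np.random.default_rng(0)

def last_iterate_suboptimality(T, x0=1.0, noise_std=1.0):
    x = x0
    eta = 1.0 / np.sqrt(T)  # fixed step size 1/sqrt(T), a common illustrative choice
    for _ in range(T):
        # noisy subgradient of |x|: sign(x) plus Gaussian noise
        g = np.sign(x) + noise_std * rng.standard_normal()
        x -= eta * g
    return abs(x)  # f(x_T) - f(x*) with minimizer x* = 0

for T in [10**2, 10**3, 10**4, 10**5]:
    runs = [last_iterate_suboptimality(T) for _ in range(200)]
    mean = np.mean(runs)
    print(f"T={T:>6}  mean last-iterate suboptimality ~ {mean:.4f}  "
          f"sqrt(T) * mean ~ {np.sqrt(T) * mean:.2f}")
```

In this toy run the rescaled quantity √T · E[f(x_T) − f(x*)] stays roughly constant as T grows, which is consistent with the Θ(1/√T) rate conjectured in the abstract, though of course a simulation on one instance is evidence of nothing beyond that instance.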

Original language: English
Pages (from-to): 3847-3851
Number of pages: 5
Journal: Proceedings of Machine Learning Research
Volume: 125
State: Published - 2020
Event: 33rd Conference on Learning Theory, COLT 2020 - Virtual, Online, Austria
Duration: 9 Jul 2020 - 12 Jul 2020
