Never Go Full Batch (in Stochastic Convex Optimization)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We study the generalization performance of full-batch optimization algorithms for stochastic convex optimization: these are first-order methods that access only the exact gradient of the empirical risk (rather than gradients with respect to individual data points), and include a wide range of algorithms such as gradient descent, mirror descent, and their regularized and/or accelerated variants. We provide a new separation result showing that, while algorithms such as stochastic gradient descent can generalize and optimize the population risk to within ε after O(1/ε²) iterations, full-batch methods either need at least Ω(1/ε⁴) iterations or exhibit a dimension-dependent sample complexity.
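The distinction the abstract draws is between two gradient oracles: a full-batch method may only query the exact gradient of the empirical risk (an average over all samples), whereas SGD queries the gradient at one data point per step. The following sketch contrasts the two update rules on a toy least-squares problem; the problem instance, step size, and iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stochastic convex problem (illustrative only): least-squares
# empirical risk over n samples in d dimensions.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def full_batch_grad(w):
    # Exact gradient of the empirical risk: the only oracle a
    # full-batch method is allowed to query.
    return X.T @ (X @ w - y) / n

def single_sample_grad(w, i):
    # Gradient with respect to an individual data point i,
    # as used by stochastic gradient descent.
    return X[i] * (X[i] @ w - y[i])

eta, T = 0.05, 500
w_gd = np.zeros(d)   # full-batch gradient descent iterate
w_sgd = np.zeros(d)  # SGD iterate
for _ in range(T):
    w_gd = w_gd - eta * full_batch_grad(w_gd)      # full-batch step
    i = rng.integers(n)                            # sample one index
    w_sgd = w_sgd - eta * single_sample_grad(w_sgd, i)  # SGD step

def emp_risk(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

print(emp_risk(w_gd), emp_risk(w_sgd))  # both drive the empirical risk down
```

Both methods minimize the empirical risk here; the paper's separation concerns the *population* risk, where restricting an algorithm to the full-batch oracle provably costs either many more iterations or dimension-dependent sample complexity.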

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural Information Processing Systems Foundation
Pages: 25033-25043
Number of pages: 11
ISBN (Electronic): 9781713845393
State: Published - 2021
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: 6 Dec 2021 - 14 Dec 2021

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 30
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 6/12/21 - 14/12/21

Funding

Funders — Funder number
Israel Science Foundation — 2549/19, 2188/20
Yandex Initiative in Machine Learning
Blavatnik Family Foundation
