Bandit smooth convex optimization: Improving the bias-variance tradeoff

Ofer Dekel, Ronen Eldan, Tomer Koren

Research output: Contribution to journal › Conference article › peer-review


Abstract

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of Õ(T^{5/6}), while the best known lower bound is Ω(T^{1/2}). Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of Õ(T^{2/3}). We present an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of Õ(T^{5/8}). Our result rules out an Ω(T^{2/3}) lower bound and takes a significant step towards the resolution of this open problem.
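
For context, the Õ(T^{2/3}) baseline mentioned in the abstract comes from the classical bias-variance tradeoff in one-point bandit gradient estimation. The following is a rough sketch of that standard baseline calculation, not of the paper's improved analysis, assuming losses bounded by 1, a domain of diameter O(1), and β-smooth losses: the one-point estimator g_t = (d/δ) f_t(x_t + δu_t) u_t is an unbiased gradient of the δ-smoothed loss with norm O(d/δ), while smoothness keeps the per-round smoothing bias at O(βδ²), so over T rounds

% Sketch of the standard bias-variance tradeoff yielding the O(T^{2/3})
% baseline for smooth bandit convex optimization; background folklore,
% not this paper's T^{5/8} analysis.
\[
  \mathrm{Regret}(T)
    \;\lesssim\;
    \underbrace{\frac{d}{\delta}\sqrt{T}}_{\text{estimator variance}}
    \;+\;
    \underbrace{\beta \delta^{2} T}_{\text{smoothing bias}},
\]
\[
  \text{and choosing }
  \delta \asymp \Bigl(\frac{d}{\beta\sqrt{T}}\Bigr)^{1/3}
  \text{ balances the two terms, giving }
  O\bigl(\beta^{1/3} d^{2/3}\, T^{2/3}\bigr).
\]

As the title indicates, the paper's contribution is an improvement of this tradeoff, lowering the exponent from 2/3 to 5/8.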

Original language: English
Pages (from-to): 2926-2934
Number of pages: 9
Journal: Advances in Neural Information Processing Systems
Volume: 2015-January
State: Published - 2015
Externally published: Yes
Event: 29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada
Duration: 7 Dec 2015 – 12 Dec 2015
