Bandit smooth convex optimization: Improving the bias-variance tradeoff

Ofer Dekel, Ronen Eldan, Tomer Koren

Research output: Contribution to journal › Conference article › peer-review

Abstract

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of Õ(T^{5/6}), while the best known lower bound is Ω(T^{1/2}). Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of Õ(T^{2/3}). We present an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of Õ(T^{5/8}). Our result rules out an Ω(T^{2/3}) lower bound and takes a significant step towards the resolution of this open problem.
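For context, a minimal sketch of the classic one-point gradient estimator (in the style of Flaxman, Kalai, and McMahan) that underlies regret bounds of this kind; this is not the paper's algorithm, only the standard baseline it improves upon. Querying the loss at a single randomly perturbed point gives an unbiased gradient estimate of a δ-smoothed surrogate, and the smoothing radius δ governs the bias-variance tradeoff named in the title: the bias of the surrogate grows with δ while the estimator's magnitude scales like d/δ. All function names and parameter values below are illustrative assumptions.

```python
import numpy as np

def one_point_gradient_estimate(loss, x, delta, rng):
    """Estimate the gradient of `loss` at `x` from one bandit query.

    Samples u uniformly from the unit sphere and returns
    (d / delta) * loss(x + delta * u) * u, an unbiased estimate of the
    gradient of the delta-smoothed loss E_v[loss(x + delta * v)].
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                  # uniform on the unit sphere
    return (d / delta) * loss(x + delta * u) * u

def bandit_gradient_descent(loss, d, T, eta, delta, radius=1.0, seed=0):
    """Projected online gradient descent driven by one-point bandit feedback."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    iterates = []
    for _ in range(T):
        g = one_point_gradient_estimate(loss, x, delta, rng)
        x = x - eta * g
        norm = np.linalg.norm(x)
        if norm > radius:                   # project back onto the feasible ball
            x *= radius / norm
        iterates.append(x.copy())
    return np.mean(iterates, axis=0)

# Example: minimize a smooth quadratic with only zeroth-order (bandit) feedback.
if __name__ == "__main__":
    target = np.array([0.5, -0.3])
    f = lambda z: np.sum((z - target) ** 2)
    x_bar = bandit_gradient_descent(f, d=2, T=20000, eta=0.001, delta=0.05)
    print("average iterate:", x_bar)        # should approach `target`
```

Tuning η and δ against the horizon T is exactly where the bias-variance tradeoff bites: smaller δ reduces the surrogate's bias but inflates the estimator's variance, and balancing the two yields the sublinear regret rates discussed in the abstract.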

Original language: English
Pages (from-to): 2926-2934
Number of pages: 9
Journal: Advances in Neural Information Processing Systems
Volume: 2015-January
State: Published - 2015
Externally published: Yes
Event: 29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada
Duration: 7 Dec 2015 - 12 Dec 2015
