TY - GEN

T1 - Bandit convex optimization

AU - Bubeck, Sébastien

AU - Dekel, Ofer

AU - Koren, Tomer

AU - Peres, Yuval

N1 - Publisher Copyright: © 2015 A. Agarwal & S. Agarwal.

PY - 2015

Y1 - 2015

N2 - We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is $\widetilde{\Theta}(\sqrt{T})$ and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setting, where the convex loss functions are drawn from a worst-case distribution, and then we solve the Bayesian version of the problem with a variant of Thompson Sampling. Our analysis features a novel use of convexity, formalized as a "local-to-global" property of convex functions, that may be of independent interest.

AB - We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is $\widetilde{\Theta}(\sqrt{T})$ and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setting, where the convex loss functions are drawn from a worst-case distribution, and then we solve the Bayesian version of the problem with a variant of Thompson Sampling. Our analysis features a novel use of convexity, formalized as a "local-to-global" property of convex functions, that may be of independent interest.

UR - http://www.scopus.com/inward/record.url?scp=84984692046&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84984692046

VL - 40

T3 - Proceedings of Machine Learning Research

BT - Proceedings of The 28th Conference on Learning Theory

A2 - Grünwald, Peter

A2 - Hazan, Elad

A2 - Kale, Satyen

PB - PMLR

Y2 - 2 July 2015 through 6 July 2015

ER -