TY - JOUR
T1 - Bayesian Design Principles for Frequentist Sequential Learning
AU - Xu, Yunbei
AU - Zeevi, Assaf
N1 - Publisher Copyright:
© 2023 Proceedings of Machine Learning Research. All rights reserved.
PY - 2023
Y1 - 2023
N2 - We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to create “algorithmic beliefs” at each round, and use Bayesian posteriors to make decisions. This is the first approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the “best-of-all-worlds” empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, convex bandits, and reinforcement learning.
AB - We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to create “algorithmic beliefs” at each round, and use Bayesian posteriors to make decisions. This is the first approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the “best-of-all-worlds” empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, convex bandits, and reinforcement learning.
UR - http://www.scopus.com/inward/record.url?scp=85167780518&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.conferencearticle???
AN - SCOPUS:85167780518
SN - 2640-3498
VL - 202
SP - 38768
EP - 38800
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 40th International Conference on Machine Learning, ICML 2023
Y2 - 23 July 2023 through 29 July 2023
ER -