Abstract
We revisit the fundamental problem of prediction with expert advice, in a setting where the environment is benign and generates losses stochastically, but the feedback observed by the learner is subject to a moderate adversarial corruption. We prove that a variant of the classical Multiplicative Weights algorithm with decreasing step sizes achieves constant regret in this setting and performs optimally in a wide range of environments, regardless of the magnitude of the injected corruption. Our results reveal a surprising disparity between the often comparable Follow the Regularized Leader (FTRL) and Online Mirror Descent (OMD) frameworks: we show that for experts in the corrupted stochastic regime, the regret performance of OMD is in fact strictly inferior to that of FTRL.
Original language | English |
---|---|
Journal | Advances in Neural Information Processing Systems |
Volume | 2020-December |
State | Published - 2020 |
Event | 34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online Duration: 6 Dec 2020 → 12 Dec 2020 |
Funding
Funders | Funder number |
---|---|
Yandex Initiative in Machine Learning | |
Horizon 2020 Framework Programme | 882396 |
European Research Council | |
Israel Science Foundation | 2549/19, 2188/20, 993/17 |