Abstract
We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.
Original language | English |
---|---|
Pages (from-to) | 1562-1578 |
Number of pages | 17 |
Journal | Proceedings of Machine Learning Research |
Volume | 99 |
State | Published - 2019 |
Externally published | Yes |
Event | 32nd Conference on Learning Theory, COLT 2019 - Phoenix, United States Duration: 25 Jun 2019 → 28 Jun 2019 |
Funding
Funders | Funder number |
---|---|
National Science Foundation | CCF-1536002, CCF-1617790, CCF-1540541 |