Constrained no-regret learning

Ye Du, Ehud Lehrer*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

We investigate a dynamic decision-making problem with constraints. The decision maker is free to take any action as long as the empirical frequency of the actions played does not violate pre-specified constraints. In case of a violation, the decision maker is penalized. We introduce the constrained no-regret learning model. In this model, the set of alternative strategies against which a dynamic decision policy is compared is the set of stationary mixed actions that satisfy all the constraints. We show that there exists a strategy with the following properties: (i) it guarantees that, after an unavoidable deterministic grace period, there are absolutely no violations; (ii) for an arbitrarily small constant ϵ>0, it achieves a convergence rate of T−[Formula presented], which improves on the O(T−[Formula presented]) convergence rate of Mannor et al. (2009).
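To make the comparison class in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy setting that the abstract does not specify: two actions, a single hypothetical constraint that the empirical frequency of action 0 stays at or below `cap`, and a known per-round payoff for each action. All function and parameter names are illustrative. The sketch measures the empirical frequency of play and the regret against the best stationary mixed action that satisfies the constraint.

```python
from typing import List, Sequence, Tuple


def empirical_frequency(actions: Sequence[int], n_actions: int) -> List[float]:
    """Fraction of rounds in which each action was played."""
    counts = [0] * n_actions
    for a in actions:
        counts[a] += 1
    return [c / len(actions) for c in counts]


def constrained_regret(
    actions: Sequence[int],
    payoffs: Sequence[Tuple[float, float]],
    cap: float,
) -> float:
    """Average regret against the best feasible stationary mixed action.

    Assumed toy setting: two actions and one constraint, namely that the
    probability placed on action 0 is at most `cap`.
    """
    rounds = len(actions)
    # Average payoff actually obtained by the played sequence.
    realized = sum(payoffs[t][a] for t, a in enumerate(actions)) / rounds
    # Average payoff of each pure action over the same rounds.
    mean_u = [sum(u[i] for u in payoffs) / rounds for i in range(2)]
    # The benchmark payoff is linear in p0, so its maximum over [0, cap]
    # is attained at one of the two endpoints.
    best = max(p0 * mean_u[0] + (1 - p0) * mean_u[1] for p0 in (0.0, cap))
    return best - realized


actions = [0, 1, 1, 1]
payoffs = [(1.0, 0.0)] * 4           # action 0 always pays 1, action 1 pays 0
freq = empirical_frequency(actions, 2)
no_violation = freq[0] <= 0.5        # constraint check with cap = 0.5
regret = constrained_regret(actions, payoffs, cap=0.5)
```

In this toy run the unconstrained benchmark would play action 0 in every round, but the comparison class only admits mixed actions that place at most 0.5 on action 0, so regret is measured against that weaker, feasible benchmark.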

Original language: English
Pages (from-to): 16-24
Number of pages: 9
Journal: Journal of Mathematical Economics
Volume: 88
State: Published - May 2020

Funding

Funder: Funder number
National Natural Science Foundation of China: 11761141007, 11501464
Israel Science Foundation: 2510/17, 963/15

    Keywords

    • Approachability
    • Constrained no-regret
    • No-regret strategy
    • On-line learning algorithm
