Constrained no-regret learning

Research output: Contribution to journal › Article › peer-review

Abstract

We investigate a dynamic decision-making problem with constraints. The decision maker is free to take any action as long as the empirical frequency of the actions played does not violate pre-specified constraints; in case of a violation, the decision maker is penalized. We introduce the constrained no-regret learning model, in which the set of alternative strategies against which a dynamic decision policy is compared is the set of stationary mixed actions that satisfy all the constraints. We show that there exists a strategy with the following properties: (i) it guarantees that, after an unavoidable deterministic grace period, there are absolutely no violations; (ii) for an arbitrarily small constant ϵ>0, it achieves a convergence rate of T−[Formula presented], which improves the O(T−[Formula presented]) convergence rate of Mannor et al. (2009).
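To make the benchmark in the abstract concrete, the following minimal sketch computes the empirical frequency of the actions played, checks it against constraints, and measures average regret against the best stationary mixed action that satisfies them. It rests on assumptions not stated in the abstract: the constraints are taken to be simple per-action frequency caps (`caps`), rewards are given as one vector per round, and all function names are hypothetical. It illustrates the comparison class only; it is not the strategy constructed in the paper.

```python
from collections import Counter


def empirical_frequency(actions, n_actions):
    """Fraction of rounds in which each action was played."""
    counts = Counter(actions)
    total = len(actions)
    return [counts[a] / total for a in range(n_actions)]


def violates(freq, caps, tol=1e-12):
    """True if any action's empirical frequency exceeds its cap (assumed constraint form)."""
    return any(f > c + tol for f, c in zip(freq, caps))


def best_constrained_mixed_action(avg_rewards, caps):
    """Best stationary mixed action under per-action caps.

    With cap constraints the optimum is greedy: put as much probability
    as allowed on the highest-reward action, then the next, and so on.
    """
    if sum(caps) < 1.0:
        raise ValueError("caps are infeasible: they must sum to at least 1")
    p = [0.0] * len(caps)
    remaining = 1.0
    order = sorted(range(len(caps)), key=lambda a: avg_rewards[a], reverse=True)
    for a in order:
        mass = min(caps[a], remaining)
        p[a] = mass
        remaining -= mass
        if remaining <= 0.0:
            break
    return p


def constrained_regret(actions, rewards, caps):
    """Average regret against the best constraint-satisfying stationary mixed action."""
    total = len(actions)
    n_actions = len(caps)
    avg = [sum(r[a] for r in rewards) / total for a in range(n_actions)]
    realized = sum(rewards[t][actions[t]] for t in range(total)) / total
    p = best_constrained_mixed_action(avg, caps)
    benchmark = sum(p[a] * avg[a] for a in range(n_actions))
    return benchmark - realized


# Hypothetical example: action 0 always pays 1 but may be used at most half the time.
rewards = [[1.0, 0.0]] * 4
caps = [0.5, 1.0]
played = [0, 1, 0, 1]
print(violates(empirical_frequency(played, 2), caps))
print(constrained_regret(played, rewards, caps))
```

In this toy instance, alternating between the two actions uses the capped action exactly as often as allowed, so it matches the constrained benchmark, whereas an unconstrained comparator (always playing action 0) would not be a valid alternative under the model.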

Original language: English
Pages (from-to): 16-24
Number of pages: 9
Journal: Journal of Mathematical Economics
Volume: 88
State: Published - May 2020

Keywords

  • Approachability
  • Constrained no-regret
  • No-regret strategy
  • On-line learning algorithm
