Sequential decision making with vector outcomes

Yossi Azar, Uriel Feige, Michal Feldman, Moshe Tennenholtz

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

11 Scopus citations

Abstract

We study a multi-round optimization setting in which, in each round, a player may select one of several actions, and each action produces an outcome vector that is not observable to the player until the round ends. The final payoff for the player is computed by applying some known function f to the sum of all outcome vectors (e.g., the minimum of all coordinates of the sum). We show that standard performance measures (such as comparison to the best single action) used in related expert and bandit settings (in which the payoff in each round is scalar) are not useful in our vector setting. Instead, we propose a different performance measure, and design algorithms that have vanishing regret with respect to our new measure.
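The payoff rule described in the abstract can be illustrated with a small sketch. The toy instance below (two actions with fixed outcome vectors, f = min, ten rounds) and every name in it are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the paper's proposed performance measure or algorithms. It only suggests why comparing against the best single action can be uninformative when f is the minimum coordinate of the summed vector.

```python
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


def total_payoff(outcomes: Sequence[Vector], f: Callable[[Vector], float]) -> float:
    """Apply f to the coordinate-wise sum of the per-round outcome vectors."""
    summed = tuple(sum(coords) for coords in zip(*outcomes))
    return f(summed)


# Hypothetical toy instance: each action always yields the same outcome vector.
ACTION_OUTCOMES = {"a": (1.0, 0.0), "b": (0.0, 1.0)}
T = 10  # number of rounds (assumed)


def play(schedule: Sequence[str]) -> float:
    """Final payoff of a fixed action schedule, with f = min over coordinates."""
    return total_payoff([ACTION_OUTCOMES[a] for a in schedule], min)


# Best payoff achievable by repeating one action in every round.
best_single = max(play([a] * T) for a in ACTION_OUTCOMES)

# Payoff of alternating between the two actions.
alternating = play(["a", "b"] * (T // 2))

print(best_single, alternating)  # prints: 0.0 5.0
```

In this toy instance every single-action schedule leaves one coordinate of the sum at zero, so the best-single-action benchmark has payoff 0, while a simple alternating schedule reaches T/2.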

Original language: English
Title of host publication: ITCS 2014 - Proceedings of the 2014 Conference on Innovations in Theoretical Computer Science
Publisher: Association for Computing Machinery
Pages: 195-205
Number of pages: 11
ISBN (Print): 9781450322430
State: Published - 2014
Event: 2014 5th Conference on Innovations in Theoretical Computer Science, ITCS 2014 - Princeton, NJ, United States
Duration: 12 Jan 2014 → 14 Jan 2014

Publication series

Name: ITCS 2014 - Proceedings of the 2014 Conference on Innovations in Theoretical Computer Science

Conference

Conference: 2014 5th Conference on Innovations in Theoretical Computer Science, ITCS 2014
Country/Territory: United States
City: Princeton, NJ
Period: 12/01/14 → 14/01/14

Keywords

  • Bandit
  • Expert
  • Vector outcome
