Abstract
We consider an ordinal optimization problem in which the goal is to select the best of several competing alternatives (“systems”) when the probability distributions governing each system’s performance are unknown but can be learned via sampling. The objective is to dynamically allocate samples within a finite sampling budget so as to minimize the probability of selecting a system that is not the best. This objective does not possess an analytically tractable solution. We introduce a family of practically implementable sampling policies and show that they exhibit (asymptotically) near-optimal performance. Furthermore, we show via numerical testing that the proposed policies perform well compared with other benchmark policies.
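To make the objective concrete, the following minimal sketch estimates the probability of false selection by Monte Carlo for a simple static baseline that splits the budget equally across systems. It is not one of the policies proposed in the paper. The Gaussian performance model, the function name `estimate_pfs`, and all parameter values are assumptions chosen purely for illustration.

```python
import random


def estimate_pfs(means, sigma, budget, trials=2000, seed=0):
    """Estimate the probability of false selection under equal allocation.

    Assumptions (illustrative only): each system's performance is Gaussian
    with a common standard deviation `sigma`, the true means have a unique
    maximum, and the budget is split equally across systems.
    """
    k = len(means)
    per_system = budget // k  # equal split; any remainder is left unused
    if per_system < 1:
        raise ValueError("budget must allow at least one sample per system")

    rng = random.Random(seed)
    best = max(range(k), key=lambda j: means[j])
    errors = 0
    for _ in range(trials):
        # Sample mean of each system after spending its share of the budget.
        sample_means = [
            sum(rng.gauss(mu, sigma) for _ in range(per_system)) / per_system
            for mu in means
        ]
        # Select the system that looks best and check against the truth.
        chosen = max(range(k), key=lambda j: sample_means[j])
        if chosen != best:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    for budget in (6, 30, 300):
        print(budget, estimate_pfs([0.0, 0.5, 1.0], 1.0, budget))
```

A dynamic policy of the kind the abstract describes would replace the fixed `per_system` split with allocation decisions that depend on the samples observed so far. The equal split here only serves as a reference point for how the error probability shrinks as the budget grows.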
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 1693-1712 |
Number of pages | 20 |
Journal | Operations Research |
Volume | 66 |
Issue number | 6 |
State | Published - Nov 2018 |
Externally published | Yes |
Keywords
- Dynamic sampling
- Optimal learning
- Ranking and selection