We provide a tight bound on the amount of experimentation under the optimal strategy in sequential decision problems. We illustrate the applicability of the result by deriving a bound on the cut-off in a one-armed bandit problem.
|Number of pages||5|
|Journal||Statistics and Probability Letters|
|State||Published - 2010|