Abstract
We consider a general class of network revenue management problems in which mean demand at each point in time is determined by a vector of prices, and the objective is to dynamically adjust these prices so as to maximize expected revenues over a finite sales horizon. A salient feature of our problem is that the decision maker can only observe realized demand over time but does not know the underlying demand function that maps prices into the instantaneous demand rate. We introduce a family of "blind" pricing policies that are designed to balance trade-offs between exploration (demand learning) and exploitation (pricing to optimize revenues). We derive bounds on the revenue loss incurred by these policies relative to the optimal dynamic pricing policy that knows the demand function a priori, and we prove that asymptotically, as the volume of sales increases, this gap shrinks to zero.
Original language | English |
---|---|
Pages (from-to) | 1537-1550 |
Number of pages | 14 |
Journal | Operations Research |
Volume | 60 |
Issue number | 6 |
State | Published - Nov 2012 |
Externally published | Yes |
Keywords
- Asymptotic optimality
- Curse of dimensionality
- Learning
- Minimax
- Network
- Nonparametric estimation
- Pricing
- Revenue management