TY - JOUR
T1 - Bayesian dynamic pricing policies
T2 - Learning and earning under a binary prior distribution
AU - Harrison, J. Michael
AU - Keskin, N. Bora
AU - Zeevi, Assaf
PY - 2012/3
Y1 - 2012/3
N2 - Motivated by applications in financial services, we consider a seller who offers prices sequentially to a stream of potential customers, observing either success or failure in each sales attempt. The parameters of the underlying demand model are initially unknown, so each price decision involves a trade-off between learning and earning. Attention is restricted to the simplest kind of model uncertainty, where one of two demand models is known to apply, and we focus initially on performance of the myopic Bayesian policy (MBP), variants of which are commonly used in practice. Because learning is passive under the MBP (that is, learning only takes place as a by-product of actions that have a different purpose), it can lead to incomplete learning and poor profit performance. However, under one additional assumption, a constrained variant of the myopic policy is shown to have the following strong theoretical virtue: the expected performance gap relative to a clairvoyant who knows the underlying demand model is bounded by a constant as the number of sales attempts becomes large.
KW - Bayesian learning
KW - Estimation
KW - Exploration-exploitation
KW - Pricing
KW - Revenue management
UR - http://www.scopus.com/inward/record.url?scp=84861366496&partnerID=8YFLogxK
U2 - 10.1287/mnsc.1110.1426
DO - 10.1287/mnsc.1110.1426
M3 - Article
AN - SCOPUS:84861366496
VL - 58
SP - 570
EP - 586
JO - Management Science
JF - Management Science
SN - 0025-1909
IS - 3
ER -