An infinite random stream of ordered pairs arrives sequentially in discrete time. A pair consists of a “candidate” and an “offer,” each of which is either of type I (with probability p) or of type II (with probability q = 1 − p). Offers are to be assigned to candidates, yielding a reward R > 0 if they match in type, or a smaller reward r, with 0 ≤ r ≤ R, if not. An arriving candidate resides in the system until it is assigned, whereas an arriving offer is either assigned immediately to one of the waiting candidates or lost forever. We show that the optimal long-term average reward is R, independent of the population proportion p and the “second prize” r, and that the optimal average-reward policy is to assign an offer only to a candidate of matching type. Optimal policies for discounted and finite-horizon models are also derived.
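The average-reward claim can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it assumes that the candidate and the offer in each pair are drawn independently, and that an offer may be assigned to the candidate arriving in the same period. The function name `simulate_match_only` and its parameters are hypothetical. The second prize r does not appear because the match-only policy never assigns a mismatch.

```python
import random


def simulate_match_only(p, R, n_steps, seed=0):
    """Average reward per period under the match-only policy.

    Assumptions (not stated in the abstract): candidate and offer types
    are independent within a pair, and the newly arrived candidate is
    eligible for the offer that arrives with it.
    """
    rng = random.Random(seed)
    waiting = [0, 0]  # waiting candidates: index 0 = type I, index 1 = type II
    total = 0.0
    for _ in range(n_steps):
        cand = 0 if rng.random() < p else 1   # type I with probability p
        offer = 0 if rng.random() < p else 1  # type I with probability p
        waiting[cand] += 1                    # candidate joins and waits
        if waiting[offer] > 0:
            waiting[offer] -= 1               # assign the offer to a matching candidate
            total += R
        # otherwise the offer is lost; no mismatch is ever assigned
    return total / n_steps


if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, simulate_match_only(p=0.3, R=1.0, n_steps=n))
```

Under these assumptions the simulated average should move toward R as the horizon grows, for any p strictly between 0 and 1.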
Number of pages: 18
Journal: Probability in the Engineering and Informational Sciences
State: Published - 1990