Abstract
We study a sequential matching problem faced by large centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete times (possibly in batches of stochastic size and composition) with "job-types" observable upon arrival. To capture the "choice overload" phenomenon, we posit an unlimited supply of workers where each worker is characterized by a vector of attributes (aka "worker-types") sampled from an underlying population-level distribution. The distribution as well as the mean payoffs for possible worker-job type-pairs are unobservable, and the platform's goal is to sequentially match incoming jobs to workers in a way that maximizes its cumulative payoff over the planning horizon. We establish lower bounds on the regret of any matching algorithm in this setting and propose a novel rate-optimal learning algorithm that adapts to the aforementioned primitives online. Our learning guarantees highlight a distinctive characteristic of the problem: achievable performance only has a second-order dependence on worker-type distributions; we believe this finding may be of interest more broadly.
Original language | English |
---|---|
Pages (from-to) | 18-20 |
Number of pages | 3 |
Journal | Performance Evaluation Review |
Volume | 50 |
Issue number | 2 |
State | Published - 30 Aug 2022 |
Externally published | Yes |
Keywords
- dynamic learning
- matching algorithms
- multi-armed bandits
- regret analysis
- two-sided markets