We present a model for optimizing the online booking policy for mobile personnel over a multiday horizon with a different cutoff for each day, where the goal is to maximize the expected ratio of accepted requests at steady state. This model fits the practice of many service providers, who allow customers to book time slots over a horizon of multiple days and manage demand through availability control. Because the planning horizon is indefinite and the service horizon of each day overlaps the horizons of subsequent days, the objective is defined in terms of steady-state performance. Each customer interaction takes a single step: the system offers an assortment of time slots covering the next few days, and the user either chooses one of them or abandons the system. Upon the arrival of a service request, the provider estimates the opportunity cost of serving the request at each of the available time slots. We model this cost in two ways, as a linear function and as a Cobb–Douglas function, of features that concisely represent the current system state. The assortment of time slots offered in response to each service request is constructed by maximizing the expected net gain from the assortment. The parameters of the opportunity cost functions are fitted using a simulation framework. The proposed method is benchmarked on randomly generated datasets covering various demand scenarios and geographies, and it is shown to significantly outperform simpler baseline policies.
- Assortment optimization
- Field service management
- Multiday booking
- Reinforcement learning
- Stochastic dynamic optimization
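
To make the two opportunity cost forms and the assortment step described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. Several parts are assumptions introduced only for illustration: the function names, the unit reward per accepted request, the multinomial logit (MNL) customer choice model, and the brute-force enumeration of assortments, which is practical only for a handful of slots. The abstract itself does not specify the choice model or the optimization procedure.

```python
from itertools import combinations
from math import exp, prod


def linear_cost(features, weights, bias=0.0):
    """Linear opportunity cost: bias + sum_i weights[i] * features[i]."""
    return bias + sum(w * x for w, x in zip(weights, features))


def cobb_douglas_cost(features, exponents, scale=1.0):
    """Cobb-Douglas opportunity cost: scale * prod_i features[i] ** exponents[i]."""
    return scale * prod(x ** a for x, a in zip(features, exponents))


def best_assortment(utilities, costs, reward=1.0, no_choice_utility=0.0):
    """Pick the slot subset with the highest expected net gain.

    ASSUMPTION: customers choose according to an MNL model, so slot s in
    subset S is chosen with probability exp(u_s) / (exp(u_0) + sum_S exp(u_t)),
    where u_0 is the utility of abandoning. The net gain of a booked slot is
    reward - costs[s]. All non-empty subsets are enumerated (hypothetical,
    small-scale approach).
    """
    slots = list(utilities)
    best_set, best_gain = (), 0.0  # offering nothing yields zero gain
    for k in range(1, len(slots) + 1):
        for subset in combinations(slots, k):
            denom = exp(no_choice_utility) + sum(exp(utilities[s]) for s in subset)
            gain = sum(
                exp(utilities[s]) / denom * (reward - costs[s]) for s in subset
            )
            if gain > best_gain:
                best_set, best_gain = subset, gain
    return best_set, best_gain


# Hypothetical example: two equally attractive slots, one of them costly.
offer, gain = best_assortment(
    utilities={"mon_am": 0.0, "mon_pm": 0.0},
    costs={"mon_am": 0.2, "mon_pm": 1.5},
)
```

In this toy example the second slot has an opportunity cost above the reward, so offering it lowers the expected net gain and the sketch offers only the first slot. In the setting of the abstract, the weights and exponents would be the parameters fitted through simulation.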