TY - GEN

T1 - Online Learning for Load Balancing of Unknown Monotone Resource Allocation Games

AU - Bistritz, Ilai

AU - Bambos, Nicholas

N1 - Publisher Copyright:
Copyright © 2021 by the author(s)

PY - 2021

Y1 - 2021

N2 - Consider N players that each uses a mixture of K resources. Each of the players' reward functions includes a linear pricing term for each resource that is controlled by the game manager. We assume that the game is strongly monotone, so if each player runs gradient descent, the dynamics converge to a unique Nash equilibrium (NE). Unfortunately, this NE can be inefficient since the total load on a given resource can be very high. In principle, we can control the total loads by tuning the coefficients of the pricing terms. However, finding pricing coefficients that balance the loads requires knowing the players' reward functions and their action sets. Obtaining this game structure information is infeasible in a large-scale network and violates the users' privacy. To overcome this, we propose a simple algorithm that learns to shift the NE of the game to meet the total load constraints by adjusting the pricing coefficients in an online manner. Our algorithm only requires the total load per resource as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm guarantees convergence in L2 to a NE that meets target total load constraints. Simulations show the effectiveness of our approach when applied to smart grid demand-side management or power control in wireless networks.

AB - Consider N players that each uses a mixture of K resources. Each of the players' reward functions includes a linear pricing term for each resource that is controlled by the game manager. We assume that the game is strongly monotone, so if each player runs gradient descent, the dynamics converge to a unique Nash equilibrium (NE). Unfortunately, this NE can be inefficient since the total load on a given resource can be very high. In principle, we can control the total loads by tuning the coefficients of the pricing terms. However, finding pricing coefficients that balance the loads requires knowing the players' reward functions and their action sets. Obtaining this game structure information is infeasible in a large-scale network and violates the users' privacy. To overcome this, we propose a simple algorithm that learns to shift the NE of the game to meet the total load constraints by adjusting the pricing coefficients in an online manner. Our algorithm only requires the total load per resource as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm guarantees convergence in L2 to a NE that meets target total load constraints. Simulations show the effectiveness of our approach when applied to smart grid demand-side management or power control in wireless networks.

UR - http://www.scopus.com/inward/record.url?scp=85128696958&partnerID=8YFLogxK

M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???

AN - SCOPUS:85128696958

T3 - Proceedings of Machine Learning Research

SP - 968

EP - 979

BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021

PB - ML Research Press

T2 - 38th International Conference on Machine Learning, ICML 2021

Y2 - 18 July 2021 through 24 July 2021

ER -