TY - GEN
T1 - Energy Regularized RNNS for solving non-stationary Bandit problems
AU - Rotman, Michael
AU - Wolf, Lior
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - We consider a Multi-Armed Bandit problem in which the re- wards are non-stationary and are dependent on past actions and potentially on past contexts. At the heart of our method, we employ a recurrent neural network, which models these sequences. In order to balance between exploration and exploitation, we present an energy minimization term that pre- vents the neural network from becoming too confident in support of a certain action. This term provably limits the gap between the maximal and minimal probabilities assigned by the network. In a diverse set of experiments, we demonstrate that our method is at least as effective as methods suggested to solve the sub-problem of Rotting Bandits, and can solve intuitive extensions of various benchmark problems. We share our implementation at https://github.com/rotmanmi/Energy-Regularized-RNN.
AB - We consider a Multi-Armed Bandit problem in which the re- wards are non-stationary and are dependent on past actions and potentially on past contexts. At the heart of our method, we employ a recurrent neural network, which models these sequences. In order to balance between exploration and exploitation, we present an energy minimization term that pre- vents the neural network from becoming too confident in support of a certain action. This term provably limits the gap between the maximal and minimal probabilities assigned by the network. In a diverse set of experiments, we demonstrate that our method is at least as effective as methods suggested to solve the sub-problem of Rotting Bandits, and can solve intuitive extensions of various benchmark problems. We share our implementation at https://github.com/rotmanmi/Energy-Regularized-RNN.
UR - http://www.scopus.com/inward/record.url?scp=85177594635&partnerID=8YFLogxK
U2 - 10.1109/ICASSP49357.2023.10095643
DO - 10.1109/ICASSP49357.2023.10095643
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85177594635
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT - ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Y2 - 4 June 2023 through 10 June 2023
ER -