Abstract
We consider the problem of controlling an unknown linear dynamical system under a stochastic convex cost and full feedback of both the state and cost function. We present a computationally efficient algorithm that attains an optimal √T regret rate compared to the best stabilizing linear controller in hindsight. In contrast to previous work, our algorithm is based on the Optimism in the Face of Uncertainty paradigm. This results in a substantially improved computational complexity and a simpler analysis.
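To make the regret benchmark concrete, the following is a sketch of the standard formulation for this kind of problem. All notation is assumed for illustration and does not appear in this record: dynamics matrices $A_\star, B_\star$, noise $w_t$, round-$t$ cost $c_t$, and a comparator set $\mathcal{K}$ of stabilizing linear controllers. The paper's exact definitions (for example, how the expectation over the stochastic costs enters) may differ.

```latex
% Assumed notation (hypothetical, not taken from the source record):
%   x_t state, u_t control, w_t noise, c_t convex cost at round t,
%   x_t^K, u_t^K the trajectory obtained by playing u = K x throughout.
\begin{align*}
  x_{t+1} &= A_\star x_t + B_\star u_t + w_t, \\
  \mathrm{Regret}_T &= \sum_{t=1}^{T} c_t(x_t, u_t)
    \;-\; \min_{K \in \mathcal{K}} \sum_{t=1}^{T} c_t\bigl(x_t^{K}, u_t^{K}\bigr),
    \qquad u_t^{K} = K x_t^{K}.
\end{align*}
```

Under this reading, a √T regret rate means that $\mathrm{Regret}_T$ grows on the order of $\sqrt{T}$, possibly up to logarithmic factors, so the average per-round gap to the best controller in $\mathcal{K}$ vanishes as $T$ grows.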
Original language | English |
---|---|
Pages (from-to) | 3589-3604 |
Number of pages | 16 |
Journal | Proceedings of Machine Learning Research |
Volume | 178 |
State | Published - 2022 |
Event | 35th Conference on Learning Theory, COLT 2022, London, United Kingdom (2 Jul 2022 → 5 Jul 2022) |
Funding
Funders | Funder number |
---|---|
Deutsch Foundation | |
Yandex Initiative in Machine Learning | |
Blavatnik Family Foundation | |
Israel Science Foundation | 2549/19 |