Abstract
We consider the subclass of linear programs that formulate Markov Decision Processes (MDPs). We show that the Simplex algorithm with the Gass-Saaty shadow-vertex pivoting rule is strongly polynomial for a subclass of MDPs called controlled random walks (CRWs); the running time is $O(|S|^3 \cdot |U|^2)$, where $|S|$ denotes the number of states and $|U|$ denotes the number of actions per state. This result improves on the running time of the algorithm of Zadorojniy et al. (Mathematics of Operations Research 34(4):992-1007, 2009) by a factor of $|S|$. In particular, the number of iterations needed by the Simplex algorithm for CRWs is linear in the number of states and does not depend on the discount factor.
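For orientation, the linear program usually associated with a discounted MDP can be sketched as follows. This is the standard textbook formulation and not necessarily the exact one used in the paper; the symbols $c(s,u)$ (per-step cost), $P(s' \mid s,u)$ (transition probability), $\gamma \in (0,1)$ (discount factor), $\alpha$ (initial state distribution), and $x(s,u)$ (discounted state-action frequency) are assumptions introduced here for illustration.

```latex
\begin{align*}
\min_{x}\quad & \sum_{s \in S} \sum_{u \in U} c(s,u)\, x(s,u) \\
\text{s.t.}\quad & \sum_{u \in U} x(s',u) \;-\; \gamma \sum_{s \in S} \sum_{u \in U} P(s' \mid s,u)\, x(s,u) \;=\; \alpha(s') \qquad \forall s' \in S, \\
& x(s,u) \ge 0 \qquad \forall s \in S,\ u \in U.
\end{align*}
```

Under these assumptions the program has $|S| \cdot |U|$ variables and $|S|$ equality constraints. When $\alpha$ is strictly positive, its basic feasible solutions correspond to deterministic stationary policies, so a Simplex pivot amounts to changing the action chosen in a single state.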
| Original language | English |
|---|---|
| Pages (from-to) | 159-167 |
| Number of pages | 9 |
| Journal | Annals of Operations Research |
| Volume | 201 |
| Issue number | 1 |
| State | Published - Dec 2012 |
Keywords
- Controlled queues
- Controlled random walks
- Gass-Saaty shadow-vertex pivoting rule
- Markov decision process
- Simplex algorithm
Title
Strong polynomiality of the Gass-Saaty shadow-vertex pivoting rule for controlled random walks