TY - JOUR
T1 - Reachability and Safety Objectives in Markov Decision Processes on Long but Finite Horizons
AU - Ashkenazi-Golan, Galit
AU - Flesch, János
AU - Predtetchinski, Arkadi
AU - Solan, Eilon
N1 - Publisher Copyright: © 2020, The Author(s).
PY - 2020/6/1
Y1 - 2020/6/1
N2 - We consider discrete-time Markov decision processes in which the decision maker is interested in long but finite horizons. First we consider reachability objective: the decision maker’s goal is to reach a specific target state with the highest possible probability. A strategy is said to overtake another strategy, if it gives a strictly higher probability of reaching the target state on all sufficiently large but finite horizons. We prove that there exists a pure stationary strategy that is not overtaken by any pure strategy nor by any stationary strategy, under some condition on the transition structure and respectively under genericity. A strategy that is not overtaken by any other strategy, called an overtaking optimal strategy, does not always exist. We provide sufficient conditions for its existence. Next we consider safety objective: the decision maker’s goal is to avoid a specific state with the highest possible probability. We argue that the results proven for reachability objective extend to this model.
AB - We consider discrete-time Markov decision processes in which the decision maker is interested in long but finite horizons. First we consider reachability objective: the decision maker’s goal is to reach a specific target state with the highest possible probability. A strategy is said to overtake another strategy, if it gives a strictly higher probability of reaching the target state on all sufficiently large but finite horizons. We prove that there exists a pure stationary strategy that is not overtaken by any pure strategy nor by any stationary strategy, under some condition on the transition structure and respectively under genericity. A strategy that is not overtaken by any other strategy, called an overtaking optimal strategy, does not always exist. We provide sufficient conditions for its existence. Next we consider safety objective: the decision maker’s goal is to avoid a specific state with the highest possible probability. We argue that the results proven for reachability objective extend to this model.
KW - Markov decision process
KW - Overtaking optimality
KW - Perron–Frobenius eigenvalue
KW - Reachability objective
KW - Safety objective
UR - http://www.scopus.com/inward/record.url?scp=85085287808&partnerID=8YFLogxK
U2 - 10.1007/s10957-020-01681-2
DO - 10.1007/s10957-020-01681-2
M3 - Article
AN - SCOPUS:85085287808
SN - 0022-3239
VL - 185
SP - 945
EP - 965
JO - Journal of Optimization Theory and Applications
JF - Journal of Optimization Theory and Applications
IS - 3
ER -