TY - JOUR
T1 - Personalized and Energy-Efficient Health Monitoring
T2 - A Reinforcement Learning Approach
AU - Eden, Batchen
AU - Bistritz, Ilai
AU - Bambos, Nicholas
AU - Ben-Gal, Irad
AU - Khmelnitsky, Evgeni
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2023
Y1 - 2023
N2 - We consider a network of controlled sensors that monitor the unknown health state of a patient. We assume that the health state process is a Markov chain with a transition matrix that is unknown to the controller. At each timestep, the controller chooses a subset of sensors to activate, which incurs an energy (i.e., battery) cost. Activating more sensors improves the estimation of the unknown state, which introduces an energy-accuracy tradeoff. Our goal is to minimize the combined energy and state misclassification costs over time. Activating sensors now also provides measurements that can be used to learn the model, improving future decisions. Therefore, the learning aspect is intertwined with the energy-accuracy tradeoff. While Reinforcement Learning (RL) is often used when the model is unknown, it cannot be directly applied in health monitoring since the controller does not know the (health) state. Therefore, the monitoring problem is a partially observable Markov decision process (POMDP) where the cost feedback is also only partially available since the misclassification cost is unknown. To overcome this difficulty, we propose a monitoring algorithm that combines RL for POMDPs and online estimation of the expected misclassification cost based on a Hidden Markov Model (HMM). We show empirically that our algorithm achieves comparable performance with a monitoring system that assumes a known transition matrix and quantizes the belief state. It also outperforms the model-based approach where the estimated transition matrix is used for value iteration. Thus, our algorithm can be useful in designing energy-efficient and personalized health monitoring systems.
AB - We consider a network of controlled sensors that monitor the unknown health state of a patient. We assume that the health state process is a Markov chain with a transition matrix that is unknown to the controller. At each timestep, the controller chooses a subset of sensors to activate, which incurs an energy (i.e., battery) cost. Activating more sensors improves the estimation of the unknown state, which introduces an energy-accuracy tradeoff. Our goal is to minimize the combined energy and state misclassification costs over time. Activating sensors now also provides measurements that can be used to learn the model, improving future decisions. Therefore, the learning aspect is intertwined with the energy-accuracy tradeoff. While Reinforcement Learning (RL) is often used when the model is unknown, it cannot be directly applied in health monitoring since the controller does not know the (health) state. Therefore, the monitoring problem is a partially observable Markov decision process (POMDP) where the cost feedback is also only partially available since the misclassification cost is unknown. To overcome this difficulty, we propose a monitoring algorithm that combines RL for POMDPs and online estimation of the expected misclassification cost based on a Hidden Markov Model (HMM). We show empirically that our algorithm achieves comparable performance with a monitoring system that assumes a known transition matrix and quantizes the belief state. It also outperforms the model-based approach where the estimated transition matrix is used for value iteration. Thus, our algorithm can be useful in designing energy-efficient and personalized health monitoring systems.
KW - POMDP
KW - Reinforcement learning
KW - WBAN
KW - health monitoring
UR - http://www.scopus.com/inward/record.url?scp=85144813447&partnerID=8YFLogxK
U2 - 10.1109/LCSYS.2022.3229074
DO - 10.1109/LCSYS.2022.3229074
M3 - Article
AN - SCOPUS:85144813447
SN - 2475-1456
VL - 7
SP - 955
EP - 960
JO - IEEE Control Systems Letters
JF - IEEE Control Systems Letters
ER -