Omega-Regular Objectives in Model-Free Reinforcement Learning
We provide the first solution for model-free reinforcement learning of omega-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of omega-regular objectives to an almost-sure reachability problem and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. A key feature of our technique is the compilation of omega-regular properties into limit-deterministic Büchi automata instead of the traditional Rabin automata; this choice sidesteps difficulties that have marred previous proposals. Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. We present an experimental evaluation of our technique on benchmark learning problems.
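The abstract describes running the MDP in product with a limit-deterministic Büchi automaton (LDBA) and reducing satisfaction of the omega-regular objective to a reachability-style reward that off-the-shelf learners can optimize. The following is a minimal sketch of that idea, not the authors' implementation: the ProductEnv and LDBA interfaces (mdp.reset, mdp.step returning a label, ldba.step returning an accepting flag), the parameter zeta, and the tabular q_learning routine are illustrative assumptions, and the sketch ignores the LDBA's nondeterministic jump into its deterministic component, which a full treatment would expose to the learner as additional actions.

import random
from collections import defaultdict

class ProductEnv:
    """Sketch of a product of an MDP simulator with an LDBA.

    Whenever an accepting automaton edge is taken, the run jumps with
    probability 1 - zeta to an absorbing target state that pays reward 1;
    maximizing the probability of reaching the target then serves as a
    proxy for maximizing the probability of satisfying the objective.
    """
    def __init__(self, mdp, ldba, zeta=0.99):
        self.mdp, self.ldba, self.zeta = mdp, ldba, zeta
        self.target = "TARGET"  # absorbing goal state of the reduction

    def reset(self):
        self.s = self.mdp.reset()    # MDP state (assumed interface)
        self.q = self.ldba.initial   # automaton state (assumed interface)
        return (self.s, self.q)

    def step(self, action):
        s2, label = self.mdp.step(self.s, action)      # label: atomic propositions observed
        q2, accepting = self.ldba.step(self.q, label)  # accepting: was an accepting edge taken?
        if accepting and random.random() > self.zeta:
            return self.target, 1.0, True              # reached target: reward 1, episode ends
        self.s, self.q = s2, q2
        return (s2, q2), 0.0, False

def q_learning(env, actions, episodes=5000, max_steps=200,
               alpha=0.1, gamma=0.999, eps=0.1):
    """Off-the-shelf tabular Q-learning on the product environment."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(state, b)])
            nxt, reward, done = env.step(a)
            best_next = 0.0 if done else max(Q[(nxt, b)] for b in actions)
            Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
            state = nxt
            if done:
                break
    return Q

Any other model-free learner could replace q_learning here; the point of the reduction is that the product environment with this reward needs no model of the MDP's transition probabilities.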
Tue 9 Apr, 14:00 - 15:00 (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)

14:00 (30m, Talk) Omega-Regular Objectives in Model-Free Reinforcement Learning (TACAS)
Ernst Moritz Hahn (Queen's University Belfast), Mateo Perez, Sven Schewe (University of Liverpool), Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
Link to publication

14:30 (30m, Talk) Verifiably Safe Off-Model Reinforcement Learning (TACAS)
Link to publication