Verifiably Safe Off-Model Reinforcement Learning
The possibility of using reinforcement learning in safety-critical settings have inspired several recent approaches toward obtaining formal safety guarantees for learning algorithms. Existing formal methods for learning and optimization primarily consider the problem of constrained learning or constrained optimization. Given a single correct model and associated safety constraint, these approaches guarantee efficient learning while provably avoiding behaviors outside the safety constraint. Acting well given an accurate environmental model is an important pre-requisite for safe learning, but is ultimately insufficient for systems that operate in complex heterogeneous environments.
This paper introduces verification-preserving model updates, the first approach toward obtaining formal safety guarantees for reinforcement learning in heterogeneous environments. Verification-preserving model updates (VPMUs) are proof-preserving manipulations to hybrid programs and their associated safety constraints. At design-time, VPMUs provide a framework for combining inductive and deductive program synthesis of provably safe hybrid dynamical systems. We also introduce model update learning, a family of reinforcement learning algorithms that leverage VPMUs to efficiently select accurate environmental models at runtime. VPMUs and model update learning provide the first approach toward obtaining formal proofs obtained via reinforcement learning algorithms even when an accurate environmental model is not available.
Tue 9 Apr Times are displayed in time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna change
|14:00 - 14:30|
Ernst Moritz HahnQueen's University Belfast, Mateo Perez, Sven ScheweUniversity of Liverpool, Fabio Somenzi, Ashutosh Trivedi, Dominik WojtczakLink to publication
|14:30 - 15:00|
|Link to publication|