ETAPS 2019
Sat 6 - Thu 11 April 2019 Prague, Czech Republic
Tue 9 Apr 2019 14:30 - 15:00 at JUPITER - Machine Learning Chair(s): Bernhard Steffen

The possibility of using reinforcement learning in safety-critical settings has inspired several recent approaches toward obtaining formal safety guarantees for learning algorithms. Existing formal methods for learning and optimization primarily consider the problem of constrained learning or constrained optimization. Given a single correct model and an associated safety constraint, these approaches guarantee efficient learning while provably avoiding behaviors outside the safety constraint. Acting well given an accurate environmental model is an important prerequisite for safe learning, but is ultimately insufficient for systems that operate in complex heterogeneous environments.

This paper introduces verification-preserving model updates, the first approach toward obtaining formal safety guarantees for reinforcement learning in heterogeneous environments. Verification-preserving model updates (VPMUs) are proof-preserving manipulations to hybrid programs and their associated safety constraints. At design time, VPMUs provide a framework for combining inductive and deductive program synthesis of provably safe hybrid dynamical systems. We also introduce model update learning, a family of reinforcement learning algorithms that leverage VPMUs to efficiently select accurate environmental models at runtime. Together, VPMUs and model update learning provide the first approach toward obtaining formal safety proofs for reinforcement learning even when an accurate environmental model is not available in advance.
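The runtime side of this idea can be illustrated with a small sketch: keep a set of candidate environment models (each with a safety monitor assumed to be verified offline), permit only actions that every surviving model considers safe, and discard models whose predictions are contradicted by observations. This is a minimal hypothetical illustration, not the paper's algorithm; all names, dynamics, and the elimination rule are assumptions for the example.

```python
# Hypothetical sketch of runtime model selection among verified candidate
# models. Names and dynamics are illustrative, not taken from the paper.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CandidateModel:
    """A candidate environment model paired with a safety monitor
    (assumed to have been verified offline for this model)."""
    name: str
    predict: Callable[[float, float], float]  # (state, action) -> next state
    safe: Callable[[float, float], bool]      # safety monitor for (state, action)

def eliminate_falsified(models: List[CandidateModel], state: float,
                        action: float, observed: float,
                        tol: float = 1e-3) -> List[CandidateModel]:
    """Keep only models whose prediction matches the observed next state."""
    return [m for m in models if abs(m.predict(state, action) - observed) <= tol]

def safe_actions(models: List[CandidateModel], state: float,
                 actions: List[float]) -> List[float]:
    """An action is permitted only if every surviving model deems it safe."""
    return [a for a in actions if all(m.safe(state, a) for m in models)]

# Toy environment: x' = x + c*a with unknown gain c; safety means x' <= 10.
models = [
    CandidateModel("gain=1", lambda x, a: x + a,     lambda x, a: x + a <= 10),
    CandidateModel("gain=2", lambda x, a: x + 2 * a, lambda x, a: x + 2 * a <= 10),
]

x, true_gain = 0.0, 2.0
for _ in range(3):
    a = max(safe_actions(models, x, [0.0, 1.0, 2.0]))  # greedy but provably safe
    observed = x + true_gain * a                       # true environment responds
    models = eliminate_falsified(models, x, a, observed)
    x = observed

print([m.name for m in models], x)  # the inaccurate model is falsified; x stays <= 10
```

Because actions are filtered through the safety monitors of all un-falsified models, the agent never acts in a way that any still-plausible model flags as unsafe, while observations progressively narrow the model set.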

Tue 9 Apr

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:00
Machine Learning (TACAS) at JUPITER
Chair(s): Bernhard Steffen Technical University Dortmund
14:00
30m
Talk
Omega-Regular Objectives in Model-Free Reinforcement Learning
TACAS
Ernst Moritz Hahn Queen's University Belfast, Mateo Perez, Sven Schewe University of Liverpool, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
14:30
30m
Talk
Verifiably Safe Off-Model Reinforcement Learning
TACAS
Nathan Fulton MIT-IBM Watson AI Lab, André Platzer Carnegie Mellon University