An Intentional Forgetting-Driven Self-Healing Method For Deep Reinforcement Learning Systems (ASE 2023 - Research Papers)

Who

Ahmed Haj Yahmed, Rached Bouchoucha, Houssem Ben Braiek, Foutse Khomh

Track

ASE 2023 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 14 Sep 2023 14:18 - 14:30 at Room C - Testing AI Systems 4

Abstract

Deep reinforcement learning (DRL) is increasingly applied in large-scale production systems such as those at Netflix and Facebook. As with most data-driven systems, DRL systems can exhibit undesirable behaviors due to environmental drifts, which often occur in constantly changing production settings. Continual Learning (CL) is the inherent self-healing approach for adapting the DRL agent in response to shifts in the environment’s conditions. However, successive shifts of considerable magnitude may cause the production environment to drift from its original state. Recent studies have shown that these environmental drifts tend to drive CL into long, or even unsuccessful, healing cycles, which arise from inefficiencies such as catastrophic forgetting, warm-starting failure, and slow convergence. In this paper, we propose Dr. DRL, an effective self-healing approach for DRL systems that integrates a novel mechanism of intentional forgetting into vanilla CL to overcome its main issues. Dr. DRL deliberately erases the DRL system’s minor behaviors to systematically prioritize the adaptation of the key problem-solving skills. Using well-established DRL algorithms, we compare Dr. DRL with vanilla CL on various drifted environments. On average, Dr. DRL reduces the healing time and the number of fine-tuning episodes by 18.74% and 17.72%, respectively. Dr. DRL also helps agents adapt to 19.63% of the drifted environments left unsolved by vanilla CL, while maintaining, and even enhancing by up to 45%, the rewards obtained on drifted environments that both approaches resolve.
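The abstract does not say how "minor behaviors" are identified or erased, so the following is only a rough illustration of the general idea of intentional forgetting before fine-tuning, not the authors' implementation. It assumes, purely for illustration, that each hidden unit has a mean activation score and that the least-active units stand in for minor behaviors. The function name `forget_minor_units` and the parameters `forget_fraction` and `init_scale` are hypothetical.

```python
import random


def forget_minor_units(weights, mean_activations, forget_fraction=0.25,
                       init_scale=0.1, seed=0):
    """Re-initialize the incoming weights of the least-active hidden units.

    ASSUMPTION (not from the paper): low mean activation is used as a
    stand-in for "minor behavior".

    weights: list of rows, one row of incoming weights per hidden unit.
    mean_activations: one non-negative activity score per hidden unit.
    Returns (new_weights, forgotten_indices); the input is not modified.
    """
    if len(weights) != len(mean_activations):
        raise ValueError("one activation score per hidden unit is required")
    rng = random.Random(seed)
    n_forget = int(len(weights) * forget_fraction)
    # Rank units from least to most active; the first n_forget are "minor".
    order = sorted(range(len(weights)), key=lambda i: mean_activations[i])
    minor = set(order[:n_forget])
    new_weights = []
    for i, row in enumerate(weights):
        if i in minor:
            # Erase: replace learned weights with small random values.
            new_weights.append([rng.uniform(-init_scale, init_scale) for _ in row])
        else:
            # Keep: weights of the more active units are preserved.
            new_weights.append(list(row))
    return new_weights, sorted(minor)


# Toy example: four hidden units with two inputs each.
weights = [[0.5, -0.2], [0.01, 0.02], [0.9, 0.4], [-0.03, 0.0]]
scores = [0.8, 0.05, 1.2, 0.01]
new_weights, forgotten = forget_minor_units(weights, scores, forget_fraction=0.5)
# forgotten == [1, 3]: the two least-active units were reset.
```

In a self-healing loop of the kind the abstract describes, a step like this would run once after a drift is detected, and ordinary continual-learning fine-tuning on the drifted environment would follow.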

Link to Preprint

https://arxiv.org/abs/2308.12445

Ahmed Haj Yahmed

École Polytechnique de Montréal

Canada

Rached Bouchoucha

Polytechnique Montréal

Houssem Ben Braiek

Polytechnique Montréal

Foutse Khomh

Polytechnique Montréal

Canada

Registered Presentation of the paper An Intentional Forgetting-Driven Self-Healing Method For Deep Reinforcement Learning Systems

Session Program

Thu 14 Sep
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

13:30 - 15:00	Testing AI Systems 4 (Research Papers / NIER Track) at Room C

13:30 12m Talk	Mutation-based Fault Localization of Deep Neural Networks
	Research Papers
	Ali Ghanbari (Iowa State University), Deepak-George Thomas (Dept. of Computer Science, Iowa State University), Muhammad Arbab Arshad (Dept. of Computer Science, Iowa State University), Hridesh Rajan (Iowa State University)

13:42 12m Talk	Fault Localization for Buggy Deep Learning Framework Conversions in Image Recognition
	NIER Track
	Nikolaos Louloudakis (University of Edinburgh), Perry Gibson (University of Glasgow), José Cano (University of Glasgow), Ajitha Rajan (University of Edinburgh)

13:54 12m Talk	Towards Safe Automated Refactoring of Imperative Deep Learning Programs to Graph Execution
	NIER Track
	Raffi Khatchadourian (City University of New York (CUNY) Hunter College), Tatiana Castro Vélez (City University of New York (CUNY) Graduate Center), Mehdi Bagherzadeh (Oakland University), Nan Jia (City University of New York (CUNY) Graduate Center), Anita Raja (City University of New York (CUNY) Hunter College)

14:06 12m Talk	AutoConf: Automated Configuration of Unsupervised Learning Systems using Metamorphic Testing and Bayesian Optimization
	Research Papers
	Lwin Khin Shar (Singapore Management University), Arda Goknil (SINTEF Digital), Erik Johannes Husom (SINTEF Digital), Sagar Sen, Yan Naing Tun (Singapore Management University), Kisub Kim (Singapore Management University, Singapore)

14:18 12m Talk	An Intentional Forgetting-Driven Self-Healing Method For Deep Reinforcement Learning Systems (Recorded talk)
	Research Papers
	Ahmed Haj Yahmed (École Polytechnique de Montréal), Rached Bouchoucha (Polytechnique Montréal), Houssem Ben Braiek (Polytechnique Montréal), Foutse Khomh (Polytechnique Montréal)

14:30 12m Talk	A Majority Invariant Approach to Patch Robustness Certification for Deep Learning Models (Recorded talk)
	NIER Track
	Qilin Zhou (City University of Hong Kong), Zhengyuan Wei (City University of Hong Kong, Hong Kong), Haipeng Wang (City University of Hong Kong), Wing-Kwong Chan (City University of Hong Kong, Hong Kong)