APSEC 2022
Tue 6 - Fri 9 December 2022
Fri 9 Dec 2022 13:00 - 13:20 at Room2 - Machine Learning 3 Chair(s): Atul Gupta

Reinforcement learning (RL) has been used to solve sequential decision-making problems in intelligent systems. However, current RL approaches suffer from slow convergence and reward sparsity, and their reward mechanisms struggle to express complex task specifications. Temporal logic can describe non-Markovian task specifications, and a strategy synthesized from such a specification can serve as a priori knowledge that helps agents learn to interact with the environment efficiently. This paper considers an intelligent agent that reacts to its environment under a high-level reactive temporal logic specification called Generalized Reactivity of rank 1 (GR(1)). We first use the synthesized GR(1) strategy to construct a Markov Decision Process with a potential-based reward machine, which integrates the environment with the high-level reactive temporal specifications. We then develop a topological-sort-based reward shaping approach to calculate the potential functions of the reward machine, and use Q-learning on that basis to train the agents. Experiments on multi-task learning show that the proposed approach outperforms state-of-the-art algorithms in learning rate and optimal rewards. Moreover, in contrast to value-iteration-based reward shaping approaches, our topological-sort-based approach can handle cases where the synthesized strategies take the form of directed cyclic graphs.
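As a rough illustration of the idea of deriving potentials from a topological ordering, the following minimal Python sketch assigns each reward-machine state a potential based on its depth in the order and applies the standard potential-based shaping term gamma * phi(u') - phi(u). It is not the paper's algorithm: the function names, the depth-based potential, the scaling to [0, 1], and the default discount factor are all assumptions made for illustration, and the sketch covers only acyclic state graphs (the abstract does not say how the paper treats cycles).

```python
from collections import deque


def topological_potentials(edges, num_states):
    """Assign potentials to reward-machine states 0..num_states-1 (hypothetical helper).

    edges is a list of (u, v) transitions. States that appear later in the
    topological order get larger potentials, scaled to the range [0, 1].
    Assumes num_states >= 1 and an acyclic graph.
    """
    successors = [[] for _ in range(num_states)]
    indegree = [0] * num_states
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Kahn's algorithm: repeatedly remove states with no unprocessed predecessor.
    queue = deque(s for s in range(num_states) if indegree[s] == 0)
    depth = [0] * num_states  # longest path length from any source state
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    if visited != num_states:
        raise ValueError("graph has a cycle; this sketch handles only acyclic graphs")

    max_depth = max(depth) or 1  # avoid division by zero when there are no edges
    return [d / max_depth for d in depth]


def shaped_reward(reward, phi, u, u_next, gamma=0.9):
    """Potential-based shaping: add gamma * phi(u_next) - phi(u) to the reward."""
    return reward + gamma * phi[u_next] - phi[u]


# Example: a three-state machine where state 2 marks task completion.
phi = topological_potentials([(0, 1), (1, 2), (0, 2)], num_states=3)
bonus = shaped_reward(0.0, phi, u=0, u_next=1)
```

In a Q-learning loop, the shaped reward would replace the raw environment reward in the temporal-difference update, which gives the agent intermediate feedback when the raw reward is sparse.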

Fri 9 Dec

Displayed time zone: Osaka, Sapporo, Tokyo

13:00 - 14:00
Machine Learning 3, Technical Track at Room2
Chair(s): Atul Gupta Indian Institute of Information Technology, Design and Manufacturing (IIITDM)
13:00
20m
Paper
Efficient Reinforcement Learning with Generalized-Reactivity Specifications
Technical Track
Chenyang Zhu, Yujie Cai Changzhou University, Can Hu Changzhou University, Jia Bi University of Southampton
13:20
20m
Paper
Adversarial Deep Reinforcement Learning for Improving the Robustness of Multi-agent Autonomous Driving Policies
Technical Track
Aizaz Sharif Simula Research Laboratory, Dusica Marijan Simula
13:40
20m
Paper
DronLomaly: Runtime Detection of Anomalous Drone Behaviors via Log Analysis and Deep Learning
Technical Track
Lwin Khin Shar Singapore Management University, Wei Minn Singapore Management University, Duong Ta Singapore Management University, Jiani Fan Nanyang Technological University, Lingxiao Jiang Singapore Management University, Daniel Lim Wai Kiat Singapore Management University