Metamorphic Testing of Deep Reinforcement Learning Agents with MDPMORPH
We present MDPMORPH, a tool for metamorphic testing of Deep Reinforcement Learning (DRL) agents. MDPMORPH is built on the Markov Decision Process (MDP) formulation and targets the core reasoning properties of DRL agents to automatically uncover potential faults. It generates metamorphic test suites and corresponding mutants directly from the DRL system under test. Using a subset of the metamorphic test suites and models, MDPMORPH trains the thresholds of the nine proposed Metamorphic Relations (MRs) with Stochastic Gradient Descent to determine the optimal threshold for each MR. These MRs are based on the temporal characteristics of the MDP. With the optimal thresholds in place, MDPMORPH applies the MRs to compare the execution results of different metamorphic test suites on the model under test and reports whether each test passes or fails. Finally, from the collected execution results, MDPMORPH calculates the mutant detection rate of each MR to validate its effectiveness. Experimental results show that MDPMORPH and the proposed MRs are highly effective in automatically detecting seeded faults (mutants).
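The pass/fail decision and the mutant detection rate described above can be illustrated with a minimal sketch. It is not MDPMORPH's implementation. The names `MetamorphicRelation`, `mean_abs_diff`, and `mutant_detection_rate` are hypothetical, the distance measure is an arbitrary stand-in for whatever comparison a given MR uses, and the threshold is taken as given rather than trained with Stochastic Gradient Descent. A mutant is assumed to count as detected when at least one of its metamorphic tests fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Outputs = Sequence[float]
TestPair = Tuple[Outputs, Outputs]  # (source output, follow-up output)


def mean_abs_diff(a: Outputs, b: Outputs) -> float:
    """Illustrative distance: mean absolute difference of two output traces."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("traces must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


@dataclass
class MetamorphicRelation:
    """Hypothetical threshold-based MR (assumption, not the tool's API).

    A test passes when the distance between the source and follow-up
    outputs does not exceed `threshold`.
    """
    name: str
    distance: Callable[[Outputs, Outputs], float]
    threshold: float

    def passes(self, source_out: Outputs, followup_out: Outputs) -> bool:
        return self.distance(source_out, followup_out) <= self.threshold


def mutant_detection_rate(mr: MetamorphicRelation,
                          mutant_results: List[List[TestPair]]) -> float:
    """Fraction of mutants for which at least one metamorphic test fails.

    `mutant_results` holds one list of (source, follow-up) pairs per mutant.
    """
    if not mutant_results:
        raise ValueError("at least one mutant is required")
    detected = sum(
        1 for tests in mutant_results
        if any(not mr.passes(src, fol) for src, fol in tests)
    )
    return detected / len(mutant_results)


# Made-up data: two mutants, one metamorphic test each.
mr = MetamorphicRelation("example-MR", mean_abs_diff, threshold=0.5)
results = [
    [([1.0, 2.0], [1.1, 2.1])],  # small deviation: test passes, mutant missed
    [([1.0, 2.0], [3.0, 2.0])],  # large deviation: test fails, mutant detected
]
rate = mutant_detection_rate(mr, results)
```

In this toy data set, one of the two mutants violates the relation, so `rate` is 0.5. The threshold controls the trade-off between false alarms on a correct model and missed mutants, which is why the abstract describes fitting it per MR.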