The past few years have witnessed the proliferation of quantum software stacks (QSS) developed in response to rapid hardware advances in quantum computing. A QSS includes a quantum programming language, an optimizing compiler that translates a quantum algorithm written in a high-level language into quantum gate instructions, a quantum simulator that emulates these instructions on a classical device, the control software that turns circuits into analog signals sent to the quantum computer, and execution on very expensive quantum hardware. In comparison to traditional compilers and architecture simulators, QSSes are difficult to test due to the probabilistic nature of results, the lack of clear hardware specifications, and quantum programming complexity. This work devises a novel differential testing approach for QSSes, named QDiff, with three major innovations: (1) We generate input programs to be tested via semantics-preserving, source-to-source transformations to explore program variants. (2) We speed up differential testing by filtering out quantum circuits that are not worthwhile to execute on quantum hardware, based on static characteristics such as circuit depth, 2-gate operations, gate error rates, and T1 relaxation time. (3) We design an extensible equivalence checking mechanism via distribution comparison functions such as the Kolmogorov–Smirnov test and cross entropy.
We evaluate QDiff with three widely used open-source QSSes: Qiskit from IBM, Cirq from Google, and Pyquil from Rigetti. By running QDiff on both real hardware and quantum simulators, we found several critical bugs revealing potential instabilities in these platforms. QDiff’s source transformation is effective in producing semantically equivalent yet non-identical circuits (in 34% of trials), and its filtering mechanism can speed up differential testing by 66%.
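To make the distribution-comparison idea in innovation (3) concrete, the following is a minimal sketch, not QDiff's actual implementation. It assumes each backend returns measurement results as a dictionary mapping bitstrings to shot counts; the function names and the `threshold` value are hypothetical and chosen only for illustration. The Kolmogorov–Smirnov statistic is computed over outcomes sorted lexicographically, which is one possible ordering for discrete bitstring outcomes.

```python
import math


def normalize(counts):
    """Turn raw shot counts into a probability distribution."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


def ks_statistic(counts_a, counts_b):
    """Largest gap between the two empirical CDFs.

    Assumption: outcomes are ordered lexicographically by bitstring.
    """
    p, q = normalize(counts_a), normalize(counts_b)
    cdf_p = cdf_q = 0.0
    distance = 0.0
    for outcome in sorted(set(p) | set(q)):
        cdf_p += p.get(outcome, 0.0)
        cdf_q += q.get(outcome, 0.0)
        distance = max(distance, abs(cdf_p - cdf_q))
    return distance


def cross_entropy(counts_a, counts_b, eps=1e-12):
    """Cross entropy H(p, q) in nats; eps avoids log(0) for unseen outcomes."""
    p, q = normalize(counts_a), normalize(counts_b)
    return -sum(pv * math.log(q.get(o, 0.0) + eps) for o, pv in p.items())


def looks_equivalent(counts_a, counts_b, threshold=0.1):
    """Hypothetical decision rule: accept if the K-S distance is small."""
    return ks_statistic(counts_a, counts_b) <= threshold


# Example: two runs of a Bell-state circuit versus a clearly divergent run.
run_a = {"00": 500, "11": 500}
run_b = {"00": 480, "11": 520}
run_c = {"00": 1000}
print(looks_equivalent(run_a, run_b), looks_equivalent(run_a, run_c))
```

Because the comparison function is passed two plain count dictionaries, other measures can be swapped in without changing the surrounding harness, which is the sense in which such a checker is extensible.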