Tests from Traces: Automated Unit Test Extraction for R (ISSTA paper)
Unit tests are labor-intensive to write and maintain. This paper investigates whether tests for a software package can be generated automatically from the execution traces of that package's clients. Our objectives are to reduce the effort of creating test suites and to minimize the number and size of individual tests while maximizing coverage. To evaluate the viability of our approach, we selected a challenging target for automated test generation: R, a programming language that is popular for data science applications. The challenges presented by R are its extreme dynamism and its lack of types; in combination, they decrease the efficacy of traditional test generation techniques. We present Genthat, a tool that we have developed over the last couple of years to non-invasively record execution traces of R programs and extract unit tests from those traces. We have carried out an evaluation on 1.7M lines of R code. The unit tests generated by Genthat improved code coverage on average from 267,113 lines to 704,450 lines.