TerzoN: Human-in-the-Loop Software Testing with a Composite Oracle (FSE 2025 - Research Papers)

Mon 23 - Fri 27 June 2025 Trondheim, Norway

co-located with ISSTA 2025

Who

Matthew C. Davis, Amy Wei, Brad A. Myers, Joshua Sunshine

Track

FSE 2025 Research Papers

When

Tue 24 Jun 2025 15:00 - 15:20 at Cosmos 3C - Testing 3 Chair(s): Dan Hao

Abstract

Software testing is difficult, tedious, and costs an estimated $48–87 billion USD/year in US labor. Automatic test generation tools aim to ease this burden but have important trade-offs. Fuzzers use an implicit oracle that can detect obviously invalid results. However, there is no general solution to the oracle problem, and an implicit oracle cannot automatically evaluate correctness. Test suite generators like EvoSuite use the program under test as the oracle and therefore cannot evaluate correctness. Property-based testing (PBT) tools evaluate correctness, but users find it difficult to come up with properties to test, and to understand whether the properties are correct. Consequently, adoption of these tools has been narrow, and test suites continue to be created manually by practitioners, who often use an example-based oracle to specify correct input and output examples.

To help bridge the gaps among oracles and tools, we present a Composite Oracle that incorporates implicit, property-based, and example-based oracles. To help us understand the practical properties of a Composite Oracle, we built TerzoN, an Automatic Test sUite Generator (ATUG) that implements a Composite Oracle. TerzoN displays all the test results in an integrated view composed from the results of the 3 types of oracles and finds some types of test assertion inconsistencies that might otherwise lead to misleading test results. We evaluated TerzoN with its Composite Oracle in a randomized controlled human trial with 14 professional software engineers using a popular industry tool, fast-check, as the control. Participants using TerzoN elicited 72% more bugs (p<0.01), accurately described more than twice the number of bugs (p<0.01) and tested 16% more quickly (p<0.05).
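
To illustrate how implicit, property-based, and example-based oracles can be combined, here is a minimal TypeScript sketch. It is not TerzoN's implementation and does not use fast-check. Every name in it (sortNumbers, compositeOracle, inconsistentExamples, and the other helpers) is hypothetical, the function under test carries a deliberately seeded bug, and the input generator is a simple seeded stand-in for a real test generator.

```typescript
// Hypothetical sketch of a composite oracle. Not TerzoN's implementation.
type Fn = (xs: number[]) => number[];
type OracleKind = "implicit" | "property" | "example";

interface OracleResult {
  oracle: OracleKind;
  input: number[];
  verdict: "pass" | "fail";
  detail: string;
}

interface Example {
  input: number[];
  expected: number[];
}

// Function under test (assumed for illustration). Seeded bug: duplicates are dropped.
function sortNumbers(xs: number[]): number[] {
  return xs.filter((v, i) => xs.indexOf(v) === i).sort((a, b) => a - b);
}

function sameArray(a: number[], b: number[]): boolean {
  return a.length === b.length && a.every((v, i) => v === b[i]);
}

// Property: the output keeps the input's length and is in non-decreasing order.
function holdsProperty(input: number[], output: number[]): boolean {
  const ordered = output.every((v, i) => i === 0 || output[i - 1] <= v);
  return output.length === input.length && ordered;
}

// Runs f on a copy of the input; returns null if f throws.
function safeCall(f: Fn, input: number[]): number[] | null {
  try {
    return f(input.slice());
  } catch (e) {
    return null;
  }
}

function result(oracle: OracleKind, input: number[], ok: boolean, detail: string): OracleResult {
  return { oracle, input, verdict: ok ? "pass" : "fail", detail };
}

// Implicit oracle: flags only obviously invalid results (a crash or NaN).
function implicitOracle(f: Fn, input: number[]): OracleResult {
  const out = safeCall(f, input);
  const ok = out !== null && !out.some((v) => isNaN(v));
  return result("implicit", input, ok, ok ? "no crash, no NaN" : "crash or NaN");
}

// Property-based oracle: checks a user-stated property on a generated input.
function propertyOracle(f: Fn, input: number[]): OracleResult {
  const out = safeCall(f, input);
  const ok = out !== null && holdsProperty(input, out);
  return result("property", input, ok, ok ? "property holds" : "length or order violated");
}

// Example-based oracle: compares against a user-supplied input/output pair.
function exampleOracle(f: Fn, ex: Example): OracleResult {
  const out = safeCall(f, ex.input);
  const ok = out !== null && sameArray(out, ex.expected);
  return result("example", ex.input, ok, ok ? "matches example" : "differs from example");
}

// Seeded generator of small arrays (values 0..3, length 0..5), so that
// duplicates are likely. Assumes a non-negative integer seed.
function randomInputs(count: number, seed: number): number[][] {
  let state = (seed % 2147483646) + 1;
  const next = (): number => {
    state = (state * 48271) % 2147483647;
    return state / 2147483647;
  };
  const inputs: number[][] = [];
  for (let i = 0; i < count; i++) {
    const len = Math.floor(next() * 6);
    const xs: number[] = [];
    for (let j = 0; j < len; j++) xs.push(Math.floor(next() * 4));
    inputs.push(xs);
  }
  return inputs;
}

// Composite oracle: one integrated list of verdicts from all three oracle types.
function compositeOracle(f: Fn, generated: number[][], examples: Example[]): OracleResult[] {
  const results: OracleResult[] = [];
  for (const input of generated) {
    results.push(implicitOracle(f, input));
    results.push(propertyOracle(f, input));
  }
  for (const ex of examples) results.push(exampleOracle(f, ex));
  return results;
}

// Inconsistency check: examples whose expected output contradicts the property.
function inconsistentExamples(examples: Example[]): Example[] {
  return examples.filter((ex) => !holdsProperty(ex.input, ex.expected));
}
```

For the input [2, 1, 2], the implicit oracle passes because nothing crashes and no NaN appears, while the property oracle fails because the output is shorter than the input. That is the kind of bug an implicit oracle alone cannot catch. Calling compositeOracle(sortNumbers, randomInputs(20, 7), examples) would return the combined verdict list for generated inputs plus any user examples, and inconsistentExamples flags examples that disagree with the stated property before they produce misleading results.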

Link to Publication

https://dl.acm.org/doi/10.1145/3729359

DOI

https://doi.org/10.1145/3729359

Matthew C. Davis

Carnegie Mellon University

United States

Amy Wei

University of Michigan

Brad A. Myers

Carnegie Mellon University

Joshua Sunshine

Carnegie Mellon University

Session Program

Tue 24 Jun
Displayed time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:30	Testing 3 (Research Papers / Industry Papers / Ideas, Visions and Reflections / Journal First) at Cosmos 3C. Chair(s): Dan Hao (Peking University)

14:00 20m Talk		LlamaRestTest: Effective REST API Testing with Small Language Models. Research Papers. Myeongsoo Kim (Georgia Institute of Technology), Saurabh Sinha (IBM Research), Alessandro Orso (Georgia Institute of Technology)
14:20 20m Talk		Testing Updated Apps by Adapting Learned Models. Journal First. Chanh Duc Ngo (University of Luxembourg), Fabrizio Pastore (University of Luxembourg), Lionel Briand (University of Ottawa, Canada; Lero centre, University of Limerick, Ireland)
14:40 20m Talk		Automated Testing of COBOL to Java Transformation. Industry Papers. Sandeep Hans (IBM India Research Lab), Atul Kumar (IBM Research India), Toshiaki Yasue (IBM Research - Tokyo), Kohichi Ono (IBM Research - Tokyo), Saravanan Krishnan (IBM India Research Lab), Devika Sondhi (IBM Research), Fumiko Satoh (IBM Research - Tokyo), Gerald Mitchell (IBM Software), Sachin Kumar (IBM Software), Diptikalyan Saha (IBM Research India)
15:00 20m Talk		TerzoN: Human-in-the-Loop Software Testing with a Composite Oracle. Research Papers. Matthew C. Davis (Carnegie Mellon University), Amy Wei (University of Michigan), Brad A. Myers (Carnegie Mellon University), Joshua Sunshine (Carnegie Mellon University)
15:20 10m Talk		Efficient Test Generation for Dynamic Behaviors Leveraging Token-Level Input Commonalities. Ideas, Visions and Reflections. Yuxin Qiu (University of California at Riverside), Qian Zhang (University of California at Riverside)

Information for Participants

Tue 24 Jun 2025 14:00 - 15:30 at Cosmos 3C - Testing 3 Chair(s): Dan Hao

Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.