Property-based Testing within ML Projects: An Empirical Study (NIER Paper)
In property-based testing (PBT), developers specify properties that they expect the system under test to satisfy. The PBT tool generates random inputs for the system and checks, for each input, whether the given property holds. An advantage of this approach over testing a set of manually defined example inputs is that it can achieve higher code coverage.
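To make the approach concrete, consider the following minimal Hypothesis test; it is an illustrative sketch of ours, not taken from any of the studied projects. The property asserts that sorting a list is idempotent, and Hypothesis generates the random input lists:

    from hypothesis import given, strategies as st

    @given(st.lists(st.integers()))
    def test_sorting_is_idempotent(xs):
        # Hypothesis generates many random integer lists;
        # the property must hold for every generated input.
        once = sorted(xs)
        assert sorted(once) == once

When a generated input violates the property, Hypothesis additionally shrinks it to a minimal failing example before reporting it.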
Machine learning (ML) projects, however, often have to process large amounts of diverse data, both when training a model and later, when the trained model is deployed. Generating sufficiently diverse data for property-based tests is therefore challenging.
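As a sketch of why data generation is harder in ML code, consider a property over a simple preprocessing step; the function min_max_scale and all other names here are hypothetical, chosen by us for illustration. The test must generate arbitrary two-dimensional floating-point arrays, for which Hypothesis offers the hypothesis.extra.numpy strategies:

    import numpy as np
    from hypothesis import given, strategies as st
    from hypothesis.extra.numpy import arrays

    def min_max_scale(x):
        # Toy preprocessing step: linearly rescale values into [0, 1].
        lo, hi = x.min(), x.max()
        if hi == lo:  # constant input: map everything to 0
            return np.zeros_like(x)
        return (x - lo) / (hi - lo)

    @given(arrays(dtype=np.float64,
                  shape=st.tuples(st.integers(1, 20), st.integers(1, 5)),
                  elements=st.floats(-1e6, 1e6,
                                     allow_nan=False, allow_infinity=False)))
    def test_scaled_values_stay_in_unit_interval(x):
        scaled = min_max_scale(x)
        assert scaled.shape == x.shape
        assert np.all(scaled >= 0.0) and np.all(scaled <= 1.0)

Even this small example has to restrict the element range and exclude NaN and infinity; generating realistic images, text, or model inputs requires considerably more elaborate strategies.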
In this paper, we present the results of a preliminary study in which we examined a dataset of 58 open-source ML projects that depend on the popular PBT library Hypothesis, to identify the issues developers face when writing property-based tests. For a subset of 28 of these projects, we study the property-based tests in detail and report on which part of the ML project is tested as well as on the adopted data generation strategies. In this way, we aim to identify issues in porting current PBT techniques to ML projects so that they can be addressed in future work.