Using Active Learning to Find High-Fidelity Builds (MSR 2022 - Technical Papers)

Who

Harshitha Menon, Konstantinos Parasyris, Todd Gamblin, Tom Scogland

Track

MSR 2022 Technical Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 18 May 2022 13:21 - 13:28 at MSR Main room - odd hours - Session 4: Software Quality (Bugs & Smells) Chair(s): Maxime Lamothe, Mahmoud Alfadel

Abstract

Modern software is incredibly complex. A typical application may comprise hundreds or thousands of reusable components. Automated package managers can help to maintain a consistent set of dependency versions, but ultimately the solvers in these systems rely on constraints generated by humans. At scale, small errors add up, and it becomes increasingly difficult to find high-fidelity configurations. We cannot test all configurations, because the space is combinatorial, so exhaustive exploration is infeasible. In this paper, we present Reliabuild, an auto-tuning framework that efficiently explores the build configuration space and learns which package versions are likely to result in a successful configuration. We implement two models in Reliabuild to rank the different configurations and use adaptive sampling to select good configurations with fewer samples. We demonstrate the effectiveness of Reliabuild by evaluating 31,186 build configurations of 61 packages from the Extreme-scale Scientific Software Stack (E4S), and we show that Reliabuild selects good configurations efficiently. For example, Reliabuild selects 3× the number of good configurations in comparison to random sampling for several packages including Abyss, Bolt, libnrm, and OpenMPI. Our framework is also able to select all the high-fidelity builds in half the number of samples required by random sampling for packages such as Chai, OpenMPI, py-petsc4py, and slepc. We further use the model to learn statistics about the compatibility of different packages, which will enable package solvers to better select high-fidelity build configurations automatically.
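The abstract does not spell out the two ranking models, so the following is only a rough sketch of the general idea of adaptive sampling over build configurations, not Reliabuild's implementation. Everything in it is an assumption made for illustration: the version lists, the simulated `build_succeeds` oracle, the naive per-version success-rate `score`, and the `explore` parameter are all hypothetical.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical version lists for illustration; not taken from E4S.
PKG_VERSIONS = ["1.0", "1.1", "1.2", "2.0"]
DEP_VERSIONS = ["2.0", "2.1", "2.2"]
COMPILERS = ["gcc", "clang"]


def build_succeeds(config):
    """Simulated build oracle (an assumption): stands in for a real build."""
    pkg, dep, compiler = config
    return pkg != "1.0" and not (dep == "2.1" and compiler == "clang")


def score(config, wins, tries):
    """Rank a configuration by the product of Laplace-smoothed success
    rates of each (slot, version) choice seen so far."""
    result = 1.0
    for slot, value in enumerate(config):
        key = (slot, value)
        result *= (wins[key] + 1) / (tries[key] + 2)
    return result


def adaptive_sample(space, budget, oracle, rng, explore=0.1):
    """Repeatedly build the best-ranked untested configuration and update
    the per-version statistics; occasionally pick at random to explore."""
    wins = defaultdict(int)
    tries = defaultdict(int)
    untested = list(space)
    history = []
    for _ in range(min(budget, len(untested))):
        if rng.random() < explore:
            pick = rng.choice(untested)
        else:
            pick = max(untested, key=lambda c: score(c, wins, tries))
        untested.remove(pick)
        ok = oracle(pick)
        for slot, value in enumerate(pick):
            tries[(slot, value)] += 1
            wins[(slot, value)] += int(ok)
        history.append((pick, ok))
    return history


def random_sample(space, budget, oracle, rng):
    """Baseline: build a uniformly random subset of configurations."""
    picks = rng.sample(list(space), min(budget, len(space)))
    return [(config, oracle(config)) for config in picks]


if __name__ == "__main__":
    space = list(itertools.product(PKG_VERSIONS, DEP_VERSIONS, COMPILERS))
    budget = 10
    adaptive = adaptive_sample(space, budget, build_succeeds, random.Random(0))
    baseline = random_sample(space, budget, build_succeeds, random.Random(0))
    print("adaptive good builds:", sum(ok for _, ok in adaptive))
    print("random good builds:  ", sum(ok for _, ok in baseline))
```

The sketch treats each version choice as independent, which a real model would likely not do; it is meant only to show the loop of ranking, building, and updating. Nothing here reproduces the paper's numbers.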

Link to Preprint

http://www.harshithamenon.com/papers/Reliabuild-MSR-preprint.pdf

Harshitha Menon

Lawrence Livermore National Laboratory

Konstantinos Parasyris

Lawrence Livermore National Laboratory

Todd Gamblin

Lawrence Livermore National Laboratory

Tom Scogland

Lawrence Livermore National Laboratory

Session Program

Wed 18 May
Displayed time zone: Eastern Time (US & Canada)

13:00 - 13:50	Session 4: Software Quality (Bugs & Smells) Data and Tool Showcase Track / Technical Papers at MSR Main room - odd hours Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Mahmoud Alfadel University of Waterloo

13:00 7m Talk		Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue Technical Papers Rui Shu North Carolina State University, Tianpei Xia North Carolina State University, Laurie Williams North Carolina State University, Tim Menzies North Carolina State University
13:07 7m Talk		To What Extent do Deep Learning-based Code Recommenders Generate Predictions by Cloning Code from the Training Set? Technical Papers Matteo Ciniselli Università della Svizzera Italiana, Luca Pascarella Università della Svizzera italiana (USI), Gabriele Bavota Software Institute, USI Università della Svizzera italiana Pre-print
13:14 7m Talk		How to Improve Deep Learning for Software Analytics (a case study with code smell detection) Technical Papers Rahul Yedida, Tim Menzies North Carolina State University Pre-print
13:21 7m Talk		Using Active Learning to Find High-Fidelity Builds Technical Papers Harshitha Menon Lawrence Livermore National Lab, Konstantinos Parasyris Lawrence Livermore National Laboratory, Todd Gamblin Lawrence Livermore National Laboratory, Tom Scogland Lawrence Livermore National Laboratory Pre-print
13:28 4m Talk		ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction Data and Tool Showcase Track Hossein Keshavarz David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada, Mei Nagappan University of Waterloo Pre-print
13:32 4m Talk		ReCover: a Curated Dataset for Regression Testing Research Data and Tool Showcase Track Francesco Altiero Università degli Studi di Napoli Federico II, Anna Corazza Università degli Studi di Napoli Federico II, Sergio Di Martino Università degli Studi di Napoli Federico II, Adriano Peron Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II
13:36 14m Live Q&A		Discussions and Q&A Technical Papers
