ICSE 2022
Sun 8 - Fri 27 May 2022
Thu 12 May 2022 11:00 - 11:05 at ICSE room 4-odd hours - Software Testing 13 Chair(s): Peter C. Rigby
Thu 12 May 2022 22:05 - 22:10 at ICSE room 2-even hours - Release Engineering and DevOps 2 Chair(s): Xin Peng

Testing is expensive, and batching tests has the potential to reduce test costs. The continuous integration strategy of testing each commit or change individually helps to quickly identify faults but leads to a maximal number of test executions. Large companies that have a massive number of commits, e.g., Google and Facebook, or that have expensive test infrastructure, e.g., Ericsson, must batch changes together to reduce the total number of test runs. For example, if eight builds are batched together and there is no failure, then we have tested eight builds with one execution, saving seven executions. However, when a failure occurs, it is not immediately clear which build caused the failure. A bisection is run to isolate the failing build, i.e., the culprit build. In our eight-build example, a failure requires an additional six executions, resulting in a saving of only one execution.
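
To make the batching and bisection arithmetic concrete, here is a minimal sketch in Python of bisecting a failing batch down to the culprit build; `batch_fails` is a hypothetical test runner standing in for a real CI execution. With eight builds and a single culprit it uses seven executions, and a fully passing batch uses only one.

```python
from typing import Callable, List, Set, Tuple

def batch_bisect(builds: List[str],
                 batch_fails: Callable[[List[str]], bool]) -> Tuple[Set[str], int]:
    """Test a batch once; on failure, bisect recursively to isolate culprits.

    Returns the culprit builds and the number of test executions used.
    `batch_fails` is a hypothetical runner that tests the combined changes.
    """
    executions = 1
    if not batch_fails(builds):        # one green run clears the whole batch
        return set(), executions
    if len(builds) == 1:               # a failing batch of one is the culprit
        return {builds[0]}, executions
    mid = len(builds) // 2
    left_culprits, left_execs = batch_bisect(builds[:mid], batch_fails)
    right_culprits, right_execs = batch_bisect(builds[mid:], batch_fails)
    return left_culprits | right_culprits, executions + left_execs + right_execs

if __name__ == "__main__":
    builds = [f"build-{i}" for i in range(1, 9)]       # eight queued builds
    broken = {"build-5"}                               # assume build-5 is the culprit
    print(batch_bisect(builds, lambda b: any(x in broken for x in b)))  # ({'build-5'}, 7)
    print(batch_bisect(builds, lambda b: False))                        # (set(), 1)
```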

In this work, we re-evaluate batching approaches developed in industry on large open source projects that use Travis CI. We also introduce novel batching approaches. In total, we evaluate six approaches. The first is the baseline approach that tests each build individually (TestAll). The second is the existing bisection approach (BatchBisect). The third uses a batch size of four (Batch4), which we show mathematically reduces the number of executions without requiring bisection. The fourth combines the two prior techniques by introducing a stopping condition to the bisection (BatchStop4). The final two approaches use models of build change risk to isolate risky changes and test them in smaller batches. Compared to the TestAll baseline, the approaches reduce the average number of build test executions across projects by 46%, 48%, 50%, 44%, and 49% for BatchBisect, Batch4, BatchStop4, RiskTopN, and RiskBatch, respectively. The greatest reduction in executions, 50%, comes from BatchStop4. However, the simpler Batch4 approach does not require bisection and still achieves a 48% reduction. In a larger sample of projects, we find that a project’s failure rate is strongly negatively correlated with execution savings (Spearman r = −0.97, p << 0.001). Using Batch4, 85% of projects see savings, and all projects whose builds fail less than 40% of the time benefit from batching. In terms of feedback time, compared to TestAll, BatchBisect, Batch2, Batch4, and BatchStop4 reduce the average feedback time by 33%, 16%, 32%, and 37%, respectively. Simple batching not only saves resources but also reduces feedback time, without introducing any slip-throughs and without changing the test run order.
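
As an illustration of the stopping condition, the sketch below extends the bisection above under one plausible reading of BatchStop4 (an assumption, not necessarily the paper's exact algorithm): bisect a failing batch as usual, but once it contains four or fewer builds, stop bisecting and test each remaining build individually, as Batch4 would.

```python
from typing import Callable, List, Set, Tuple

def batch_stop4(builds: List[str],
                batch_fails: Callable[[List[str]], bool]) -> Tuple[Set[str], int]:
    """BatchBisect with a stopping condition (assumed reading of BatchStop4):
    once a failing batch is down to four or fewer builds, skip further
    bisection and test each build individually."""
    executions = 1
    if not batch_fails(builds):                 # the whole batch passes
        return set(), executions
    if len(builds) <= 4:                        # stopping condition: Batch4-style individual runs
        culprits = set()
        for build in builds:
            executions += 1
            if batch_fails([build]):
                culprits.add(build)
        return culprits, executions
    mid = len(builds) // 2
    left_culprits, left_execs = batch_stop4(builds[:mid], batch_fails)
    right_culprits, right_execs = batch_stop4(builds[mid:], batch_fails)
    return left_culprits | right_culprits, executions + left_execs + right_execs
```

For the eight-build example with one culprit this also costs seven executions (one for the full batch, two for the halves, four for the failing half's individual runs), but the stopping condition avoids the deepest bisection levels when several builds in a small batch are broken.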

We suggest that most projects should adjust their CI pipelines to use a batch size of at least two. We release our scripts and data for replication, as well as the BatchBuilder tool, which automatically batches submitted changes on GitHub for testing on Travis CI. Since the tool reports individual results for each pull request or pushed commit, the batching happens in the background and the development process is unchanged.
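
The sketch below illustrates the background-batching workflow described above, not BatchBuilder's actual implementation; `fetch_pending_changes`, `trigger_ci_run`, and `report_status` are hypothetical stand-ins for the GitHub and Travis CI integration, and the Batch4-style fallback on failure is an assumption.

```python
from typing import List

# Hypothetical stand-ins for the real GitHub/Travis CI integration -- not BatchBuilder's API.
def fetch_pending_changes() -> List[str]:
    return [f"pr-{n}" for n in range(1, 9)]          # pretend eight pull requests are queued

def trigger_ci_run(changes: List[str]) -> bool:
    return "pr-5" not in changes                     # pretend pr-5 is the only change that breaks the build

def report_status(change: str, passed: bool) -> None:
    print(f"{change}: {'passed' if passed else 'failed'}")   # would post a commit status on GitHub

def run_batched_ci(batch_size: int = 4) -> None:
    """Batch pending changes, run CI once per batch, and still report an
    individual result for every change, so the workflow looks unchanged."""
    pending = fetch_pending_changes()
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        if trigger_ci_run(batch):
            for change in batch:                     # one green run clears the whole batch
                report_status(change, passed=True)
        else:
            for change in batch:                     # isolate culprits with individual runs
                report_status(change, passed=trigger_ci_run([change]))

if __name__ == "__main__":
    run_batched_ci()
```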

Thu 12 May

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:00: Software Testing 13 (ICSE room 4-odd hours)
11:00
5m
Talk
Software Batch Testing to Save Build Test Resources and to Reduce Feedback Time
Journal-First Papers
Mohammad Javad Beheshtian Concordia University, Amir Bavand Concordia University, Peter Rigby Concordia University, Montreal, Canada
Link to publication DOI Media Attached
11:05
5m
Talk
A Family of Experiments on Test-Driven Development
Journal-First Papers
Adrian Santos Parrilla University of Oulu, Sira Vegas Universidad Politecnica de Madrid, Oscar Dieste Universidad Politécnica de Madrid, Fernando Uyaguari ETAPA Telecommunications Company, Ayse Tosun Istanbul Technical University, Davide Fucci Blekinge Institute of Technology, Burak Turhan University of Oulu, Giuseppe Scanniello University of Basilicata, Simone Romano University of Bari, Itir Karac University of Oulu, Marco Kuhrmann Reutlingen University, Vladimir Mandić Faculty of Technical Sciences, University of Novi Sad, Robert Ramač Faculty of Technical Sciences, University of Novi Sad, Dietmar Pfahl University of Tartu, Christian Engblom Ericsson, Jarno Kyykka Ericsson, Kerli Rungi Testlio, Carolina Palomeque ETAPA Telecommunications Company, Jaroslav Spisak PAF, Markku Oivo University of Oulu, Natalia Juristo Universidad Politecnica de Madrid
Link to publication DOI Pre-print Media Attached
11:10
5m
Talk
Prioritizing Mutants to Guide Mutation Testing
Technical Track
Samuel Kaufman University of Washington, Ryan Featherman University of Washington, Justin Alvin University of Massachusetts Amherst, Bob Kurtz George Mason University, USA, Paul Ammann George Mason University, USA, René Just University of Washington
DOI Pre-print Media Attached
11:15
5m
Talk
Automated Testing of Software that Uses Machine Learning APIs
Technical Track
Chengcheng Wan The University of Chicago, Shicheng Liu University of Chicago, Sophie Xie University of California, Berkeley, Yifan Liu University of Chicago, Henry Hoffmann University of Chicago, Michael Maire University of Chicago, Shan Lu University of Chicago
Pre-print Media Attached
11:20
5m
Talk
CONFETTI: Amplifying Concolic Guidance for Fuzzers
Technical Track
James Kukucka George Mason University, Luís Pina University of Illinois at Chicago, Paul Ammann George Mason University, USA, Jonathan Bell Northeastern University
Pre-print Media Attached
11:25
5m
Talk
On the Reliability of Coverage-Based Fuzzer Benchmarking
Technical Track
Marcel Böhme MPI-SP, Germany and Monash University, Australia, Laszlo Szekeres Google, Jonathan Metzman Google
DOI Pre-print Media Attached
22:00 - 23:00: Release Engineering and DevOps 2 (ICSE room 2-even hours)
22:00
5m
Talk
An Empirical Study on Release Notes Patterns of Popular Apps in the Google Play Store
Journal-First Papers
Aidan Z.H. Yang Carnegie Mellon University, Safwat Hassan Thompson Rivers University, Ying Zou Queen's University, Kingston, Ontario, Ahmed E. Hassan Queen's University
Link to publication DOI Pre-print Media Attached
22:05
5m
Talk
Software Batch Testing to Save Build Test Resources and to Reduce Feedback Time
Journal-First Papers
Mohammad Javad Beheshtian Concordia University, Amir Bavand Concordia University, Peter Rigby Concordia University, Montreal, Canada
Link to publication DOI Media Attached
22:10
5m
Talk
DevOps Education: An Interview Study of Challenges and Recommendations
SEET - Software Engineering Education and Training
Marcelo Fernandes Federal Institute of Rio Grande do Norte, Samuel Ferino Federal University of Rio Grande do Norte, Anny Fernandes Federal University of Rio Grande do Norte, Uirá Kulesza Federal University of Rio Grande do Norte, Eduardo Aranha Federal University of Rio Grande do Norte, Christoph Treude University of Melbourne
Pre-print Media Attached
22:15
5m
Talk
Lessons from Eight Years of Operational Data from a Continuous Integration Service: A Case Study of CircleCI (Nominated for Distinguished Paper)
Technical Track
Keheliya Gallaba McGill University, Maxime Lamothe Polytechnique Montréal, Shane McIntosh University of Waterloo
Pre-print Media Attached
22:20
5m
Talk
Towards Language-independent Brown Build Detection
Technical Track
Doriane Olewicki Polytechnique Montréal, Mathieu Nayrolles Ubisoft Montreal, Bram Adams Queen's University, Kingston, Ontario
Link to publication Media Attached

Information for Participants
Thu 12 May 2022 11:00 - 12:00 at ICSE room 4-odd hours - Software Testing 13 Chair(s): Peter C. Rigby
Info for room ICSE room 4-odd hours: the room is hosted on Midspace.

Thu 12 May 2022 22:00 - 23:00 at ICSE room 2-even hours - Release Engineering and DevOps 2 Chair(s): Xin Peng
Info for room ICSE room 2-even hours: the room is hosted on Midspace.