Expressing and Checking Statistical Assumptions (FSE 2025 - Research Papers)

Mon 23 - Fri 27 June 2025 Trondheim, Norway

Who

Alexi Turcotte, Zheyuan Wu

Track

FSE 2025 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 25 Jun 2025 14:40 - 15:00 at Cosmos 3C - Empirical Studies 2 Chair(s): Yuchao Jiang

Abstract

Literate programming environments like Jupyter and R Markdown notebooks, coupled with easy-to-use languages like Python and R, put a plethora of statistical methods right at a data analyst’s fingertips. But are these methods being used correctly? Statistical methods make statistical assumptions about the samples being analyzed, and in many cases produce reasonable-looking results even if those assumptions are not met.

We propose an approach that allows library developers to annotate functions with statistical assumptions, phrases them as hypotheses about the data, and inserts hypothesis tests investigating the likelihood that the assumption is met. As a proof of concept, we implement this approach in two tools: prob-check-py for Python, and prob-check-r for R. To evaluate these, we identify common hypothesis testing and statistical modeling functions in Python and R, annotate them with the relevant statistical assumptions, and run 128 Kaggle notebooks that use those methods to identify misuses. Our investigation reveals that at least one statistical assumption was violated in 84.38% of surveyed notebooks, and that assumptions were violated in 53.36% of calls to annotated functions. Moreover, had the appropriate hypothesis testing method been chosen given the characteristics of the data, a different conclusion would have been drawn in 11.51% of cases.
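The general idea of annotating a function with an assumption and checking it with a hypothesis test can be sketched roughly as follows. This is an illustration only and not the prob-check-py API, which the abstract does not describe: the decorator `assumes_normal`, the helper `jarque_bera_pvalue`, and the example function `one_sample_t_statistic` are all hypothetical names. The Jarque-Bera test is used here only because it can be written with the Python standard library; the tests the tools actually insert may differ.

```python
import math
import warnings
from functools import wraps
from statistics import fmean, stdev


def jarque_bera_pvalue(sample):
    """Asymptotic p-value of the Jarque-Bera normality test.

    Under the null hypothesis (the data are normal) the statistic is
    approximately chi-squared with 2 degrees of freedom, whose survival
    function is exp(-x / 2). The approximation is rough for small samples.
    """
    n = len(sample)
    mean = fmean(sample)
    m2 = fmean((x - mean) ** 2 for x in sample)  # second central moment
    if m2 == 0:
        raise ValueError("sample has zero variance")
    m3 = fmean((x - mean) ** 3 for x in sample)  # third central moment
    m4 = fmean((x - mean) ** 4 for x in sample)  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    statistic = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    return math.exp(-statistic / 2.0)


def assumes_normal(arg_index=0, alpha=0.05):
    """Hypothetical annotation: positional argument `arg_index` should be
    normally distributed. Warns when the test rejects normality at `alpha`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            p_value = jarque_bera_pvalue(args[arg_index])
            if p_value < alpha:
                warnings.warn(
                    f"{func.__name__}: normality assumption is doubtful "
                    f"(Jarque-Bera p = {p_value:.4g})",
                    stacklevel=2,
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@assumes_normal(arg_index=0)
def one_sample_t_statistic(sample, popmean):
    """t statistic of a one-sample t-test, which assumes a normal sample."""
    n = len(sample)
    return (fmean(sample) - popmean) / (stdev(sample) / math.sqrt(n))


if __name__ == "__main__":
    skewed = [float(2 ** k) for k in range(20)]  # heavily right-skewed
    print(one_sample_t_statistic(skewed, 0.0))   # also emits a warning
```

In this sketch the check only warns and still runs the annotated function, so existing notebooks would keep working while the doubtful assumption is reported.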

DOI

https://doi.org/10.1145/3729391

Alexi Turcotte

CISPA

Germany

Zheyuan Wu

Saarland University

Session Program

Wed 25 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:30	Empirical Studies 2 (Ideas, Visions and Reflections / Research Papers / Journal First) at Cosmos 3C. Chair(s): Yuchao Jiang (UNSW)

14:00 20m Talk		An Empirical Analysis of Issue Templates Usage in Large-Scale Projects on GitHub (Journal First). Emre Sülün (Bilkent University), Metehan Saçakcı (Bilkent University), Eray Tüzün (Bilkent University)
14:20 20m Talk		The Landscape of Toxicity: An Empirical Investigation of Toxicity on GitHub (Research Papers). Jaydeb Sarker (University of Nebraska at Omaha), Asif Kamal Turzo (Wayne State University), Amiangshu Bosu (Wayne State University)
14:40 20m Talk		Expressing and Checking Statistical Assumptions (Research Papers). Alexi Turcotte (CISPA), Zheyuan Wu (Saarland University)
15:00 20m Talk		Why the Proof Fails in Different Versions of Theorem Provers: An Empirical Study of Compatibility Issues in Isabelle (Research Papers). Xiaokun Luan (Peking University), David Sanan (Singapore Institute of Technology), Zhe Hou (Griffith University), Qiyuan Xu (Nanyang Technological University), Chengwei Liu (Nanyang Technological University), Yufan Cai (National University of Singapore), Yang Liu (Nanyang Technological University), Meng Sun (Peking University)
15:20 10m Talk		Missing Threats: Dealing with the Treatment-sensitive Factorial Structure Bias in Empirical Software Engineering (Ideas, Visions and Reflections). Sabato Nocera (University of Salerno), Giuseppe Scanniello (University of Salerno)

Information for Participants

Wed 25 Jun 2025 14:00 - 15:30 at Cosmos 3C - Empirical Studies 2 Chair(s): Yuchao Jiang

Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.