This program is tentative and subject to change.
Data analysts must apply statistical inference techniques carefully, because misusing them can lead to wrong conclusions. The danger is that misuse generally produces no obvious symptoms: the numbers returned often look reasonable, and programs that misuse statistics do not crash. In this work, we propose a technique to quickly and statically check data science programs for compliance with statistics best-practice rules, including checking all assumptions made by statistical methods and correcting for the multiple comparisons problem, or “data dredging”. The technique is built on a novel statistics intermediate representation, called SIR, that encodes the details most salient to statistics. We implement it in a tool called stat-lint, the first statistics linter, and evaluate stat-lint on 45 Python data science notebooks, finding that only 11 fully check all obligations, only two apply any correction for multiple comparisons, and over half of obligations go unchecked.
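As a rough illustration of the multiple comparisons obligation mentioned in the abstract, the sketch below adjusts a list of p-values with the standard Bonferroni and Holm procedures. It is not taken from stat-lint or SIR; the function names `bonferroni` and `holm` and the sample p-values are assumptions made only for this example.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: multiply each p-value by the number of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


def holm(p_values, alpha=0.05):
    """Holm step-down correction, returned in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (0-based rank) is scaled by (m - rank).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


if __name__ == "__main__":
    # Hypothetical p-values from four tests run on the same dataset.
    p_values = [0.01, 0.04, 0.03, 0.005]
    print("uncorrected:", [p <= 0.05 for p in p_values])
    print("bonferroni: ", bonferroni(p_values))
    print("holm:       ", holm(p_values))
```

With these sample values, all four tests fall below 0.05 when left uncorrected, while both corrections keep only the two smallest p-values as significant.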
Mon 17 Nov (displayed time zone: Seoul)
Session 11:00 - 12:30

| Time | Format | Title | Track | Authors |
|---|---|---|---|---|
| 11:00 | 10m Talk | The Fault in our Stats | Research Papers | |
| 11:10 | 10m Talk | Agents in the Sandbox: End-to-End Crash Bug Reproduction for Minecraft | Research Papers | Eray Yapağcı (Bilkent University), Yavuz Alp Sencer Öztürk (Bilkent University), Eray Tüzün (Bilkent University) |
| 11:20 | 10m Talk | Finding Bugs in MLIR Compiler Infrastructure via Lowering Space Exploration | Research Papers | Jingjing Liang (East China Normal University), Shan Huang (East China Normal University), Ting Su (East China Normal University) |
| 11:30 | 10m Talk | Why Do Machine Learning Notebooks Crash? An Empirical Study on Public Python Jupyter Notebooks | Journal-First Track | Yiran Wang (Linköping University), Willem Meijer (Linköping University), José Antonio Hernández López (Universidad de Murcia), Ulf Nilsson (Linköping University), Daniel Varro (Linköping University / McGill University) |
| 11:40 | 10m Talk | When AllClose Fails: Round-Off Error Estimation for Deep Learning Programs | Research Papers | Qi Zhan (Zhejiang University), Xing Hu (Zhejiang University), Yuanyi Lin (Huawei Technologies), Tongtong Xu (Huawei), Xin Xia (Zhejiang University), Shanping Li (Zhejiang University) |
| 11:50 | 10m Talk | LLM-Powered Multi-Agent Collaboration for Intelligent Industrial On-Call Automation | Research Papers | Ruowei Fu (Nankai University), Yang Zhang (ByteDance Inc.), Zeyu Che (Nankai University), Xin Wu (ByteDance Inc.), Zhenyu Zhong (Nankai University), Zhiqiang Ren (ByteDance Inc.), Shenglin Zhang (Nankai University), Feng Wang (ByteDance Inc.), Yongqian Sun (Nankai University), Xiaozhou Liu (ByteDance Inc.), Kexin Liu (Nankai University), Yu Zhang (ByteDance Inc.) |
| 12:00 | 10m Talk | SSR: Safeguarding Staking Rewards by Defining and Detecting Logical Defects in DeFi Staking | Research Papers | Zewei Lin (Sun Yat-sen University), Jiachi Chen (Sun Yat-sen University), Jingwen Zhang (School of Software Engineering, Sun Yat sen University), Zexu Wang (Sun Yat-sen University), Yuming Feng (Peng Cheng Laboratory), Weizhe Zhang (Harbin Institute of Technology), Zibin Zheng (Sun Yat-sen University) |
| 12:10 | 10m Talk | Finding Bugs in WebAssembly Interface Type Binding Generators | Research Papers | |
| 12:20 | 10m Talk | LineBreaker: Finding Token-Inconsistency Bugs using Large Language Models | Research Papers | Hongbo Chen (Indiana University Bloomington), Yifan Zhang (San Diego State University), Xing Han (The Hong Kong University of Science and Technology), Tianhao Mao (Indiana University), Huanyao Rong (Indiana University Bloomington), Yuheng Zhang (Tsinghua University), Hang Zhang (Indiana University), XiaoFeng Wang (ACM member), Luyi Xing (Indiana University Bloomington/University of Illinois Urbana-Champaign), Xun Chen (Samsung Research America) |
