Assessing the Early Bird Heuristic (for Predicting Project Quality)
Before researchers rush to reason across all available data or try complex methods, perhaps it is prudent to first check for simpler alternatives. Specifically, if the historical data has the most information in some small region, perhaps a model learned from that region would suffice for the rest of the project.
To support this claim, we offer a case study with 240 projects, where we find that the information in those projects “clumps” towards the earliest parts of the project. A quality prediction model learned from just the first 150 commits works as well as, or better than, state-of-the-art alternatives. Using just this “early-bird” data, we can build models very quickly and very early in the project life cycle. Moreover, using this early-bird method, we show that a simple model (with just a few features) generalizes to hundreds of projects.
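As a rough illustration of the early-bird idea, the sketch below trains on only the first 150 commits of a chronologically ordered history and evaluates on everything that follows. It is not the authors' pipeline: the `Commit` record, the single `lines_added` feature, the threshold-based "stump" model, and the synthetic history are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Commit:
    lines_added: int  # hypothetical feature, chosen only for illustration
    defective: bool   # hypothetical label


def early_bird_split(commits, n_early=150):
    """Split a chronologically ordered commit list into (early, later)."""
    return commits[:n_early], commits[n_early:]


def fit_stump(train):
    """Pick the lines_added threshold that classifies the training commits best."""
    best_threshold, best_correct = 0, -1
    for t in sorted({c.lines_added for c in train}):
        correct = sum((c.lines_added >= t) == c.defective for c in train)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def accuracy(threshold, commits):
    """Fraction of commits whose label matches the rule lines_added >= threshold."""
    hits = sum((c.lines_added >= threshold) == c.defective for c in commits)
    return hits / len(commits)


def synthetic_history(n, seed=0):
    """Toy data (an assumption): larger changes are more likely to be defective."""
    rng = random.Random(seed)
    history = []
    for _ in range(n):
        defective = rng.random() < 0.3
        size = rng.randint(40, 200) if defective else rng.randint(1, 80)
        history.append(Commit(size, defective))
    return history


history = synthetic_history(1000)
early, later = early_bird_split(history)   # train on the first 150 commits only
threshold = fit_stump(early)
print(f"threshold={threshold}, accuracy on later commits={accuracy(threshold, later):.2f}")
```

The point of the sketch is the split itself: the model never sees any commit after the first 150, so it can be built early in a project and then applied to the rest of its history.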
Based on this experience, we suspect that prior work on generalizing quality models may have needlessly complicated an inherently simple process. Further, prior work that focused on later life-cycle data needs to be revisited, since its conclusions were drawn from relatively uninformative regions.
Replication note: all our data and scripts are available here: https://github.com/snaraya7/early-bird
Fri 19 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Analytics 5 | Research Track / Journal-first Papers at Amália Rodrigues | Chair(s): Sridhar Chimalakonda (Indian Institute of Technology, Tirupati)
14:00 | 15m | Talk | An Exploratory Investigation of Log Anomalies in Unmanned Aerial Vehicles | Research Track | Dinghua Wang; Shuqing Li (The Chinese University of Hong Kong); Guanping Xiao (Nanjing University of Aeronautics and Astronautics); Yepang Liu (Southern University of Science and Technology); Yulei Sui (UNSW); Pinjia He (Chinese University of Hong Kong, Shenzhen); Michael Lyu (The Chinese University of Hong Kong)
14:15 | 15m | Talk | ModuleGuard: Understanding and Detecting Module Conflicts in Python Ecosystem | Research Track | Ruofan Zhu (Zhejiang University); Xingyu Wang (Zhejiang University); Chengwei Liu (Nanyang Technological University); Zhengzi Xu (Nanyang Technological University); Wenbo Shen (Zhejiang University, China); Rui Chang (Zhejiang University); Yang Liu (Nanyang Technological University)
14:30 | 15m | Talk | Empirical Analysis of Vulnerabilities Life Cycle in Golang Ecosystem | Research Track | Jinchang Hu; Lyuye Zhang (Nanyang Technological University); Chengwei Liu (Nanyang Technological University); Sen Yang (Academy of Military Science); Song Huang (Army Engineering University of PLA); Yang Liu (Nanyang Technological University)
14:45 | 15m | Talk | Fine-SE: Integrating Semantic Features and Expert Features for Software Effort Estimation | Research Track | Yue Li (Nanjing University); Zhong Ren (State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Nanjing, Jiangsu, China); Zhiqi Wang (State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Nanjing, Jiangsu, China); Lanxin Yang (Nanjing University); Liming Dong (Nanjing University); He Zhang (Nanjing University)
15:00 | 7m | Talk | Concretization of Abstract Traffic Scene Specifications Using Metaheuristic Search | Journal-first Papers | Aren Babikian (McGill University); Oszkár Semeráth (Budapest University of Technology and Economics); Daniel Varro (Linköping University / McGill University)
15:07 | 7m | Talk | Technical leverage analysis in the Python ecosystem | Journal-first Papers | Ranindya Paramitha (University of Trento); Fabio Massacci (University of Trento; Vrije Universiteit Amsterdam)
15:14 | 7m | Talk | Automated Mapping of Adaptive App GUIs from Phones to TVs | Journal-first Papers | Han Hu (Faculty of Information Technology, Monash University); ruiqi dong (Swinburne University of Technology); John Grundy (Monash University); Thai Minh Nguyen (Monash University); huaxiao liu (Jilin University); Chunyang Chen (Technical University of Munich (TUM))
15:21 | 7m | Talk | Assessing the Early Bird Heuristic (for Predicting Project Quality) | Journal-first Papers