Mining software repositories is a useful technique for researchers and practitioners to see what software developers actually do when developing software. Tools like Boa let users easily mine open-source software repositories at a very large scale, with datasets containing hundreds of thousands of projects. The trade-off is that users must use the provided infrastructure, query language, runtime, and datasets, which might not fit all analysis needs. In this work, we present Boidae: a family of Boa installations controlled and customized by users. Boidae uses automation tools such as Ansible and Docker to facilitate the deployment of a customized Boa installation. In particular, Boidae allows the creation of custom datasets generated from any set of Git repositories, with helper scripts to aid in finding and cloning repositories from GitHub and SourceForge. In this paper, we briefly describe the architecture of Boidae and how researchers can use the infrastructure to generate custom datasets. Boidae’s scripts and all infrastructure it builds upon are open source. A video demonstration of Boidae’s installation and extension is available at https://go.unl.edu/boidae.
Thu 18 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Analytics 3 | Research Track / Journal-first Papers / Demonstrations | at Maria Helena Vieira da Silva | Chair(s): Sridhar Chimalakonda (Indian Institute of Technology, Tirupati)
14:00 | 15m Talk | Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem | Research Track | Yun Peng (The Chinese University of Hong Kong), Ruida Hu (Harbin Institute of Technology, Shenzhen), Ruoke Wang (Harbin Institute of Technology, Shenzhen), Cuiyun Gao (Harbin Institute of Technology), Shuqing Li (The Chinese University of Hong Kong), Michael Lyu (The Chinese University of Hong Kong)
14:15 | 15m Talk | Data-Driven Evidence-Based Syntactic Sugar Design | Research Track | David OBrien (Iowa State University), Robert Dyer (University of Nebraska-Lincoln), Tien N. Nguyen (University of Texas at Dallas), Hridesh Rajan (Iowa State University)
14:30 | 15m Talk | Revisiting Android App Categorization | Research Track | Marco Alecci (University of Luxembourg), Jordan Samhi (CISPA Helmholtz Center for Information Security), Tegawendé F. Bissyandé (University of Luxembourg), Jacques Klein (University of Luxembourg)
14:45 | 15m Talk | Are Your Requests Your True Needs? Checking Excessive Data Collection in VPA App | Research Track | Fuman Xie (University of Queensland), Chuan Yan (University of Queensland), Mark Huasong Meng (National University of Singapore), Shaoming Teng (The University of Queensland), Yanjun Zhang (Deakin University), Guangdong Bai (University of Queensland)
15:00 | 7m Talk | Acrobats and Safety-Nets: Problematizing Large-Scale Agile Software Development | Journal-first Papers | Knut Rolland (University of Oslo), Brian Fitzgerald (Lero - The Irish Software Research Centre and University of Limerick), Torgeir Dingsøyr (Norwegian University of Science and Technology and SimulaMet), Klaas-Jan Stol (Lero; University College Cork; SINTEF Digital)
15:07 | 7m Talk | Program Transformation Landscapes for Automated Program Modification Using Gin: Extended Abstract | Journal-first Papers | Justyna Petke (University College London), Brad Alexander (University of Adelaide), Earl T. Barr (University College London), Alexander E.I. Brownlee (University of Stirling), Markus Wagner (Monash University, Australia), David R. White (University of Sheffield)
15:14 | 7m Talk | Boidae: Your Personal Mining Platform | Demonstrations | Brian Sigurdson (Bowling Green State University), Samuel W. Flint (University of Nebraska-Lincoln), Robert Dyer (University of Nebraska-Lincoln)
15:21 | 7m Talk | Code Mapper: Mapping the Global Contributions of OSS | Demonstrations | Thomas Le Tourneau (CY Tech), Jasmine Latendresse (Concordia University), Ahmad Abdellatif (University of Calgary), Emad Shihab (Concordia University)