MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022

Call for Papers

The MSR Data/Tool Showcase track aims to actively promote and recognize the creation of reusable datasets and tools that are designed and built not only for a specific research project, but for the MSR community as a whole. These datasets and tools should enable other practitioners and researchers to jumpstart their own research efforts, and also enable the reproducibility of earlier work. The MSR Data/Tool Showcase papers can be descriptions of datasets or tools built by the authors that can be used by other practitioners or researchers, and/or descriptions of the use of tools built by others to obtain specific research results.

Types of MSR’22 Data and Tool Showcase Track Submissions

The MSR’22 Data/Tool Showcase Track will accept two types of submissions: (1) data showcase papers and (2) reusable tool showcase papers.

Submissions are limited to a maximum of 4 pages, plus 1 additional page of references. Papers should be submitted to the HotCRP submission site on or before Thursday 27th January 2022.

The Review Criteria for the Data/Tool Showcase submissions are as follows:

  • The value, usefulness, and reusability of the datasets or tools.
  • The quality of the presentation.
  • The clarity of the relation to related work and the relevance to mining software repositories.
  • The availability of the datasets or tools.

1. Data Showcase

MSR Data showcase submissions are expected to include:

  • A description of the data source,
  • A description of the methodology used to gather the data (including provenance and the tool used to create/generate/gather the data, if any),
  • A description of the storage mechanism, including a schema if applicable,
  • If the data has been used by the authors or others, a description of how this was done including references to previously published papers,
  • A description of the originality of the data set (that is, even if the data set has been used in a published paper, its complete description must be unpublished) and similar existing datasets (if any),
  • A description of the design of the tool, and how to use the tool in practice,
  • Ideas for future research questions that could be answered using the data set,
  • Ideas for further improvements that could be made to the data set, and
  • Any limitations and/or challenges in creating or using the data set.

2. Reusable Tool Showcase

MSR Reusable Tool showcase submissions are expected to include:

  • A description of the tool, which includes the background, motivation, novelty, overall architecture, detailed design, and preliminary evaluation of the tool, as well as the link to download or access the tool.
  • A description of the design of the tool, and how to use the tool in practice.
  • Clear installation instructions and an example data set that allow the reviewers to run the tool.
  • If the tool has been used by the authors or others, a description of how the tool was used, including references to previously published papers.
  • Ideas for future reusability of the tool.
  • Any limitations of using the tool.

The dataset/tool should be made available at the time of submission of the paper for review but will be considered confidential until publication of the paper. The dataset/tool should include detailed instructions about how to set up the environment (e.g., requirements.txt) and how to use the datasets/tools (e.g., how to import the data, how to access the data once it has been imported, or how to use the tool with a running example).

At a minimum, upon publication of the paper, the authors should archive the data or tool on a persistent repository that can provide a digital object identifier (DOI) such as zenodo.org, figshare.com, Archive.org, or institutional repositories. In addition, the DOI-based citation of the dataset or the tool should be included in the camera-ready version of the paper.

Data/Tool showcase submissions are not:

  • Empirical studies.
  • Datasets that are based on poorly explained or untrustworthy heuristics for data collection, or results of trivial application of generic tools.

If custom tools have been used to create the data set, we expect the paper to be accompanied by the source code of the tools, along with clear documentation on how to run the tools to recreate the data set. The tools should be open source, accompanied by an appropriate license; the source code should be citable, i.e., refer to a specific release and have a DOI. GitHub provides an easy way to make source code citable. If you cannot provide the source code or the source code clause is not applicable (e.g., because the data set consists of qualitative data), please provide a short explanation of why this is not possible.

Important Dates

  • Abstract Deadline: Tuesday 25th January 2022
  • Paper Deadline: Thursday 27th January 2022
  • Author Notification: March 8
  • Camera Ready Deadline: Late March

Submission

Please submit your data and tool paper(s) (maximum 4 pages, plus 1 additional page of references) via the HotCRP submission site on or before Thursday 27th January 2022.

Submitted papers will undergo single-blind peer review. We opt for single-blind peer review (as opposed to the double-blind peer review of the main track) because of the requirement above to describe how the data has been used in previous studies, including bibliographic references to those studies. Such references are likely to disclose the authors’ identity.

To make research datasets and tools accessible and citable, we further encourage authors to follow the FAIR principles, i.e., datasets and tools should be Findable, Accessible, Interoperable, and Reusable.

All authors should use the official “ACM Primary Article Template”, which can be obtained from the ACM Proceedings Template page. LaTeX users should use the sigconf option, as well as the review option (to produce line numbers for easy reference by the reviewers). To that end, the following LaTeX code can be placed at the start of the LaTeX document:

\documentclass[sigconf,review]{acmart}

\acmConference[MSR 2022]{MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories}{May 23–24, 2022}{Pittsburgh, PA, USA}
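
As a rough illustration of where these two lines sit in a complete submission, the skeleton below is a minimal sketch, not an official template. The title, author details, section text, and the file name references.bib are placeholders invented for this example; only the \documentclass and \acmConference lines are taken from the instructions above.

```latex
% Minimal sketch of a showcase submission. All titles, names, addresses,
% and file names below are placeholders chosen for illustration.
\documentclass[sigconf,review]{acmart}

\acmConference[MSR 2022]{MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories}{May 23–24, 2022}{Pittsburgh, PA, USA}

\begin{document}

\title{Placeholder Title of the Dataset or Tool}

% The track is single-blind, so author names are shown.
\author{First Author}
\affiliation{%
  \institution{Placeholder University}
  \city{Placeholder City}
  \country{Placeholder Country}}
\email{first.author@example.org}

% In acmart, the abstract has to appear before \maketitle.
\begin{abstract}
One-paragraph summary of the dataset or tool.
\end{abstract}

\maketitle

\section{Introduction}
Body text (at most 4 pages, plus 1 additional page of references).

% ACM reference style; references.bib is a hypothetical file name.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```

The review option only adds line numbers and is normally dropped for the camera-ready version, for which accepted authors receive separate instructions.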

We encourage authors to upload their paper info early (the PDF can be submitted later). All submissions must adhere to the following requirements:

  • Submissions must not exceed the page limit (4 pages plus 1 additional page of references for short papers). The page limit is strict, and it will not be possible to purchase additional pages at any point in the process (including after acceptance).
  • Submissions must strictly conform to the ACM formatting instructions. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.

Any submission that does not comply with these requirements is likely to be desk rejected by the PC Chairs without further review. In addition, by submitting to the MSR Data/Tool Showcase Track, the authors acknowledge that they are aware of and agree to be bound by the following policies:

  • The ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. In particular, papers submitted to MSR 2022 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2022. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases (including immediate rejection and reporting of the incident to ACM/IEEE). To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.
  • The authorship policy of the ACM and the authorship policy of the IEEE.

Upon notification of acceptance, all authors of accepted papers will be asked to fill out a copyright form and will receive further instructions for preparing the camera-ready version of their papers. At least one author of each paper is expected to register and present the paper at the MSR 2022 conference. All accepted contributions will be published in the electronic proceedings of the conference.

For enquiries, please contact the MSR Data/Tool Co-Chairs at chakkrit@monash.edu and xin.xia@acm.org.

Wed 18 May

Displayed time zone: Eastern Time (US & Canada)

03:00 - 03:50
03:00
4m
Talk
An Alternative Issue Tracking Dataset of Public Jira Repositories
Data and Tool Showcase Track
Lloyd Montgomery Universität Hamburg, Clara Marie Lüders University of Hamburg, Walid Maalej University of Hamburg
05:00 - 05:50
Session 3: Introspection, Vision, and Human Aspects
Technical Papers / Data and Tool Showcase Track / Industry Track / Registered Reports at MSR Main room - odd hours
Chair(s): Alexander Serebrenik Eindhoven University of Technology, Sebastian Baltes SAP SE & University of Adelaide
05:11
4m
Talk
The General Index of Software Engineering Papers
Data and Tool Showcase Track
Zeinab Abou Khalil Inria, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
13:00 - 13:50
Session 4: Software Quality (Bugs & Smells)
Data and Tool Showcase Track / Technical Papers at MSR Main room - odd hours
Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Mahmoud Alfadel University of Waterloo
13:28
4m
Talk
ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction
Data and Tool Showcase Track
Hossein Keshavarz David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada, Mei Nagappan University of Waterloo
13:32
4m
Talk
ReCover: a Curated Dataset for Regression Testing Research
Data and Tool Showcase Track
Francesco Altiero Università degli Studi di Napoli Federico II, Anna Corazza Università degli Studi di Napoli Federico II, Sergio Di Martino Università degli Studi di Napoli Federico II, Adriano Peron Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II
14:00 - 14:50
Session 5: Communication & Domains
Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Masud Rahman Dalhousie University, Mahmoud Alfadel University of Waterloo
14:14
4m
Talk
SoCCMiner: A Source Code-Comments and Comment-Context Miner
Data and Tool Showcase Track
Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu
14:18
4m
Talk
SLNET: A Redistributable Corpus of 3rd-party Simulink Models
Data and Tool Showcase Track
Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington
14:22
4m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:26
4m
Talk
Inspect4py: A Knowledge Extraction Framework for Python Code Repositories
Data and Tool Showcase Track
Rosa Filgueira St. Andrews University, Daniel Garijo Universidad Politécnica de Madrid
14:30
4m
Talk
DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research
Data and Tool Showcase Track
Keerthana Muthu Subash Carleton University, Canada, Lakshmi Prasanna Kumar Carleton University, Canada, Sri Lakshmi Vadlamani Carleton University, Canada, Preetha Chatterjee Drexel University, USA, Olga Baysal Carleton University
20:00 - 20:50
Session 6: Maintenance & Testing
Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Ajay Jha University of Alberta, Amjed Tahir Massey University
20:25
4m
Talk
Methods2Test: A dataset of focal methods mapped to test cases
Data and Tool Showcase Track
Michele Tufano Microsoft, Shao Kun Deng Microsoft Corporation, Neel Sundaresan Microsoft Corporation, Alexey Svyatkovskiy
20:29
4m
Talk
npm-filter: Automating the mining of dynamic information from npm packages
Data and Tool Showcase Track
Ellen Arteca Northeastern University, Alexi Turcotte Northeastern University
20:33
4m
Talk
ManyTypes4TypeScript: A Comprehensive TypeScript Dataset for Sequence-Based Type Inference
Data and Tool Showcase Track
Kevin Jesse University of California, Davis, Prem Devanbu Department of Computer Science, University of California, Davis
21:00 - 21:50
Session 7: Developer Wellbeing & Project Communication
Technical Papers / Data and Tool Showcase Track / Industry Track at MSR Main room - odd hours
Chair(s): Bram Adams Queen's University, Kingston, Ontario
21:14
4m
Talk
The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories
Data and Tool Showcase Track
Melanie Warrick University of Vermont, Samuel F. Rosenblatt University of Vermont, Jean-Gabriel Young University of Vermont, amanda casari Open Source Programs Office, Google, Laurent Hébert-Dufresne University of Vermont, James P. Bagrow University of Vermont
21:18
4m
Talk
The Unexplored Treasure Trove of Phabricator Code Reviews
Data and Tool Showcase Track
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
21:22
4m
Talk
The Unsolvable Problem or the Unheard Answer? A Dataset of 24,669 Open-Source Software Conference Talks
Data and Tool Showcase Track
Kimberly Truong Oregon State University, Courtney Miller Carnegie Mellon University, Bogdan Vasilescu Carnegie Mellon University, USA, Christian Kästner Carnegie Mellon University
21:26
4m
Talk
Exploring Apache Incubator Project Trajectories with APEX
Data and Tool Showcase Track
Anirudh Ramchandran University of California, Davis, Likang Yin University of California, Davis, Vladimir Filkov University of California at Davis

Thu 19 May

03:00 - 03:50
Session 8: Large-Scale Mining & Software Ecosystems
Technical Papers / Data and Tool Showcase Track at MSR Main room - odd hours
Chair(s): Fiorella Zampetti University of Sannio, Italy, Gregorio Robles Universidad Rey Juan Carlos
03:21
4m
Talk
Lupa: A Platform for Large Scale Analysis of The Programming Language Usage
Data and Tool Showcase Track
Anna Vlasova JetBrains Research, Maria Tigina JetBrains Research, ITMO University, Ilya Vlasov Saint Petersburg State University, Anastasiia Birillo JetBrains Research, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research; HSE University
03:25
4m
Talk
GitDelver Enterprise Dataset (GDED): An Industrial Closed-source Dataset for Socio-Technical Research
Data and Tool Showcase Track
Nicolas Riquet University of Namur, Xavier Devroey University of Namur, Benoît Vanderose University of Namur
03:29
4m
Talk
DaSEA – A Dataset for Software Ecosystem Analysis
Data and Tool Showcase Track
Petya Buchkova IT University of Copenhagen, Joakim Hey Hinnerskov IT University of Copenhagen, Kasper Olsen IT University of Copenhagen, Rolf-Helge Pfeiffer IT University of Copenhagen
03:33
4m
Talk
Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
Data and Tool Showcase Track
Kristiina Rahkema University of Tartu, Dietmar Pfahl University of Tartu
04:00 - 04:50
Session 9: Scaling & Cloud
Industry Track / Registered Reports / Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Lwin Khin Shar Singapore Management University
04:00
4m
Talk
SniP: An Efficient Stack Tracing Framework for Multi-threaded Programs
Data and Tool Showcase Track
Arun KP Indian Institute of Technology Kanpur, Saurabh Kumar Indian Institute of Technology Kanpur, Debadatta Mishra, Biswabandan Panda Indian Institute of Technology Bombay
04:04
4m
Talk
Tooling for Time- and Space-efficient git Repository Mining
Data and Tool Showcase Track
Fabian Heseding Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Willy Scheibel Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Jürgen Döllner Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam
04:08
4m
Talk
TSSB-3M: Mining single statement bugs at massive scale
Data and Tool Showcase Track
Cedric Richter Carl von Ossietzky Universität Oldenburg / University of Oldenburg, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
05:00 - 05:50
Session 10: Security
Technical Papers / Data and Tool Showcase Track / Registered Reports at MSR Main room - odd hours
Chair(s): Triet Le The University of Adelaide
05:18
4m
Talk
Vul4J: A Dataset of Reproducible Java Vulnerabilities Geared Towards the Study of Program Repair Techniques (Data and Tool Showcase Award)
Data and Tool Showcase Track
Quang-Cuong Bui Hamburg University of Technology, Riccardo Scandariato Hamburg University of Technology, Nicolás E. Díaz Ferreyra Hamburg University of Technology
05:22
4m
Talk
AndroOBFS: Time-tagged Obfuscated Android Malware Dataset with Family Information
Data and Tool Showcase Track
Saurabh Kumar Indian Institute of Technology Kanpur, Debadatta Mishra, Biswabandan Panda Indian Institute of Technology Bombay, Sandeep K. Shukla Indian Institute of Technology Kanpur
05:26
4m
Talk
TriggerZoo: A Dataset of Android Applications Automatically Infected with Logic Bombs
Data and Tool Showcase Track
Jordan Samhi University of Luxembourg, Tegawendé F. Bissyandé SnT, University of Luxembourg, Jacques Klein University of Luxembourg
20:00 - 20:50
Session 12: Integration & Large-Scale Mining
Technical Papers / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Jin L.C. Guo McGill University, Amjed Tahir Massey University
20:32
4m
Talk
TwinDroid: A Dataset of Android app System call traces and Trace Generation Pipeline
Data and Tool Showcase Track
Asma Razgallah Université du Québec à Chicoutimi, Canada, Raphael Khoury Université du Québec à Chicoutimi, Canada, Jean-Baptiste Poulet Université du Québec à Chicoutimi, Canada
21:00 - 21:50
Session 13: Security & Quality
Technical Papers / Data and Tool Showcase Track / Registered Reports / Industry Track at MSR Main room - odd hours
Chair(s): Gias Uddin University of Calgary, Canada
21:21
4m
Talk
ECench: An Energy Bug Benchmark of Ethereum Client Software
Data and Tool Showcase Track
Jinyoung Kim Sungkyunkwan University, Misoo Kim Sungkyunkwan University, Eunseok Lee Sungkyunkwan University

Fri 20 May

04:00 - 04:50
Session 14: Software Quality
Technical Papers / Industry Track / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Kla Tantithamthavorn Monash University, Simone Scalabrino University of Molise
04:25
4m
Talk
Constructing Dataset of Functionally Equivalent Java Methods Using Automated Test Generation Techniques
Data and Tool Showcase Track
Yoshiki Higo Osaka University, Shinsuke Matsumoto Osaka University, Shinji Kusumoto Osaka University, Kazuya Yasuda Hitachi, Ltd.
11:00 - 11:50
Session 15: Collaboration & Open Source
Registered Reports / Data and Tool Showcase Track / Technical Papers / Industry Track at MSR Main room - odd hours
Chair(s): Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
11:07
4m
Talk
FixJS: A Dataset of Bug-fixing JavaScript Commits
Data and Tool Showcase Track
Viktor Csuvik Department of Software Engineering, MTA-SZTE Research Group on Artificial Intelligence, University of Szeged, Szeged, Hungary, László Vidács University of Szeged, Hungary
11:11
4m
Talk
A Time Series-Based Dataset of Open-Source Software Evolution
Data and Tool Showcase Track
Bruno L. Sousa UFMG, Mariza Bigonha Professor at Federal University of Minas Gerais, Kecia A. M. Ferreira CEFET-MG, Glaura C. Franco UFMG
11:15
4m
Talk
LAGOON: An Analysis Tool for Open Source Communities
Data and Tool Showcase Track
Sourya Dey Galois, Inc., Walt Woods Galois, Inc.
11:19
4m
Talk
A Versatile Dataset of Agile Open Source Software Projects
Data and Tool Showcase Track
Vali Tawosi University College London, Afnan Al-Subaihin University College London, Rebecca Moussa University College London, Federica Sarro University College London
14:00 - 15:00
Session 16: Non-functional Properties (Availability, Security, Legal Aspects)
Industry Track / Technical Papers / Registered Reports / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Jin L.C. Guo McGill University
14:07
4m
Talk
A Large-scale Dataset of (Open Source) License Text Variants (Data and Tool Showcase Award)
Data and Tool Showcase Track
Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris

Mon 23 May

11:00 - 12:30
Blended Technical Session 1 (Integration, Large-scale mining, and Software Ecosystems)
Technical Papers / Data and Tool Showcase Track at Room 315+316
Chair(s): Bogdan Vasilescu Carnegie Mellon University, USA
11:30
8m
Talk
Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
Data and Tool Showcase Track
Kristiina Rahkema University of Tartu, Dietmar Pfahl University of Tartu
11:38
8m
Talk
A Large-scale Dataset of (Open Source) License Text Variants (Data and Tool Showcase Award)
Data and Tool Showcase Track
Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
11:46
8m
Talk
TSSB-3M: Mining single statement bugs at massive scale
Data and Tool Showcase Track
Cedric Richter Carl von Ossietzky Universität Oldenburg / University of Oldenburg, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
11:54
8m
Talk
LAGOON: An Analysis Tool for Open Source Communities
Data and Tool Showcase Track
Sourya Dey Galois, Inc., Walt Woods Galois, Inc.
12:02
8m
Talk
The Unexplored Treasure Trove of Phabricator Code Reviews
Data and Tool Showcase Track
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
13:30 - 15:00
Blended Technical Session 2 (Machine Learning and Information Retrieval)
Technical Papers / Data and Tool Showcase Track at Room 315+316
Chair(s): Preetha Chatterjee Drexel University, USA
14:31
8m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University

Tue 24 May

09:00 - 10:30
Blended Technical Session 3 (Smells and Maintenance)
Technical Papers / Mining Challenge / Registered Reports / Data and Tool Showcase Track at Room 315+316
Chair(s): Andy Zaidman Delft University of Technology
09:45
8m
Talk
npm-filter: Automating the mining of dynamic information from npm packages
Data and Tool Showcase Track
Ellen Arteca Northeastern University, Alexi Turcotte Northeastern University
11:00 - 12:15
Blended Technical Session 4 (Introspection, Vision, and Human Aspects)
Technical Papers / Registered Reports / Data and Tool Showcase Track at Room 315+316
Chair(s): Ayushi Rastogi University of Groningen, The Netherlands
11:38
8m
Talk
The General Index of Software Engineering Papers
Data and Tool Showcase Track
Zeinab Abou Khalil Inria, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
15:30 - 17:00
Blended Technical Session 5 (Miscellaneous)
Technical Papers / Data and Tool Showcase Track / Mining Challenge at Room 315+316
Chair(s): Luís Cruz Delft University of Technology
16:00
8m
Talk
SLNET: A Redistributable Corpus of 3rd-party Simulink Models
Data and Tool Showcase Track
Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington
16:08
8m
Talk
SoCCMiner: A Source Code-Comments and Comment-Context Miner
Data and Tool Showcase Track
Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu

Accepted Papers

  • A Large-scale Dataset of (Open Source) License Text Variants (Data and Tool Showcase Award)
  • An Alternative Issue Tracking Dataset of Public Jira Repositories
  • AndroOBFS: Time-tagged Obfuscated Android Malware Dataset with Family Information
  • ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction
  • A Time Series-Based Dataset of Open-Source Software Evolution
  • A Versatile Dataset of Agile Open Source Software Projects
  • Constructing Dataset of Functionally Equivalent Java Methods Using Automated Test Generation Techniques
  • DaSEA – A Dataset for Software Ecosystem Analysis
  • Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
  • DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research
  • ECench: An Energy Bug Benchmark of Ethereum Client Software
  • Exploring Apache Incubator Project Trajectories with APEX
  • FixJS: A Dataset of Bug-fixing JavaScript Commits
  • GitDelver Enterprise Dataset (GDED): An Industrial Closed-source Dataset for Socio-Technical Research
  • Inspect4py: A Knowledge Extraction Framework for Python Code Repositories
  • LAGOON: An Analysis Tool for Open Source Communities
  • Lupa: A Platform for Large Scale Analysis of The Programming Language Usage
  • ManyTypes4TypeScript: A Comprehensive TypeScript Dataset for Sequence-Based Type Inference
  • Methods2Test: A dataset of focal methods mapped to test cases
  • npm-filter: Automating the mining of dynamic information from npm packages
  • ReCover: a Curated Dataset for Regression Testing Research
  • SLNET: A Redistributable Corpus of 3rd-party Simulink Models
  • SniP: An Efficient Stack Tracing Framework for Multi-threaded Programs
  • SoCCMiner: A Source Code-Comments and Comment-Context Miner
  • SOSum: A Dataset of Stack Overflow Post Summaries
  • The General Index of Software Engineering Papers
  • The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories
  • The Unexplored Treasure Trove of Phabricator Code Reviews
  • The Unsolvable Problem or the Unheard Answer? A Dataset of 24,669 Open-Source Software Conference Talks
  • Tooling for Time- and Space-efficient git Repository Mining
  • TriggerZoo: A Dataset of Android Applications Automatically Infected with Logic Bombs
  • TSSB-3M: Mining single statement bugs at massive scale
  • TwinDroid: A Dataset of Android app System call traces and Trace Generation Pipeline
  • Vul4J: A Dataset of Reproducible Java Vulnerabilities Geared Towards the Study of Program Repair Techniques (Data and Tool Showcase Award)