MSR 2023
Dates to be announced Melbourne, Australia
co-located with ICSE 2023

The MSR Data/Tool Showcase track aims to actively promote and recognize the creation of reusable datasets and tools that are designed and built not only for a specific research project, but for the MSR community as a whole. These datasets and tools should enable other practitioners and researchers to jumpstart their own research efforts, and also enable the reproducibility of earlier work. The MSR Data/Tool Showcase papers can be descriptions of datasets or tools built by the authors that can be used by other practitioners or researchers, and/or descriptions of the use of tools built by others to obtain specific research results.

Call for Papers

The MSR’23 Data and Tools Showcase Track accepts two types of submissions: (1) data showcase papers and (2) reusable tool showcase papers.

  1. Data showcase submissions are expected to include:

    • a description of the data source,
    • a description of the methodology used to gather the data (including provenance and the tool used to create/generate/gather the data, if any),
    • a description of the storage mechanism, including a schema if applicable,
    • if the data has been used by the authors or others, a description of how this was done including references to previously published papers,
    • a description of the originality of the dataset (that is, even if the dataset has been used in a published paper, its complete description must be unpublished) and similar existing datasets (if any),
    • ideas for future research questions that could be answered using the dataset,
    • ideas for further improvements that could be made to the dataset, and
    • any limitations and/or challenges in creating or using the dataset.

  2. Reusable Tool showcase submissions are expected to include:

    • a description of the tool, which includes the background, motivation, novelty, overall architecture, detailed design, and preliminary evaluation of the tool, as well as the link to download or access the tool,
    • a description of the design of the tool, and how to use the tool in practice,
    • clear installation instructions and an example dataset that allow the reviewers to run the tool,
    • if the tool has been used by the authors or others, a description of how the tool was used, including references to previously published papers,
    • ideas for future reusability of the tool, and
    • any limitations of using the tool.
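
As an illustration of the "storage mechanism, including a schema" item in the data showcase list above, a paper or its accompanying README might pair the schema with a short access example. The following is a minimal sketch, assuming a hypothetical commit dataset stored in SQLite; the table names, column names, and sample rows are invented for illustration and are not prescribed by the track.

```python
import sqlite3

# Hypothetical schema for an illustrative commit dataset (all names are assumptions).
SCHEMA = """
CREATE TABLE repository (
    id             INTEGER PRIMARY KEY,
    url            TEXT NOT NULL UNIQUE,
    collected_with TEXT NOT NULL,  -- provenance: tool and version used to gather the data
    collected_on   TEXT NOT NULL   -- provenance: ISO 8601 collection date
);
CREATE TABLE commit_record (
    sha           TEXT PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repository(id),
    author_date   TEXT NOT NULL,
    message       TEXT
);
"""


def build_example_db() -> sqlite3.Connection:
    """Create an in-memory database with the schema and a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO repository (id, url, collected_with, collected_on) "
        "VALUES (?, ?, ?, ?)",
        (1, "https://example.org/demo.git", "hypothetical-miner 0.1", "2023-01-01"),
    )
    conn.executemany(
        "INSERT INTO commit_record (sha, repository_id, author_date, message) "
        "VALUES (?, ?, ?, ?)",
        [
            ("a1b2c3", 1, "2022-12-01T10:00:00Z", "Initial commit"),
            ("d4e5f6", 1, "2022-12-02T11:30:00Z", "Fix parser bug"),
        ],
    )
    conn.commit()
    return conn


def commits_per_repository(conn: sqlite3.Connection) -> list:
    """Return (repository URL, commit count) pairs, ordered by URL."""
    query = (
        "SELECT r.url, COUNT(c.sha) "
        "FROM repository r "
        "LEFT JOIN commit_record c ON c.repository_id = r.id "
        "GROUP BY r.id ORDER BY r.url"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for url, count in commits_per_repository(build_example_db()):
        print(url, count)
```

Keeping provenance columns next to the data is one possible design; a separate metadata file would serve the same purpose.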

The dataset or tool should be made available when the paper is submitted for review, but it will be considered confidential until the paper is published. It should include detailed instructions on how to set up the environment (e.g., requirements.txt) and how to use the dataset or tool (e.g., how to import the data, how to access the data once it has been imported, or how to use the tool with a running example).
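
One way to make the environment instructions easier for reviewers to verify is to ship a small check script next to the requirements.txt file. The sketch below is an assumption about how such a check could look, not a track requirement: it handles only simple `name==version` pins (other requirement forms are skipped), and the function names are hypothetical.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract simple 'name==version' pins; comments and other forms are ignored."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Assumes a requirements.txt file in the current working directory.
    with open("requirements.txt", encoding="utf-8") as handle:
        problems = find_mismatches(parse_pins(handle.read()))
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
```

The `lookup` parameter is injectable so that the check can be exercised without installing anything.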

At a minimum, upon publication of the paper, the authors should archive the data or tool in a persistent repository that can provide a digital object identifier (DOI), such as zenodo.org, figshare.com, Archive.org, or an institutional repository. In addition, the DOI-based citation of the dataset or tool should be included in the camera-ready version of the paper. GitHub provides an easy way to make source code citable (with third-party tools and with a CITATION.cff file).
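
For the citation file mentioned above, GitHub recognizes a file named CITATION.cff written in the Citation File Format. The following sketch renders a minimal file from Python; the title, author names, version, date, and DOI are placeholders invented for illustration, and the helper name `render_citation_cff` is hypothetical. Values containing double quotes would need escaping, which this sketch does not handle.

```python
from typing import Iterable, Tuple


def render_citation_cff(
    title: str,
    authors: Iterable[Tuple[str, str]],
    doi: str,
    version: str,
    date_released: str,
) -> str:
    """Render a minimal CITATION.cff document as text.

    `authors` is an iterable of (family name, given name) pairs and
    `date_released` is expected in YYYY-MM-DD form.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this artifact, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    lines.append(f'doi: "{doi}"')
    lines.append(f'version: "{version}"')
    lines.append(f'date-released: "{date_released}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders, not a real artifact or DOI.
    text = render_citation_cff(
        title="Example Mining Dataset",
        authors=[("Doe", "Jane"), ("Roe", "Richard")],
        doi="10.5281/zenodo.0000000",
        version="1.0.0",
        date_released="2023-03-16",
    )
    with open("CITATION.cff", "w", encoding="utf-8") as handle:
        handle.write(text)
```

The DOI in the file should be the one issued by the archival repository for the specific release that the paper describes.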

Data and Tools showcase submissions are not:

  • empirical studies, or
  • datasets that are based on poorly explained or untrustworthy heuristics for data collection, or results of trivial application of generic tools.

If custom tools have been used to create the dataset, we expect the paper to be accompanied by the source code of the tools, along with clear documentation on how to run the tools to recreate the dataset. The tools should be open source, accompanied by an appropriate license; the source code should be citable, i.e., refer to a specific release and have a DOI. If you cannot provide the source code or the source code clause is not applicable (e.g., because the dataset consists of qualitative data), please provide a short explanation of why this is not possible.

Evaluation Criteria

The review criteria for Data/Tool Showcase submissions are as follows:

  • value, usefulness, and reusability of the datasets or tools.
  • quality of the presentation.
  • clarity of relation with related work and its relevance to mining software repositories.
  • availability of the datasets or tools.

Important Dates

  • Paper Deadline: Thursday 26th January 2023
  • Author Notification: Tuesday 7th March 2023
  • Camera Ready Deadline: Thursday 16th March 2023

Submission

Submit your paper (maximum 4 pages, plus 1 additional page of references) via the HotCRP submission site: https://msr2023-data-tool.hotcrp.com/.

Submitted papers will undergo single-blind peer review. We opt for single-blind review (as opposed to the double-blind review of the main track) because of the requirement above to describe how the data or tool has been used in previous studies, including bibliographic references to those studies. Such references are likely to disclose the authors’ identity.

To make research datasets and research software accessible and citable, we further encourage authors to follow the FAIR principles, i.e., data should be Findable, Accessible, Interoperable, and Reusable.

Submissions must conform to the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf options).

Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere for the duration of consideration. ACM plagiarism policies and procedures shall be followed for cases of double submission. The submission must also comply with the IEEE Policy on Authorship. Please read the ACM Policy on Plagiarism, Misrepresentation, and Falsification and the IEEE - Introduction to the Guidelines for Handling Plagiarism Complaints before submitting.

Upon notification of acceptance, all authors of accepted papers will be asked to complete a copyright form and will receive further instructions for preparing their camera-ready versions. At least one author of each paper is expected to register and present the results at the MSR 2023 conference. All accepted contributions will be published in the conference electronic proceedings.