MSR 2022
Mon 23 - Tue 24 May 2022 Location to be announced
co-located with ICSE 2022

Call for Papers

The Mining Software Repositories (MSR) conference is the premier conference for data science, machine learning, and artificial intelligence in software engineering. The goal of the conference is to improve software engineering practices by uncovering interesting and actionable information about software systems and projects from the vast amounts of software data held in sources such as source control systems, defect tracking systems, code review repositories, archived communications between project personnel, question-and-answer sites, CI build servers, and run-time telemetry. Mining this information can help to understand software development and evolution, software users, and runtime behavior; support the maintenance of software systems; improve software design/reuse; empirically validate novel ideas and techniques; support predictions about software development; and exploit this knowledge in planning future development. This two-day international conference aims to advance the science and practice of software engineering with data-driven techniques. The 19th International Conference on Mining Software Repositories will be held on May 23-24, 2022.

Evaluation Criteria

Research papers are expected to describe new methodologies and/or provide novel research results, and should be evaluated scientifically. While a high degree of technical rigor is expected for long papers, short research papers should discuss controversial issues in the field, or describe interesting or thought-provoking ideas that are not yet fully developed. Accepted short papers will be presented in a short lightning talk. Relevant review criteria:

  • soundness of approach
  • relevance to software engineering
  • clarity of relation with related work
  • quality of presentation
  • quality of evaluation [for long papers]
  • ability to replicate [for long papers]
  • novelty

Submission Process

All authors should use the official “ACM Primary Article Template”, available from the ACM Proceedings Template page. LaTeX users should use the sigconf option, together with the review option (which produces line numbers for easy reference by the reviewers) and the anonymous option (which omits author names). To that end, the following LaTeX code can be placed at the start of the LaTeX document:

\documentclass[sigconf,review,anonymous]{acmart}

\acmConference[MSR 2022]{MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories}{May 23–24, 2022}{Pittsburgh, PA, USA}
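
For orientation, the two lines above might sit in a complete document roughly as follows. This is only a minimal sketch, not an official template: the title, author placeholders, the bibliography file name references.bib, and the citation key doe2020 are all hypothetical and should be replaced with your own.

```latex
\documentclass[sigconf,review,anonymous]{acmart}

\acmConference[MSR 2022]{MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories}{May 23–24, 2022}{Pittsburgh, PA, USA}

\begin{document}

% Hypothetical placeholder title.
\title{Placeholder Title of the Submission}

% acmart expects author metadata; the anonymous option keeps it out of the PDF.
\author{Anonymous Author}
\affiliation{%
  \institution{Anonymous Institution}
  \country{Country}}

% In acmart, the abstract must appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
% Refer to your own prior work in the third person, e.g.:
Doe et al.~\cite{doe2020} previously studied this problem.

% Assumes a hypothetical references.bib containing the key doe2020.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```

With the review option the compiled PDF carries line numbers, and with the anonymous option the author block is suppressed, so it is worth checking both in the output before uploading.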

Submissions to the Technical Track can be made via the submission site by the submission deadline. We encourage authors to upload their paper info early (the PDF can be submitted later) to properly enter conflicts for anonymous reviewing. All submissions must adhere to the following requirements:

  • Submissions must not exceed the page limit (10 pages plus 2 additional pages of references for full papers; 4 pages plus 1 additional page of references for short papers). The page limit is strict, and it will not be possible to purchase additional pages at any point in the process (including after acceptance).
  • Submissions must strictly conform to the ACM formatting instructions. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.
  • Submissions must not reveal the authors’ identities. The authors must make every effort to honor the double-anonymous review process. In particular, the authors’ names must be omitted from the submission and references to their prior work should be in the third person. Further advice, guidance, and explanation about the double-anonymous review process can be found in the Q&A page for ICSE 2022.

Any submission that does not comply with these requirements is likely to be desk rejected by the PC Chairs without further review. In addition, by submitting to the MSR Technical Track, the authors acknowledge that they are aware of and agree to be bound by the following policies:

  • The ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. In particular, papers submitted to MSR 2022 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2022. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases (including immediate rejection and reporting of the incident to ACM/IEEE). To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.
  • The authorship policy of the ACM and the authorship policy of the IEEE.

Upon notification of acceptance, all authors of accepted papers will be asked to complete a copyright form and will receive further instructions for preparing the camera-ready version of their papers. At least one author of each paper is expected to register and present the paper at the MSR 2022 conference. All accepted contributions will be published in the electronic proceedings of the conference.

A selection of the best papers will be invited to an Empirical Software Engineering (EMSE) Special Issue. The authors of accepted papers that show outstanding contributions to the FOSS community will have a chance to self-nominate their paper for the MSR FOSS Impact Paper Award. Please note that providing a replication package is strongly recommended even for double-anonymous submissions, since not providing one effectively hinders the peer-review process. Because access to data and scripts is essential during peer review, we strongly recommend archiving data sets on online archival sites such as dropbox.com, zenodo.org or figshare.com (instructions are available in the Open Science Policy below). The latter two also issue a DOI, which makes the data set citable.

Submission Link

Papers must be submitted through HotCRP: https://msr2022-technical.hotcrp.com/

Shadow PC

We are continuing the Shadow PC process established at MSR 2021. The Shadow PC provides an opportunity to recruit and train the next generation of MSR researchers in reviewing MSR papers. The primary audience is PhD students and postdocs who have not yet served on the MSR PC. They will review papers submitted to MSR and form a parallel program committee. Experienced MSR researchers will look at the reviews and comments and give feedback to the Shadow PC. Similar processes have been run at other venues such as EuroSys and USENIX.

We ask authors to volunteer their papers for review by the Shadow PC; there will be an option for this on the submission page. This is purely a learning experience: the reviews from the Shadow PC will not be visible to the regular PC and will not affect the decision on the paper at MSR. The authors may request the reviews from the Shadow PC. The reviews and comments will be kept in a completely separate HotCRP installation to prevent any cross-over. When you volunteer, you not only help future generations of MSR researchers become better reviewers, but you also get more feedback on your work.

Important Dates

  • Abstract Deadline: Jan 17
  • Paper Deadline: Jan 20
  • Author Response Period: Feb 22 – Feb 24
  • Author Notification: March 8
  • Camera Ready Deadline: Late March

Open Science Policy

Openness in science is key to fostering progress via transparency, reproducibility and replicability. Our steering principle is that all research output should be accessible to the public and that empirical studies should be reproducible. In particular, we actively support the adoption of open data and open source principles. The following guidelines are recommendations and not mandatory. Your choice to use open science or not will not affect the review process for your paper. However, to increase reproducibility and replicability, we encourage all contributing authors to disclose:

  • the source code of relevant software used or proposed in the paper, including that used to retrieve and analyze data
  • the data used in the paper (e.g., evaluation data, anonymized survey data, etc.)
  • instructions for other researchers describing how to reproduce or replicate the results

Already upon submission, authors can privately share their anonymized data and software on preserved archives such as Zenodo or Figshare (a tutorial is available; please make sure that any links shared during peer review are anonymized). Zenodo accepts up to 50GB per dataset (more upon request). There is no need to use Dropbox or Google Drive. Once the paper is accepted, an option can be toggled to publish the data and scripts with an official DOI. Zenodo and Figshare accounts can easily be linked with GitHub repositories to automatically archive software releases. In the unlikely case that authors need to upload terabytes of data, Archive.org may be used.

After acceptance, we encourage authors to self-archive pre-prints of their papers in open, preserved repositories such as arXiv.org. This is legal and allowed by all major publishers including ACM and IEEE, and it lets anybody in the world reach your paper. Note that you are usually not allowed to self-archive the PDF of the published article (that is, the publisher proof or the Digital Library version). Instead, use the manuscript with reviewer comments addressed, but before applying the camera-ready instructions and templates. Feel free to contact the MSR 2022 PC or proceedings chairs for more details.

We recognize that anonymizing artifacts such as source code is more difficult than preserving anonymity in a paper. We ask authors to make a best effort not to reveal their identities. We will also ask reviewers to avoid trying to identify authors by looking at commit histories and other such information that is not easily anonymized. Authors wanting to share GitHub repositories may want to look into https://anonymous.4open.science/, an open source tool that helps you to quickly double-blind your repository.

Please note that the success of the open science initiative depends on the willingness (and ability) of authors to disclose their data, and that all submissions will undergo the same review process regardless of whether they disclose their analysis code or data. We encourage authors who cannot disclose industrial or otherwise non-public data, for instance due to non-disclosure agreements, to provide an explicit (short) statement in the paper.

Accepted Papers and Attendance Expectation

Accepted papers will be permitted an additional page of content to allow authors to incorporate review feedback. The page limit for published papers will therefore be 11 pages for full papers (or 5 pages for short papers), plus 2 pages that may contain only references.

After acceptance, the list of paper authors cannot be changed under any circumstances, and the list of authors on camera-ready papers must be identical to that on submitted papers. After acceptance, paper titles cannot be changed except by permission of the Program Co-Chairs, and only when referees have recommended a change for clarity or accuracy with respect to the paper content.

If a submission is accepted, at least one author of the paper is required to register for MSR 2022 and present the paper. [We will add more info on this as soon as the MSR 2022 format is finalized.]

Scope

The technical track of MSR 2022 solicits high-quality submissions on a wide range of topics related to artificial intelligence (AI), machine learning (ML), and data science (DS) in one or more of the following three main themes.

1. AI/ML/DS and SE

Submissions under this theme should present analyses that improve the understanding of development processes and practices, or that aid the development of new techniques or models to support software developers. This includes (but is not limited to) analysis or models for:

  • commits,
  • execution traces and logs,
  • interaction data,
  • code review data,
  • natural language artifacts,
  • software licenses and copyrights,
  • app store data,
  • programming language features,
  • release information,
  • CI logs,
  • deployment and delivery,
  • test data,
  • runtime information,
  • software ecosystems,
  • defect and software quality data,
  • human and social aspects of development,
  • development process,
  • energy profile data.

2. New techniques, tools, and models

The techniques, tools, and models should facilitate new ways to mine, analyze, or model software data. A submission could include (but is not limited to) techniques, tools, or models to:

  • capture new forms of data,
  • integrate data from multiple sources,
  • visualize software data,
  • model software data,
  • solve SE problems,
  • improve AI/ML/DS.

3. Considerations related to AI/ML/DS and SE

These submissions should reflect on the current state-of-the-art research methods or current practices in mining, analyzing, or modeling software data. These submissions can also propose new research methods or guidelines. This theme includes topics such as (but not limited to):

  • privacy of collected data,
  • ethics of mining, analyzing, or modeling software data,
  • biases in software data, analyses, and tools,
  • fairness in software data, analyses, and tools,
  • replication studies.