MSR 2022
Dates and location to be announced
co-located with ICSE 2022

The International Conference on Mining Software Repositories (MSR) has hosted a mining challenge since 2006. With this challenge, we call upon everyone interested to apply their tools to a common data set. The challenge dares researchers and practitioners to put their mining tools and approaches to the test.

Call for Mining Challenge Proposals

One of the secret ingredients behind the success of the International Conference on Mining Software Repositories (MSR) is its annual Mining Challenge, in which MSR participants can showcase their techniques, tools and creativity on a common data set. In true MSR fashion, this data set is a real data set contributed by researchers in the community, solicited through an open call.

There are many benefits of sharing a data set for the MSR Mining Challenge. The selected challenge proposal explaining the data set will appear in the MSR 2022 proceedings, and the challenge papers using the data set will be required to cite the challenge proposal or an existing paper of the researchers about the selected data set. Furthermore, the authors of the data set will join the MSR 2022 organizing committee as Mining Challenge (co-)chair(s), who will oversee the reviewing process (e.g., recruiting a Challenge PC, managing submissions and review assignments). Finally, it is not uncommon for challenge data sets to feature in MSR and other publications well after the edition of the conference in which they appear!

If you would like to submit your data set for consideration for the 2022 MSR Mining Challenge, please submit a one-page proposal with up to three pages of appendices at https://msr2022-challenge-proposals.hotcrp.com/, containing the following information:

  1. Title of data set.
  2. What does the data set contain?
  3. How large is it?
  4. How accessible is it and how can the data be obtained?
  5. How representative is it?
  6. Does it require specialized tools to mine it?
  7. What skills, infrastructure, and/or credentials would challenge participants need to work with the data set?
  8. What kinds of questions do you expect challenge participants to answer?
  9. A link to a (sub)sample of the data for the organizing committee to peruse (e.g., via GitHub, Zenodo, Figshare).

Each submission must conform to the ACM formatting instructions. Templates are available here.

The first task of the authors of the selected proposal will be to prepare the Call for Challenge Papers, which outlines the expected content and structure of submissions, as well as the technical details of how to access and analyze the data set. This call will be published on the MSR website on August 5th, 2021. By making the challenge data set available by late summer, we hope that many students will be able to use it for their graduate class projects.

Important Dates

  • Deadline for proposals: July 1st, 2021
  • Notification: July 19th, 2021
  • Call for Challenge Papers Published: August 5th, 2021
  • Challenge PC formed: TBD
  • Submission Deadline for Challenge Papers: TBD