Mon 15 May (displayed time zone: Hobart)
09:00 - 10:30 | Opening Session & Award Talks | MSR Awards / MIP Award at Meeting Room 109 Chair(s): Emad Shihab Concordia University, Bogdan Vasilescu Carnegie Mellon University | ||
09:00 20mDay opening | Opening Session & Award Announcements MSR Awards Emad Shihab Concordia University, Patanamon Thongtanunam The University of Melbourne, Bogdan Vasilescu Carnegie Mellon University | ||
09:20 20mTalk | MSR 2023 Foundational Contribution Award MSR Awards | ||
09:40 20mTalk | MSR 2023 Ric Holt Early Career Achievement Award MSR Awards Li Li Beihang University | ||
10:00 30mTalk | MIP #1: Mining Source Code Repositories at Massive Scale Using Language Modeling MIP Award |
11:00 - 11:45 | SE for ML | Data and Tool Showcase Track / Technical Papers at Meeting Room 110 Chair(s): Sarah Nadi University of Alberta | ||
11:00 12mTalk | AutoML from Software Engineering Perspective: Landscapes and Challenges (Distinguished Paper Award) Technical Papers Chao Wang Peking University, Zhenpeng Chen University College London, UK, Minghui Zhou Peking University Pre-print | ||
11:12 12mTalk | Characterizing and Understanding Software Security Vulnerabilities in Machine Learning Libraries Technical Papers Nima Shiri Harzevili York University, Jiho Shin York University, Junjie Wang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Song Wang York University, Nachiappan Nagappan Facebook | ||
11:24 6mTalk | DeepScenario: An Open Driving Scenario Dataset for Autonomous Driving System Testing Data and Tool Showcase Track Chengjie Lu Simula Research Laboratory and University of Oslo, Tao Yue Simula Research Laboratory, Shaukat Ali Simula Research Laboratory Pre-print | ||
11:30 6mTalk | NICHE: A Curated Dataset of Engineered Machine Learning Projects in Python Data and Tool Showcase Track Ratnadira Widyasari Singapore Management University, Singapore, Zhou Yang Singapore Management University, Ferdian Thung Singapore Management University, Sheng Qin Sim Singapore Management University, Singapore, Fiona Wee Singapore Management University, Singapore, Camellia Lok Singapore Management University, Singapore, Jack Phan Singapore Management University, Singapore, Haodi Qi Singapore Management University, Singapore, Constance Tan Singapore Management University, Singapore, Qijin Tay Singapore Management University, Singapore, David Lo Singapore Management University | ||
11:36 6mTalk | PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages Data and Tool Showcase Track Wenxin Jiang Purdue University, Nicholas Synovic Loyola University Chicago, Purvish Jajal Purdue University, Taylor R. Schorlemmer Purdue University, Arav Tewari Purdue University, Bhavesh Pareek Purdue University, George K. Thiruvathukal Loyola University Chicago and Argonne National Laboratory, James C. Davis Purdue University Pre-print |
11:50 - 12:35 | Documentation + Q&A I | Data and Tool Showcase Track / Technical Papers at Meeting Room 109 Chair(s): Ahmad Abdellatif Concordia University | ||
11:50 12mTalk | Evaluating Software Documentation Quality Technical Papers | ||
12:02 12mTalk | What Do Users Ask in Open-Source AI Repositories? An Empirical Study of GitHub Issues Technical Papers Zhou Yang Singapore Management University, Chenyu Wang Singapore Management University, Jieke Shi Singapore Management University, Thong Hoang CSIRO's Data61, Pavneet Singh Kochhar Microsoft, Qinghua Lu CSIRO's Data61, Zhenchang Xing, David Lo Singapore Management University | ||
12:14 12mTalk | PICASO: Enhancing API Recommendations with Relevant Stack Overflow Posts Technical Papers Ivana Clairine Irsan Singapore Management University, Ting Zhang Singapore Management University, Ferdian Thung Singapore Management University, Kisub Kim Singapore Management University, David Lo Singapore Management University | ||
12:26 6mTalk | GIRT-Data: Sampling GitHub Issue Report Templates Data and Tool Showcase Track Nafiseh Nikehgbal Sharif University of Technology, Amir Hossein Kargaran LMU Munich, Abbas Heydarnoori Bowling Green State University, Hinrich Schütze LMU Munich Pre-print |
13:45 - 14:15 | |||
13:45 30mTalk | MIP #2: The Impact of Tangled Code Changes MIP Award |
14:20 - 15:15 | Language Models | Technical Papers at Meeting Room 109 Chair(s): Patanamon Thongtanunam University of Melbourne | ||
14:20 12mTalk | On Codex Prompt Engineering for OCL Generation: An Empirical Study Technical Papers Seif Abukhalaf Polytechnique Montreal, Mohammad Hamdaqa Polytechnique Montréal, Foutse Khomh Polytechnique Montréal | ||
14:32 12mTalk | Cross-Domain Evaluation of a Deep Learning-Based Type Inference System Technical Papers Bernd Gruner DLR Institute of Data Science, Tim Sonnekalb German Aerospace Center (DLR), Thomas S. Heinze Cooperative University Gera-Eisenach, Clemens-Alexander Brust German Aerospace Center (DLR) | ||
14:44 12mTalk | Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study Technical Papers Tim van Dam Delft University of Technology, Maliheh Izadi Delft University of Technology, Arie van Deursen Delft University of Technology Pre-print | ||
14:56 12mTalk | Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models Technical Papers Iman Saberi University of British Columbia Okanagan, Fatemeh Hendijani Fard University of British Columbia |
14:20 - 15:15 | Understanding Defects | Registered Reports / Data and Tool Showcase Track / Technical Papers at Meeting Room 110 Chair(s): Matteo Paltenghi University of Stuttgart, Germany | ||
14:20 12mTalk | What Happens When We Fuzz? Investigating OSS-Fuzz Bug History Technical Papers Brandon Keller Rochester Institute of Technology, Benjamin S. Meyers Rochester Institute of Technology, Andrew Meneely Rochester Institute of Technology | ||
14:32 12mTalk | An Empirical Study of High Performance Computing (HPC) Performance Bugs Technical Papers Md Abul Kalam Azad University of Michigan - Dearborn, Nafees Iqbal University of Michigan - Dearborn, Foyzul Hassan University of Michigan - Dearborn, Probir Roy University of Michigan at Dearborn Pre-print | ||
14:44 6mTalk | Semantically-enriched Jira Issue Tracking Data Data and Tool Showcase Track Themistoklis Diamantopoulos Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki, Dimitrios-Nikitas Nastos Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Andreas Symeonidis Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki Pre-print | ||
14:50 6mTalk | An exploratory study of bug introducing changes: what happens when bugs are introduced in open source software? Registered Reports Lukas Schulte University of Passau, Anamaria Mojica-Hanke University of Passau and Universidad de los Andes, Mario Linares-Vasquez Universidad de los Andes, Steffen Herbold University of Passau | ||
14:56 6mTalk | HasBugs - Handpicked Haskell Bugs Data and Tool Showcase Track | ||
15:02 6mTalk | An Empirical Study on the Performance of Individual Issue Label Prediction Technical Papers |
15:45 - 16:30 | |||
15:45 45mTalk | Tutorial: Recognizing Developers' Emotions Using Non-invasive Biometrics Sensors Tutorials Nicole Novielli University of Bari |
16:35 - 17:20 | Ethics & Energy | Technical Papers / Registered Reports at Meeting Room 109 Chair(s): Arumoy Shome Delft University of Technology | ||
16:35 12mTalk | Energy Consumption Estimation of API-usage in Mobile Apps via Static Analysis Technical Papers Abdul Ali Bangash University of Alberta, Canada, Qasim Jamal FAST National University, Kalvin Eng University of Alberta, Karim Ali University of Alberta, Abram Hindle University of Alberta Pre-print | ||
16:47 12mTalk | An Exploratory Study on Energy Consumption of Dataframe Processing Libraries Technical Papers Pre-print | ||
16:59 6mTalk | Understanding issues related to personal data and data protection in open source projects on GitHub Registered Reports Anne Hennig Karlsruhe Institute of Technology, Lukas Schulte University of Passau, Steffen Herbold University of Passau, Oksana Kulyk IT University of Copenhagen, Denmark, Peter Mayer University of Southern Denmark | ||
17:05 12mTalk | Whistleblowing and Tech on Twitter Technical Papers Laura Duits Vrije Universiteit Amsterdam, Isha Kashyap Vrije Universiteit Amsterdam, Joey Bekkink Vrije Universiteit Amsterdam, Kousar Aslam Vrije Universiteit Amsterdam, Emitzá Guzmán Vrije Universiteit Amsterdam |
16:35 - 17:20 | Security | Technical Papers / Data and Tool Showcase Track at Meeting Room 110 Chair(s): Chanchal K. Roy University of Saskatchewan | ||
16:35 12mTalk | UNGOML: Automated Classification of unsafe Usages in Go Technical Papers Anna-Katharina Wickert TU Darmstadt, Germany, Clemens Damke University of Munich (LMU), Lars Baumgärtner Technische Universität Darmstadt, Eyke Hüllermeier University of Munich (LMU), Mira Mezini TU Darmstadt Pre-print File Attached | ||
16:47 12mTalk | Connecting the .dotfiles: Checked-In Secret Exposure with Extra (Lateral Movement) Steps Technical Papers Gerhard Jungwirth TU Wien, Aakanksha Saha TU Wien, Michael Schröder TU Wien, Tobias Fiebig Max-Planck-Institut für Informatik, Martina Lindorfer TU Wien, Jürgen Cito TU Wien Pre-print | ||
16:59 12mTalk | MANDO-HGT: Heterogeneous Graph Transformers for Smart Contract Vulnerability Detection Technical Papers Hoang H. Nguyen L3S Research Center, Leibniz Universität Hannover, Hannover, Germany, Nhat-Minh Nguyen Singapore Management University, Singapore, Chunyao Xie L3S Research Center, Leibniz Universität Hannover, Germany, Zahra Ahmadi L3S Research Center, Leibniz Universität Hannover, Hannover, Germany, Daniel Kudenko L3S Research Center, Leibniz Universität Hannover, Germany, Thanh-Nam Doan Independent Researcher, Atlanta, Georgia, USA, Lingxiao Jiang Singapore Management University Pre-print Media Attached | ||
17:11 6mTalk | SecretBench: A Dataset of Software Secrets Data and Tool Showcase Track Setu Kumar Basak North Carolina State University, Lorenzo Neil North Carolina State University, Bradley Reaves North Carolina State University, Laurie Williams North Carolina State University Pre-print |
18:00 - 21:00 | |||
18:00 3hMeeting | MSR Dinner at Cargo Hall, South Wharf Technical Papers |
Tue 16 May (displayed time zone: Hobart)
09:00 - 09:45 | |||
09:00 45mKeynote | Towards Code-Aware AI Models for Code Keynotes |
09:50 - 10:30 | Tutorial #2 | Tutorials at Meeting Room 109 Chair(s): Alexander Serebrenik Eindhoven University of Technology | ||
09:50 40mTutorial | Tutorial: Mining and Analysing Collaboration in git Repositories with git2net Tutorials Christoph Gote Chair of Systems Design, ETH Zurich |
09:50 - 10:30 | Mining Challenge | Mining Challenge at Meeting Room 110 Chair(s): Audris Mockus The University of Tennessee | ||
09:50 6mTalk | An Empirical Study to Investigate Collaboration Among Developers in Open Source Software (OSS) Mining Challenge Weijie Sun University of Alberta, Samuel Iwuchukwu University of Alberta, Abdul Ali Bangash University of Alberta, Canada, Abram Hindle University of Alberta Pre-print | ||
09:56 6mTalk | Insights into Female Contributions in Open-Source Projects Mining Challenge Arifa Islam Champa Idaho State University, Md Fazle Rabbi Idaho State University, Minhaz F. Zibran Idaho State University, Md Rakibul Islam University of Wisconsin - Eau Claire Pre-print | ||
10:02 6mTalk | The Secret Life of CVEs Mining Challenge Piotr Przymus Nicolaus Copernicus University in Toruń, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Stencel University of Warsaw Pre-print | ||
10:08 6mTalk | Evolution of the Practice of Software Testing in Java Projects Mining Challenge Anisha Islam Department of Computing Science, University of Alberta, Nipuni Tharushika Hewage Department of Computing Science, University of Alberta, Abdul Ali Bangash University of Alberta, Canada, Abram Hindle University of Alberta Pre-print | ||
10:14 6mTalk | Keep the Ball Rolling: Analyzing Release Cadence in GitHub Projects Mining Challenge Oz Kilic Carleton University, Nathaniel Bowness University of Ottawa, Olga Baysal Carleton University Pre-print |
11:00 - 11:45 | Documentation + Q&A II | Technical Papers / Data and Tool Showcase Track at Meeting Room 109 Chair(s): Maram Assi Queen's University | ||
11:00 12mTalk | Understanding the Role of Images on Stack Overflow Technical Papers Dong Wang Kyushu University, Japan, Tao Xiao Nara Institute of Science and Technology, Christoph Treude University of Melbourne, Raula Gaikovina Kula Nara Institute of Science and Technology, Hideaki Hata Shinshu University, Yasutaka Kamei Kyushu University Pre-print | ||
11:12 12mTalk | Do Subjectivity and Objectivity Always Agree? A Case Study with Stack Overflow Questions Technical Papers Saikat Mondal University of Saskatchewan, Masud Rahman Dalhousie University, Chanchal K. Roy University of Saskatchewan Pre-print | ||
11:24 6mTalk | GiveMeLabeledIssues: An Open Source Issue Recommendation System Data and Tool Showcase Track Joseph Vargovich Northern Arizona University, Fabio Marcos De Abreu Santos Northern Arizona University, USA, Jacob Penney Northern Arizona University, Marco Gerosa Northern Arizona University, Igor Steinmacher Northern Arizona University Pre-print Media Attached | ||
11:30 6mTalk | DocMine: A Software Documentation-Related Dataset of 950 GitHub Repositories Data and Tool Showcase Track | ||
11:36 6mTalk | PENTACET data - 23 Million Code Comments and 500,000 SATD comments Data and Tool Showcase Track Murali Sridharan University of Oulu, Leevi Rantala University of Oulu, Mika Mäntylä University of Oulu |
11:00 - 11:45 | Code Smells | Technical Papers / Industry Track / Data and Tool Showcase Track at Meeting Room 110 Chair(s): Md Tajmilur Rahman Gannon University | ||
11:00 12mTalk | Don't Forget the Exception! Considering Robustness Changes to Identify Design Problems Technical Papers Anderson Oliveira PUC-Rio, João Lucas Correia Federal University of Alagoas, Leonardo Da Silva Sousa Carnegie Mellon University, USA, Wesley Assunção Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil, Daniel Coutinho PUC-Rio, Alessandro Garcia PUC-Rio, Willian Oizumi GoTo, Caio Barbosa UFAL, Anderson Uchôa Federal University of Ceará, Juliana Alves Pereira PUC-Rio Pre-print | ||
11:12 12mTalk | Pre-trained Model Based Feature Envy Detection Technical Papers mawenhao Wuhan University, Yaoxiang Yu Wuhan University, Xiaoming Ruan Wuhan University, Bo Cai Wuhan University | ||
11:24 6mTalk | CLEAN++: Code Smells Extraction for C++ Data and Tool Showcase Track Tom Mashiach Ben Gurion University of the Negev, Israel, Bruno Sotto-Mayor Ben Gurion University of the Negev, Israel, Gal Kaminka Bar Ilan University, Israel, Meir Kalech Ben Gurion University of the Negev, Israel | ||
11:30 6mTalk | DACOS-A Manually Annotated Dataset of Code Smells Data and Tool Showcase Track Himesh Nandani Dalhousie University, Mootez Saad Dalhousie University, Tushar Sharma Dalhousie University Pre-print File Attached | ||
11:36 6mTalk | What Warnings Do Engineers Really Fix? The Compiler That Cried Wolf Industry Track Gunnar Kudrjavets University of Groningen, Aditya Kumar Snap, Inc., Ayushi Rastogi University of Groningen, The Netherlands Pre-print |
11:50 - 12:35 | Development Tools & Practices II | Data and Tool Showcase Track / Industry Track / Technical Papers / Registered Reports at Meeting Room 109 Chair(s): Banani Roy University of Saskatchewan | ||
11:50 12mTalk | Automating Arduino Programming: From Hardware Setups to Sample Source Code Generation Technical Papers Imam Nur Bani Yusuf Singapore Management University, Singapore, Diyanah Binte Abdul Jamal Singapore Management University, Lingxiao Jiang Singapore Management University Pre-print | ||
12:02 6mTalk | A Dataset of Bot and Human Activities in GitHub Data and Tool Showcase Track Natarajan Chidambaram University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons | ||
12:08 6mTalk | Mining the Characteristics of Jupyter Notebooks in Data Science Projects Registered Reports Morakot Choetkiertikul Mahidol University, Thailand, Apirak Hoonlor Mahidol University, Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Siripen Pongpaichet Mahidol University, Thanwadee Sunetnanta Mahidol University, Tasha Settewong Mahidol University, Raula Gaikovina Kula Nara Institute of Science and Technology | ||
12:14 6mTalk | Optimizing Duplicate Size Thresholds in IDEs Industry Track Konstantin Grotov JetBrains Research, Constructor University, Sergey Titov JetBrains Research, Alexandr Suhinin JetBrains, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research Pre-print | ||
12:20 12mTalk | Boosting Just-in-Time Defect Prediction with Specific Features of C Programming Languages in Code Changes Technical Papers Chao Ni Zhejiang University, xiaodanxu College of Computer Science and Technology, Zhejiang university, Kaiwen Yang Zhejiang University, David Lo Singapore Management University |
11:50 - 12:35 | Software Libraries & Ecosystems | Technical Papers / Industry Track / Data and Tool Showcase Track at Meeting Room 110 Chair(s): Mehdi Keshani Delft University of Technology | ||
11:50 12mTalk | A Large Scale Analysis of Semantic Versioning in NPM Technical Papers Donald Pinckney Northeastern University, Federico Cassano Northeastern University, Arjun Guha Northeastern University and Roblox Research, Jonathan Bell Northeastern University Pre-print | ||
12:02 12mTalk | Phylogenetic Analysis of Reticulate Software Evolution Technical Papers Akira Mori National Institute of Advanced Industrial Science and Technology, Japan, Masatomo Hashimoto Chiba Institute of Technology, Japan | ||
12:14 6mTalk | PyMigBench: A Benchmark for Python Library Migration Data and Tool Showcase Track Mohayeminul Islam University of Alberta, Ajay Jha North Dakota State University, Sarah Nadi University of Alberta, Ildar Akhmetov University of Alberta Pre-print | ||
12:20 6mTalk | Determining Open Source Project Boundaries Industry Track Sophia Vargas Google | ||
12:26 6mTalk | Intertwining Communities: Exploring Libraries that Cross Software Ecosystems Technical Papers Kanchanok Kannee Nara Institute of Science and Technology, Raula Gaikovina Kula Nara Institute of Science and Technology, Supatsara Wattanakriengkrai Nara Institute of Science and Technology, Kenichi Matsumoto Nara Institute of Science and Technology Pre-print |
13:45 - 14:30 | Tutorial #3 | Tutorials at Meeting Room 109 Chair(s): Alexander Serebrenik Eindhoven University of Technology | ||
13:45 45mTutorial | Tutorial: Beyond the leading edge. What else is out there? Tutorials Tim Menzies North Carolina State University Pre-print |
13:45 - 14:30 | Software Quality | Data and Tool Showcase Track / Technical Papers at Meeting Room 110 Chair(s): Tushar Sharma Dalhousie University | ||
13:45 12mTalk | Helm Charts for Kubernetes Applications: Evolution, Outdatedness and Security Risks Technical Papers Ahmed Zerouali Vrije Universiteit Brussel, Ruben Opdebeeck Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel Pre-print | ||
13:57 12mTalk | Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort? Technical Papers Ruben Opdebeeck Vrije Universiteit Brussel, Ahmed Zerouali Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel Pre-print | ||
14:09 12mTalk | Method Chaining Redux: An Empirical Study of Method Chaining in Java, Kotlin, and Python Technical Papers Pre-print Media Attached | ||
14:21 6mTalk | Snapshot Testing Dataset Data and Tool Showcase Track |
14:35 - 15:15 | Defect Prediction | Data and Tool Showcase Track / Technical Papers at Meeting Room 109 Chair(s): Sarra Habchi Ubisoft | ||
14:35 12mTalk | Large Language Models and Simple, Stupid Bugs Technical Papers Kevin Jesse University of California at Davis, USA, Toufique Ahmed University of California at Davis, Prem Devanbu University of California at Davis, Emily Morgan University of California, Davis Pre-print | ||
14:47 12mTalk | The ABLoTS Approach for Bug Localization: is it replicable and generalizable? (Distinguished Paper Award) Technical Papers Feifei Niu University of Ottawa, Christoph Mayr-Dorn Johannes Kepler University Linz, Wesley Assunção Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil, Liguo Huang Southern Methodist University, Jidong Ge Nanjing University, Bin Luo Nanjing University, Alexander Egyed Johannes Kepler University Linz Pre-print File Attached | ||
14:59 6mTalk | LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations Data and Tool Showcase Track Catherine Tony Hamburg University of Technology, Markus Mutas Hamburg University of Technology, Nicolás E. Díaz Ferreyra Hamburg University of Technology, Riccardo Scandariato Hamburg University of Technology Pre-print | ||
15:05 6mTalk | Defectors: A Large, Diverse Python Dataset for Defect Prediction Data and Tool Showcase Track Parvez Mahbub Dalhousie University, Ohiduzzaman Shuvo Dalhousie University, Masud Rahman Dalhousie University Pre-print |
14:35 - 15:15 | Human Aspects | Technical Papers / Data and Tool Showcase Track at Meeting Room 110 Chair(s): Alexander Serebrenik Eindhoven University of Technology | ||
14:35 12mTalk | A Study of Gender Discussions in Mobile Apps Technical Papers Mojtaba Shahin RMIT University, Australia, Mansooreh Zahedi The University of Melbourne, Hourieh Khalajzadeh Deakin University, Australia, Ali Rezaei Nasab Shiraz University Pre-print | ||
14:47 12mTalk | Tell Me Who Are You Talking to and I Will Tell You What Issues Need Your Skills Technical Papers Fabio Marcos De Abreu Santos Northern Arizona University, USA, Jacob Penney Northern Arizona University, João Felipe Pimentel Northern Arizona University, Igor Wiese Federal University of Technology, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University Pre-print | ||
14:59 6mTalk | She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models Technical Papers Pre-print Media Attached | ||
15:05 6mTalk | GitHub OSS Governance File Dataset Data and Tool Showcase Track Yibo Yan University of California, Davis, Seth Frey University of California, Davis, Amy Zhang University of Washington, Seattle, Vladimir Filkov University of California at Davis, USA, Likang Yin University of California at Davis Pre-print |
15:45 - 17:30 | Closing Session | Vision and Reflection / MSR Awards at Meeting Room 109 Chair(s): Patanamon Thongtanunam The University of Melbourne | ||
15:45 20mTalk | MSR 2023 Doctoral Research Award MSR Awards Eman Abdullah AlOmar Stevens Institute of Technology | ||
16:05 30mTalk | Open Source Software Digital Sociology: Quantifying and Understanding Large Complex Open Source Ecosystems Vision and Reflection Minghui Zhou Peking University | ||
16:35 30mTalk | Human-Centered AI for SE: Reflection and Vision Vision and Reflection David Lo Singapore Management University | ||
17:05 25mDay closing | Closing MSR Awards Emad Shihab Concordia University |
Call for Papers
The International Conference on Mining Software Repositories (MSR) is the premier conference for data science (DS), machine learning (ML), and artificial intelligence (AI) in software engineering. There are vast amounts of data available in software-related repositories, such as source control systems, defect trackers, code review repositories, app stores, archived communications between project personnel, question-and-answer sites, CI build servers, package registries, and run-time telemetry. The MSR conference invites significant research contributions in which such data play a central role. MSR research track submissions using data from software repositories, either solely or combined with data from other sources, can take many forms, including: studies applying existing DS/ML/AI techniques to better understand the practice of software engineering, software users, and software behavior; empirically-validated applications of existing or novel DS/ML/AI-based techniques to improve software development and support the maintenance of software systems; and cross-cutting concerns around the engineering of DS/ML/AI-enabled software systems.
The 20th International Conference on Mining Software Repositories will be held on May 15-16, 2023, in Melbourne, Australia.
Evaluation Criteria
We invite both full (maximum 10 pages plus 2 additional pages of references) and short (4 pages plus 1 additional page of references) papers to the Research Track. Full papers are expected to describe new techniques and/or provide novel research results, should have a high degree of technical rigor, and should be evaluated scientifically. Short papers are expected to discuss controversial issues in the field, or describe interesting or thought-provoking ideas that are not yet fully developed. Submissions will be evaluated according to the following criteria:
- Soundness: The extent to which the paper’s contributions (be it novel approaches, applications of existing techniques to new problems, empirical studies, or otherwise) address its research questions and are supported by rigorous application of appropriate research methods. It is expected that short papers provide narrower contributions, and therefore more limited evaluations, compared to full papers.
- Relevance: The extent to which the paper argues or demonstrates that its contributions help fill an important knowledge gap or help solve an important practical problem in the field of software engineering.
- Novelty: The extent to which the paper’s contributions are sufficiently original with respect to the state of the art or add substantially to the state of the knowledge. Note: This does not in any way discourage well-motivated replication studies.
- Presentation: The extent to which the paper’s rhetorical or logical argumentation is well structured and clear, the contributions are clearly articulated, the figures and tables are legible, and the use of the English language is adequate. All submissions must adhere to the formatting instructions provided below.
- Replicability: The extent to which the paper’s claims can be independently verified through available replication packages and/or sufficient information included in the paper to understand how data was obtained, analyzed, and interpreted, or how a proposed technique works. All submissions are expected to adhere to the Open Science policy below.
Junior PC (new for 2023)
Following two successful editions of the MSR Shadow PC in 2021 and 2022 (see also this paper and this presentation for more context), MSR 2023 will integrate the junior reviewers into the main technical track program committee!
The main goal remains unchanged: to train the next generation of MSR (and, more broadly, SE) reviewers and program committee members, in response to the widely recognized challenge of scaling peer review capacity as the research community and the volume of submissions grow over time. As with the previous Shadow PC, the primary audience for the Junior PC is early-career researchers (PhD students, postdocs, new faculty members, and industry practitioners) who are keen to get more involved in the academic peer-review process but have not yet served on a technical research track program committee at big international SE conferences (e.g., ICSE, ESEC/FSE, ASE, MSR, ICSME, SANER).
Prior to the MSR submission deadline, all PC members, including the junior reviewers, will receive guidance on review quality, confidentiality, and ethics standards, how to write good reviews, and how to participate in discussions (see ACM reviewers’ responsibilities). Junior reviewers will then serve alongside regular PC members on the main technical track PC, participating fully in the review process, including author responses and PC discussions to reach consensus. In addition, Junior PC members will receive feedback on how to improve their reviews throughout the process.
All submissions to the MSR research track will be reviewed jointly by both regular and junior PC members, as part of the same process. We expect that each paper will receive three reviews from regular PC members and two additional reviews from Junior PC members. The final decisions will be made by consensus among all reviewers, as always.
Based on our experience with the MSR Shadow PC in 2021 and 2022, we expect that the addition of junior reviewers to each paper will increase the overall quality of reviews the authors receive, since junior reviewers will typically have a deep understanding of recent topics, and can thus provide deep technical feedback on the subject.
A list of Junior PC members is available on the Junior PC webpage.
Submission Process
Submissions must conform to the IEEE conference proceedings template, specified in the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf options).
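As an illustration of the formatting requirement above, a minimal LaTeX skeleton might look like the following. This is a sketch only, not an official MSR or IEEE template: the title, the anonymized author block, the section names, and all body text are placeholder assumptions. Only the \documentclass line is taken from the requirement itself.

```latex
% Illustrative skeleton only; not an official MSR or IEEE template.
% The document class line follows the stated requirement:
% 10pt, conference mode, without the compsoc or compsocconf options.
\documentclass[10pt,conference]{IEEEtran}

\begin{document}

% Placeholder title (assumption for illustration).
\title{Placeholder Title of the Submission}

% Placeholder author block: the call requires a double-anonymous
% submission, so no real names or affiliations appear here.
\author{\IEEEauthorblockN{Anonymous Author(s)}}

\maketitle

\begin{abstract}
Placeholder abstract.
\end{abstract}

\section{Introduction}
Placeholder text.

% Hypothetical section name for the supporting statement on data
% availability that the call asks for; the call does not fix a heading.
\section*{Data Availability}
Placeholder statement.

\end{document}
```

The IEEEtran class sets the title and body font sizes itself in conference mode, so no manual spacing or font adjustments should be needed in the preamble.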
Submissions to the Research Track can be made via the submission site by the submission deadline. However, we encourage authors to submit at least the paper abstract and author details well in advance of the deadline, to leave enough time to properly enter conflicts of interest for anonymous reviewing. All submissions must adhere to the following requirements:
- Submissions must not exceed the page limit (10 pages plus 2 additional pages of references for full papers; 4 pages plus 1 additional page of references for short papers). The page limit is strict, and it will not be possible to purchase additional pages at any point in the process (including after acceptance).
- Submissions must strictly conform to the IEEE Conference Proceedings Formatting Guidelines. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.
- Submissions must not reveal the authors’ identities. The authors must make every effort to honor the double-anonymous review process. In particular:
  - Authors’ names must be omitted from the submission and accompanying artifacts (e.g., replication package).
  - All references to the authors’ prior work should be made in the third person.
  - While authors have the right to upload preprints on arXiv or similar sites, they should avoid specifying that the manuscript was submitted to MSR 2023 to reduce the risk of accidental de-anonymization (the reviewers will also be instructed not to search for the paper titles on the Web).
  - During review, authors should not publicly use the submission title on social media or otherwise.
See also the Q&A page for ICSE 2023 for further advice, guidance, and explanation about the double-anonymous review process.
Submissions should also include a supporting statement on the data availability, per the Open Science policy below.
Any submission that does not comply with these requirements is likely to be desk rejected by the PC Chairs without further review. In addition, by submitting to the MSR Research Track, the authors acknowledge that they are aware of and agree to be bound by the following policies:
- The ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. In particular, papers submitted to MSR 2023 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2023. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases (including immediate rejection and reporting of the incident to ACM/IEEE). To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.
- The authorship policy of the ACM and the authorship policy of the IEEE.
Authors will have a chance to see the reviews and respond to reviewer comments before any decision about the submission is made.
Upon notification of acceptance, all authors of accepted papers will be asked to fill out a copyright form and will receive further instructions for preparing the camera-ready version of their papers. At least one author of each paper is expected to register and present the paper at the MSR 2023 conference. All accepted contributions will be published in the electronic proceedings of the conference.
A selection of the best papers will be invited to an Empirical Software Engineering (EMSE) Special Issue. The authors of accepted papers that show outstanding contributions to the FOSS community will have a chance to self-nominate their paper for the MSR FOSS Impact Award.
Open Science Policy
Openness in science is key to fostering progress via transparency and availability of all outputs produced at each investigative step. Transparency and availability of research outputs allow better reproducibility and replicability of empirical studies and empirical evaluations. Open science builds the core for excellence in evidence-based research.
The MSR conference actively supports the adoption of open data and open source principles. Indeed, we consider replicability as an explicit evaluation criterion. We expect all contributing authors to disclose the (anonymized and curated) data to increase reproducibility, replicability, and/or recoverability of the studies, provided that there are no ethical, legal, technical, economic, or sensible barriers preventing the disclosure. Please provide a supporting statement on the data availability in your submitted papers, including an argument for why (some of) the data cannot be made available, if that is the case.
Specifically, we expect all contributing authors to disclose:
- the source code of relevant software used or proposed in the paper, including that used to retrieve and analyze data.
- the data used in the paper (e.g., evaluation data, anonymized survey data, etc.)
- instructions for other researchers describing how to reproduce or replicate the results.
Artifacts should be shared as open data and open source as follows:
- Artifacts should be archived on preserved digital repositories such as zenodo.org, figshare.com, www.softwareheritage.org, osf.io, or institutional repositories. GitHub, GitLab, and similar services for version control systems do not offer properly archived and preserved data. Personal or institutional websites, consumer cloud storage such as Dropbox, or services such as Academia.edu and Researchgate.net do not provide properly archived and preserved data.
- Data should be released under a recognized open data license such as the CC0 dedication or the CC-BY 4.0 license.
- Software should be released under an open source license.
- Different open licenses, if mandated by institutions or regulations, are also permitted.
We encourage authors to make artifacts available upon submission (either privately or publicly) and upon acceptance (publicly).
We recognize that anonymizing artifacts such as source code is more difficult than preserving anonymity in a paper. We ask authors to make a best effort not to reveal their identities. We will also ask reviewers to avoid trying to identify authors by looking at commit histories and other such information that is not easily anonymized. Authors wanting to share GitHub repositories may also look into using https://anonymous.4open.science/, which is an open source tool that helps you to quickly double-anonymize your repository.
For additional information on creating open artifacts and open access pre- and post-prints, please see this ICSE 2023 page.
Submission Link
Papers must be submitted through HotCRP: https://msr2023-technical.hotcrp.com
Important Dates
- Abstract Deadline: January 16, 2023 AoE
- Paper Deadline: January 19, 2023 AoE
- Author Response Period: February 22 – 24, 2023 AoE
- Author Notification: March 7, 2023 AoE
- Camera Ready Deadline: March 16, 2023 AoE
Accepted Papers and Attendance Expectation
Accepted papers will be permitted an additional page of content to allow authors to incorporate review feedback. Therefore, the page limit for published papers will be 11 pages for full papers (or 5 pages for short papers), plus 2 pages that may contain only references.
- The official publication date is the date the proceedings are made available in the ACM or IEEE Digital Libraries. This date may be up to two weeks prior to the first day of the ICSE 2023 conference week. The official publication date affects the deadline for any patent filings related to published work.
- Purchases of additional pages in the proceedings are not allowed.
After acceptance, the list of paper authors cannot be changed under any circumstances, and the list of authors on camera-ready papers must be identical to that on submitted papers. After acceptance, paper titles cannot be changed except with the permission of the Program Co-Chairs, and only when referees have recommended a change for clarity or accuracy with respect to the paper content.
If a submission is accepted, at least one author of the paper is required to register for MSR 2023 and present the paper. [We will add more info on this as soon as the MSR 2023 format is finalized.]