MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022

Stack Overflow (SO) is becoming an indispensable part of the modern software development workflow. However, navigating SO posts and comparing different solutions is time-consuming and cumbersome given the limited time, attention, and memory capacity of programmers. Recent research has proposed summarizing SO posts into concise text to help programmers quickly judge the relevance and quality of SO posts. Yet there is no large, comprehensive dataset of high-quality SO post summaries, which hinders the development and evaluation of post summarization techniques. We present SOSum, a dataset of 2,278 popular SO posts with manually labeled summative sentences. Questions in SOSum cover 669 tags, with a median view count of 253K and a median post score of 17. This dataset will foster research on sentence-level summarization of SO posts and has the potential to facilitate text summarization research on other types of textual software artifacts, such as programming tutorials.

Wed 18 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 14:50
Session 5: Communication & Domains (Data and Tool Showcase Track / Technical Papers) at MSR Main room - even hours
Chair(s): Masud Rahman Dalhousie University, Mahmoud Alfadel University of Waterloo
14:00
7m
Talk
Painting the Landscape of Automotive Software in GitHub
Technical Papers
Sangeeth Kochanthara Eindhoven University of Technology, Yanja Dajsuren Eindhoven University of Technology, Loek Cleophas Eindhoven University of Technology (TU/e) and Stellenbosch University (SU), Mark van den Brand Eindhoven University of Technology
Pre-print Media Attached
14:07
7m
Full-paper
Mining the Usage of Reactive Programming APIs: A Study on GitHub and Stack Overflow
Technical Papers
Carlos Zimmerle Federal University of Pernambuco, Kiev Gama Federal University of Pernambuco, Fernando Castor Utrecht University & Federal University of Pernambuco, José Murilo Filho Federal University of Pernambuco
DOI Pre-print
14:14
4m
Talk
SoCCMiner: A Source Code-Comments and Comment-Context Miner
Data and Tool Showcase Track
Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu
Pre-print
14:18
4m
Talk
SLNET: A Redistributable Corpus of 3rd-party Simulink Models
Data and Tool Showcase Track
Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington
DOI Pre-print Media Attached
14:22
4m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:26
4m
Talk
Inspect4py: A Knowledge Extraction Framework for Python Code Repositories
Data and Tool Showcase Track
Rosa Filgueira St. Andrews University, Daniel Garijo Universidad Politécnica de Madrid
14:30
4m
Talk
DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research
Data and Tool Showcase Track
Keerthana Muthu Subash Carleton University, Canada, Lakshmi Prasanna Kumar Carleton University, Canada, Sri Lakshmi Vadlamani Carleton University, Canada, Preetha Chatterjee Drexel University, USA, Olga Baysal Carleton University
DOI Pre-print Media Attached
14:34
16m
Live Q&A
Discussions and Q&A
Technical Papers

Mon 23 May

Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:00
Blended Technical Session 2 (Machine Learning and Information Retrieval) (Technical Papers / Data and Tool Showcase Track) at Room 315+316
Chair(s): Preetha Chatterjee Drexel University, USA
13:30
15m
Talk
Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
Technical Papers
Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University
Pre-print Media Attached
13:45
15m
Talk
GraphCode2Vec: Generic Code Embedding via Lexical and Program Dependence Analyses
Technical Papers
Wei Ma SnT, University of Luxembourg, Mengjie Zhao LMU Munich, Ezekiel Soremekun SnT, University of Luxembourg, Qiang Hu University of Luxembourg, Jie M. Zhang King's College London, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Xiaofei Xie Singapore Management University, Singapore, Yves Le Traon University of Luxembourg, Luxembourg
Pre-print
14:00
15m
Talk
Senatus: A Fast and Accurate Code-to-Code Recommendation Engine
Technical Papers
Fran Silavong JP Morgan Chase & Co., Sean Moran JP Morgan Chase & Co., Antonios Georgiadis JP Morgan Chase & Co., Rohan Saphal JP Morgan Chase & Co., Robert Otter JP Morgan Chase & Co.
DOI Pre-print Media Attached
14:15
8m
Short-paper
Comments on Comments: Where Code Review and Documentation Meet
Technical Papers
Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University
DOI Pre-print File Attached
14:23
8m
Short-paper
On the Naturalness of Fuzzer Generated Code
Technical Papers
Rajeswari Hita Kambhamettu Carnegie Mellon University, John Billos Wake Forest University, Carolyn "Tomi" Oluwaseun-Apo Pennsylvania State University, Benjamin Gafford Carnegie Mellon University, Rohan Padhye Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
14:31
8m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:39
21m
Live Q&A
Discussions and Q&A
Technical Papers


Information for Participants
Wed 18 May 2022 14:00 - 14:50 at MSR Main room - even hours - Session 5: Communication & Domains Chair(s): Masud Rahman, Mahmoud Alfadel