MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022
Thu 19 May 2022 04:04 - 04:08 at MSR Main room - even hours - Session 9: Scaling & Cloud Chair(s): Lwin Khin Shar

Software projects under version control grow with each commit, accumulating up to hundreds of thousands of commits per repository. Especially for such large projects, traversing a repository and extracting data for static source code analysis pose a trade-off between granularity and speed.

We showcase pyrepositoryminer, a command-line tool that combines a set of optimization approaches for efficient traversal of and data extraction from git repositories while remaining adaptable to third-party and custom software metrics and data extractions. The tool is written in Python and combines bare repository access, in-memory storage, parallelization, caching, change-based analysis, and optimized communication between the traversal and custom data extraction components. It supports both metrics written in Python and external programs for data extraction. A single-threaded performance evaluation based on a basic mining use case shows a mean speedup of 15.6x over other freely available tools across four mid-sized open source projects. Multi-threaded execution distributes the load among cores and thus reaches a mean speedup of up to 86.9x using 12 threads.
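To make two of the optimizations named in the abstract more concrete, here is a minimal sketch of change-based analysis with caching, run over a pool of workers. It does not use pyrepositoryminer or its API. The toy history (`BLOBS`, `COMMITS`), the metric `line_count`, and the helpers `analyze_commit` and `mine` are all hypothetical names introduced for illustration. The idea shown is that a metric keyed by blob id is computed once and reused for every commit in which that file is unchanged.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Hypothetical in-memory stand-in for a repository: blob id -> file content.
BLOBS = {
    "b1": "print('hello')\n",
    "b2": "import os\n\nprint(os.getcwd())\n",
    "b3": "x = 1\ny = 2\n",
}

# Hypothetical history: commit id -> tree snapshot {path: blob id}.
# main.py is unchanged between c2 and c3, so its blob id stays "b2".
COMMITS = {
    "c1": {"main.py": "b1"},
    "c2": {"main.py": "b2"},
    "c3": {"main.py": "b2", "util.py": "b3"},
}


@lru_cache(maxsize=None)
def line_count(blob_id: str) -> int:
    """Toy metric, cached per blob id so unchanged files are not re-analyzed.

    Under concurrency a blob may occasionally be computed more than once
    before its result is cached; the cached value is the same either way.
    """
    return len(BLOBS[blob_id].splitlines())


def analyze_commit(commit_id: str) -> tuple[str, dict[str, int]]:
    """Compute the metric for every file in one commit's snapshot."""
    tree = COMMITS[commit_id]
    return commit_id, {path: line_count(blob) for path, blob in tree.items()}


def mine(commit_ids: list[str], workers: int = 4) -> dict[str, dict[str, int]]:
    """Distribute commits across a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(analyze_commit, commit_ids))


results = mine(list(COMMITS))
for commit_id, metrics in results.items():
    print(commit_id, metrics)
```

A thread pool is used here only to keep the sketch short. For CPU-bound metrics in CPython, a process pool would be the more usual choice, and the cache would then need to be shared or kept per worker.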

Thu 19 May

Displayed time zone: Eastern Time (US & Canada)

04:00 - 04:50
Session 9: Scaling & Cloud
Industry Track / Registered Reports / Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Lwin Khin Shar Singapore Management University
04:00
4m
Talk
SniP: An Efficient Stack Tracing Framework for Multi-threaded Programs
Data and Tool Showcase Track
Arun KP Indian Institute of Technology Kanpur, Saurabh Kumar Indian Institute of Technology Kanpur, Debadatta Mishra, Biswabandan Panda Indian Institute of Technology Bombay
04:04
4m
Talk
Tooling for Time- and Space-efficient git Repository Mining
Data and Tool Showcase Track
Fabian Heseding Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Willy Scheibel Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Jürgen Döllner Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam
04:08
4m
Talk
TSSB-3M: Mining single statement bugs at massive scale
Data and Tool Showcase Track
Cedric Richter Carl von Ossietzky Universität Oldenburg / University of Oldenburg, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
04:12
7m
Talk
Improved Business Outcomes from Cloud Applications – using Integrated Process and Runtime Product Data Mining
Industry Track
Mahesh Venkataraman Accenture, Reuben George Accenture, Jeff Wilkinson Accenture
04:19
7m
Talk
Improve Quality of Cloud Serverless Architectures through Software Repository Mining
Industry Track
04:26
4m
Talk
Toward Granular Automatic Unit Test Case Generation
Registered Reports
Fabiano Pecorelli Tampere University, Giovanni Grano LocalStack, Fabio Palomba University of Salerno, Harald C. Gall University of Zurich, Andrea De Lucia University of Salerno
04:30
20m
Live Q&A
Discussions and Q&A
Technical Papers


Information for Participants
Thu 19 May 2022 04:00 - 04:50 at MSR Main room - even hours - Session 9: Scaling & Cloud Chair(s): Lwin Khin Shar