Inspect4py: A Knowledge Extraction Framework for Python Code Repositories
This works presents inspect4py, a static code analysis framework designed to automatically extract the main features, metadata and documentation of Python code repositories. Given an input folder with code, inspect4py uses abstract syntax trees and state of the art tools to find all functions, classes, tests, documentation, call graphs, module dependencies and control flows within all code files in that repository. Using these findings, inspect4py infers different ways of invoking a software component. We have evaluated our framework on 95 annotated repositories, obtaining promising results for software type classification (over 95% F1-score). With inspect4py, we aim to ease the understandability and adoption of software repositories by other researchers and developers.
Wed 18 MayDisplayed time zone: Eastern Time (US & Canada) change
14:00 - 14:50 | Session 5: Communication & Domains Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours Chair(s): Masud Rahman Dalhousie University, Mahmoud Alfadel University of Waterloo | ||
14:00 7mTalk | Painting the Landscape of Automotive Software in GitHub Technical Papers Sangeeth Kochanthara Eindhoven University of Technology, Yanja Dajsuren Eindhoven University of Technology, Loek Cleophas Eindhoven University of Technology (TU/e) and Stellenbosch University (SU), Mark van den Brand Eindhoven University of Technology Pre-print Media Attached | ||
14:07 7mFull-paper | Mining the Usage of Reactive Programming APIs: A Study on GitHub and Stack Overflow Technical Papers Carlos Zimmerle Federal University of Pernambuco, Kiev Gama Federal University of Pernambuco, Fernando Castor Utrecht University & Federal University of Pernambuco, José Murilo Filho Federal University of Pernambuco DOI Pre-print | ||
14:14 4mTalk | SoCCMiner: A Source Code-Comments and Comment-Context Miner Data and Tool Showcase Track Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu Pre-print | ||
14:18 4mTalk | SLNET: A Redistributable Corpus of 3rd-party Simulink Models Data and Tool Showcase Track Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington DOI Pre-print Media Attached | ||
14:22 4mTalk | SOSum: A Dataset of Stack Overflow Post Summaries Data and Tool Showcase Track Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University | ||
14:26 4mTalk | Inspect4py: A Knowledge Extraction Framework for Python Code Repositories Data and Tool Showcase Track | ||
14:30 4mTalk | DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research Data and Tool Showcase Track Keerthana Muthu Subash Carleton University, Canada, Lakshmi Prasanna Kumar Carleton University, Canada, Sri Lakshmi Vadlamani Carleton University, Canada, Preetha Chatterjee Drexel University, USA, Olga Baysal Carleton University DOI Pre-print Media Attached | ||
14:34 16mLive Q&A | Discussions and Q&A Technical Papers |