Restoring the Executability of Jupyter Notebooks by Automatic Upgrade of Deprecated APIs
Data scientists typically practice exploratory programming using computational notebooks to comprehend new data and extract insights. To do this, they iteratively refine their code, actively trying to reuse and repurpose solutions created by other data scientists, in real time. However, recent studies have shown that the vast majority of publicly available notebooks cannot be executed out of the box. One prominent reason is the deprecation of data science APIs used in such notebooks, owing to the rapid evolution of data science libraries. In this work we propose RELANCER, an automatic technique that restores the executability of broken Jupyter Notebooks, in near real time, by upgrading deprecated APIs. RELANCER employs an iterative, runtime-error-driven approach to identify and fix one API issue at a time. This is supported by a machine-learned model that uses the runtime error message to predict the kind of API repair needed: an update to an API or package name, a parameter name, or a parameter value. RELANCER then creates a search space of candidate repairs by combining knowledge from API migration examples on GitHub with the API documentation, and employs a second machine-learned model to rank these candidate mappings. An evaluation of RELANCER on a curated dataset of 255 un-executable Jupyter Notebooks from Kaggle shows that RELANCER can successfully restore the executability of 56% of the subjects, while baselines relying on only GitHub examples or only API documentation can fix just 37% and 36% of the subjects, respectively. Further, in line with its real-time use case, RELANCER can restore the executability of 49% of the subjects within a 5-minute time limit, whereas a baseline lacking its machine-learned models can only fix 24%.
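As a rough illustration of the error-driven repair loop the abstract describes, the Python sketch below repeatedly runs a notebook (exported as a script), and while it fails, applies one candidate deprecated-API fix and retries. Everything in it is hypothetical: the repair table and the function names stand in for RELANCER's learned repair-kind prediction, its mined GitHub/documentation mappings, and its candidate-ranking model, which are not shown here.

```python
import subprocess
from pathlib import Path

# Hypothetical repair table standing in for RELANCER's mined-and-ranked
# candidate mappings (from GitHub migration examples and API docs).
# The three deprecations listed are real, but the table and all function
# names in this sketch are illustrative, not the RELANCER artifact's API.
CANDIDATE_REPAIRS = {
    # Package rename: sklearn.cross_validation was removed in scikit-learn 0.20
    "sklearn.cross_validation": "sklearn.model_selection",
    # API rename: DataFrame.sort() was removed in pandas 0.20
    ".sort(": ".sort_values(",
    # Parameter rename: KFold's n_folds became n_splits in model_selection
    "n_folds=": "n_splits=",
}


def first_runtime_error(script: Path):
    """Run the notebook script and return the last stderr line
    (the runtime error message), or None on success."""
    result = subprocess.run(
        ["python", str(script)], capture_output=True, text=True
    )
    if result.returncode == 0:
        return None
    lines = result.stderr.strip().splitlines()
    return lines[-1] if lines else "unknown error"


def repair(script: Path, budget: int = 20) -> bool:
    """Error-driven loop: apply one candidate API repair, re-run, repeat."""
    for _ in range(budget):
        if first_runtime_error(script) is None:
            return True  # executability restored
        source = script.read_text()
        # RELANCER uses two learned models here (repair-kind prediction and
        # candidate ranking); this sketch just applies the first candidate
        # whose deprecated pattern occurs in the failing code.
        for old, new in CANDIDATE_REPAIRS.items():
            if old in source:
                script.write_text(source.replace(old, new))
                break
        else:
            return False  # no applicable candidate
    return False
```

For example, a notebook importing `sklearn.cross_validation` would first fail with a `ModuleNotFoundError`, be rewritten to import `sklearn.model_selection`, and then be re-executed to surface the next deprecation-related error, if any.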
Tue 16 Nov (displayed time zone: Hobart)
23:00 - 00:00 | Artefacts Plenary (Any Day Band 2), Artifact Evaluation, at Kangaroo. Chair(s): Aldeida Aleti (Monash University), Tim Menzies (North Carolina State University)

| Time  | Duration | Type            | Event                                                                 |
|-------|----------|-----------------|-----------------------------------------------------------------------|
| 23:00 | 5m       | Day opening     | Opening (Artifact Evaluation)                                         |
| 23:05 | 7m       | Keynote         | Keynote (Artifact Evaluation). Dirk Beyer (LMU Munich, Germany)       |
| 23:12 | 3m       | Talk            | CiFi: Versatile Analysis of Class and Field Immutability (Artifact Evaluation). Tobias Roth, Dominik Helm, Michael Reif, Mira Mezini (Technische Universität Darmstadt) |
| 23:15 | 3m       | Talk            | Testing Your Question Answering Software via Asking Recursively (Artifact Evaluation). Songqiang Chen, Shuo Jin, Xiaoyuan Xie (School of Computer Science, Wuhan University) |
| 23:18 | 3m       | Talk            | Restoring the Executability of Jupyter Notebooks by Automatic Upgrade of Deprecated APIs (Artifact Evaluation). Chenguang Zhu (University of Texas at Austin), Ripon Saha (Fujitsu Laboratories of America, Inc.), Mukul Prasad (Fujitsu Research of America), Sarfraz Khurshid (The University of Texas at Austin) |
| 23:21 | 3m       | Talk            | Context Debloating for Object-Sensitive Pointer Analysis (Artifact Evaluation) |
| 23:24 | 3m       | Talk            | Understanding and Detecting Performance Bugs in Markdown Compilers (Artifact Evaluation). Penghui Li, Yinxi Liu, Wei Meng (The Chinese University of Hong Kong) |
| 23:27 | 5m       | Product release | Reuse graphs (Artifact Evaluation)                                    |
| 23:32 | 10m      | Talk            | Most reused artefacts (Artifact Evaluation)                           |
| 23:42 | 18m      | Live Q&A        | Discussion (Artifact Evaluation)                                      |