SANER 2024
Tue 12 - Fri 15 March 2024, Rovaniemi, Finland
Wed 13 Mar 2024 14:00 - 14:15 at LAPPI - API and Dependency Analysis Chair(s): Martin Monperrus

The evolution of Python requires accurate version identification to facilitate compatibility and ongoing support. We extend previous work on deep learning models for Python version identification, where LSTM and CodeBERT achieved 92% accuracy on short code snippets. We extend these results to larger, realistic files, using code segmentation techniques to vary the input granularity from per-line analysis to larger code segments. Our findings show that while LSTM with CodeBERT embeddings maintained high accuracy on short snippets, performance drops significantly on longer segments, where retaining more information has to be balanced against a higher risk of misclassification. Notably, import-statement analysis, despite being the most intuitive indicator of version requirements, reached only 30% accuracy. This exposes the limitations of our approach when encountering rare or user-defined modules. Overall, the findings point to the limits of deep learning for language version identification, and suggest that alternative approaches may be necessary for high accuracy on larger datasets.
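The segmentation granularities mentioned in the abstract (per-line, larger segments, import statements only) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the fixed chunk size, and the use of the standard `ast` module to locate imports are all assumptions made for illustration. Note that `ast.parse` only accepts code that is valid for the interpreter running it, so this sketch would fail on files written for an incompatible Python version.

```python
import ast

# Hypothetical sample input, used only to illustrate the three granularities.
SAMPLE = "import os\nfrom pathlib import Path\n\nprint(Path(os.getcwd()))\n"


def per_line_segments(source):
    """Finest granularity: every non-blank line is its own segment."""
    return [line for line in source.splitlines() if line.strip()]


def fixed_size_segments(source, size):
    """Coarser granularity: consecutive chunks of up to `size` non-blank lines."""
    lines = per_line_segments(source)
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def import_segments(source):
    """Import-only view: source text of each import statement, found via the AST.

    Assumption: the file parses under the running interpreter; otherwise
    ast.parse raises SyntaxError.
    """
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
```

Each function returns a list of strings that could be fed to a classifier one segment at a time; how per-segment predictions are combined into a per-file label is left open here, since the abstract does not specify it.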

Wed 13 Mar

Displayed time zone: Athens

14:00 - 15:30
API and Dependency Analysis - Research Papers / Reproducibility Studies and Negative Results (RENE) Track at LAPPI
Chair(s): Martin Monperrus KTH Royal Institute of Technology
14:00
15m
Talk
The Limits of the Identifiable: Challenges in Python Version Identification with Deep Learning
Reproducibility Studies and Negative Results (RENE) Track
Marcus Gerhold University of Twente, The Netherlands, Lola Solovyeva University of Twente, Vadim Zaytsev University of Twente, Netherlands
Pre-print
14:15
15m
Talk
Exploring Dependencies Among Inconsistencies to Enhance the Consistency Maintenance of Models
Research Papers
Luciano Marchezan Johannes Kepler Universität Linz, Wesley Assunção North Carolina State University, Edvin Herac, Saad Shafiq University of Southern California, Alexander Egyed Johannes Kepler University Linz
14:30
15m
Talk
BUMP: A Benchmark of Reproducible Breaking Dependency Updates
Research Papers
Frank Reyes Garcia KTH Royal Institute of Technology, Yogya Gamage KTH Royal Institute of Technology, Gabriel Skoglund KTH Royal Institute of Technology, Benoit Baudry KTH, Martin Monperrus KTH Royal Institute of Technology
14:45
15m
Talk
APIGen: Generative API Method Recommendation
Research Papers
Yujia Chen Harbin Institute of Technology, Shenzhen, Cuiyun Gao Harbin Institute of Technology, Muyijie Zhu Harbin Institute of Technology, Shenzhen, Qing Liao Harbin Institute of Technology, Yong Wang Anhui Polytechnic University, Guoai Xu Harbin Institute of Technology, Shenzhen
15:00
15m
Talk
A Multi-Metric Ranking with Label Correlations Approach for Library Migration Recommendations
Research Papers
Jiancheng Zhang Southwest Petroleum University, Peng Wu Sichuan Tourism University, Qin Luo Southwest Petroleum University
15:15
15m
Talk
Adaptoring: Adapter Generation to Provide an Alternative API for a Library
Research Papers
Lars Reimann University of Bonn, Günter Kniesel-Wünsche University of Bonn
Pre-print