PyAnalyzer: An Effective and Practical Approach for Dependency Extraction from Python Code
Dependency extraction based on static analysis lays the groundwork for a wide range of applications. However, dynamic language features in Python make code behavior obscure and nondeterministic, which poses major challenges for static analyses that resolve symbol-level dependencies. Although many techniques and tools are available, they still lack sufficient capabilities to handle object changes, first-class citizens, varying call sites, and library dependencies. To address this fundamental difficulty of dynamic languages, this work proposes PyAnalyzer, an effective and practical method for dependency extraction. PyAnalyzer uniformly models functions, classes, and modules as first-class heap objects and propagates the dynamic changes of these objects and class inheritance. This approach better simulates dynamic features such as duck typing, object changes, and first-class citizens, yielding high recall without compromising precision. Moreover, PyAnalyzer leverages optional type annotations as a shortcut to express varying call sites and resolve library dependencies on demand. We collected two micro-benchmarks (278 small programs), two macro-benchmarks (59 real-world applications), and 191 real-world projects (10 MSLOC) for comprehensive comparisons with 7 advanced techniques (i.e., Understand, Sourcetrail, Depends, ENRE, PySonar2, PyCG, and Type4Py). The results demonstrate that PyAnalyzer achieves high recall and hence improves F1 by 24.7% on average, while running at least 1.4x faster without compromising memory efficiency. Our work will benefit diverse client applications.
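To make the challenge concrete, the following is a minimal sketch of the dynamic features the abstract names (duck typing, object changes, and first-class functions) together with a purely syntactic call extractor that misses the resulting dependencies. It is not PyAnalyzer's implementation; the sample program and the names `naive_call_names`, `Duck`, `Robot`, `shout`, and `make_noise` are hypothetical and chosen only for illustration.

```python
import ast

# A small sample program (hypothetical) that uses dynamic features.
SOURCE = '''
class Duck:
    def speak(self):
        return "quack"

class Robot:
    def speak(self):
        return "beep"

def shout(self):
    return "QUACK"

def make_noise(animal):
    # Duck typing: the callee depends on the runtime type of `animal`.
    return animal.speak()

def run():
    handler = make_noise      # first-class function stored in a variable
    first = handler(Duck())   # indirect call through the variable
    Duck.speak = shout        # object change: the class is patched at run time
    second = handler(Duck())  # now reaches shout() instead of the original method
    third = handler(Robot())
    return first, second, third
'''


def naive_call_names(source):
    """Collect callee names by syntax only, with no object or type tracking."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                names.add(func.id)      # e.g. handler(...)
            elif isinstance(func, ast.Attribute):
                names.add(func.attr)    # e.g. animal.speak()
    return names


# Run the sample program to observe its actual behavior.
namespace = {}
exec(SOURCE, namespace)
runtime_result = namespace["run"]()

# Extract callee names from the same source by syntax alone.
syntactic_callees = naive_call_names(SOURCE)
```

In this sketch, the syntactic extractor sees the names `handler` and `speak` but never `make_noise` or `shout`, even though both are executed at run time. Recovering such dependencies requires tracking what `handler` and `Duck.speak` refer to as objects, which is the kind of gap the object-based modeling described above is meant to close.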
Thu 18 Apr (displayed time zone: Lisbon)
11:00 - 12:30 | Analysis and Debugging 2 | New Ideas and Emerging Results / Research Track | at Luis de Freitas Branco | Chair(s): Pedro Diniz
11:00 | 15m | Talk | Trace-based Multi-Dimensional Root Cause Localization of Performance Issues in Microservice Systems | Research Track | Chenxi Zhang Fudan University, Zhen Dong Fudan University, China, Xin Peng Fudan University, Bicheng Zhang Fudan University, Miao Chen Fudan University
11:15 | 15m | Talk | ReClues: Representing and indexing failures in parallel debugging with program variables | Research Track | Yi Song School of Computer Science, Wuhan University, Xihao Zhang School of Computer Science, Wuhan University, Xiaoyuan Xie School of Computer Science, Wuhan University, China, Quanming Liu School of Computer Science, Wuhan University, Ruizhi Gao Sonos Inc., Chenliang Xing School of Computer Science, Wuhan University
11:30 | 15m | Talk | PyAnalyzer: An Effective and Practical Approach for Dependency Extraction from Python Code | Research Track | Wuxia Jin Xi'an Jiaotong University, Shuo Xu Xi'an Jiaotong University, Dawei Chen Xi'an Jiaotong University, Jiajun He Xi'an Jiaotong University, Dinghong Zhong Xi'an Jiaotong University, Ming Fan Xi'an Jiaotong University, Hongxu Chen Huawei Technologies Co., Ltd., Huijia Zhang Huawei Technologies Co Ltd, Ting Liu Xi'an Jiaotong University
11:45 | 15m | Talk | Detecting Automatic Software Plagiarism via Token Sequence Normalization | Research Track | Timur Sağlam Karlsruhe Institute of Technology (KIT), Moritz Brödel Karlsruhe Institute of Technology (KIT), Larissa Schmid Karlsruhe Institute of Technology, Sebastian Hahner Karlsruhe Institute of Technology (KIT)
12:00 | 15m | Talk | NuzzleBug: Debugging Block-Based Programs in Scratch | Research Track
12:15 | 7m | Talk | Locating Buggy Segments in Quantum Program Debugging | New Ideas and Emerging Results
12:22 | 7m | Talk | Beyond a Joke: Dead Code Elimination Can Delete Live Code | New Ideas and Emerging Results | Haoxin Tu Singapore Management University, Singapore, Lingxiao Jiang Singapore Management University, Debin Gao Singapore Management University, He Jiang Dalian University of Technology