Identifying Multi-Parameter Constraint Errors in Python Data Science Library API Documentations
Modern AI- and data-intensive software systems rely heavily on data science and machine learning libraries that provide essential algorithmic implementations and computational frameworks. These libraries expose complex APIs whose correct usage must follow constraints among multiple interdependent parameters. Developers using these APIs are expected to learn about the constraints from the provided documentation, and any discrepancy between documentation and code may lead to unexpected behavior. However, maintaining correct and consistent multi-parameter constraints in API documentation remains a significant challenge for API compatibility and reliability. To address this challenge, we propose MPChecker to detect inconsistencies between code and documentation, focusing specifically on multi-parameter constraints. MPChecker identifies these constraints at the code level by exploring execution paths through symbolic execution, and extracts the corresponding constraints from documentation using large language models (LLMs). We propose a customized fuzzy constraint logic to reconcile the unpredictability of LLM outputs and to detect logical inconsistencies between the code and documentation constraints. We collected and constructed two datasets from four popular data science libraries and evaluated MPChecker on them. The results demonstrate that MPChecker can effectively detect inconsistency issues with a precision of 92.8%. We further reported 14 detected inconsistency issues to the library developers, who have confirmed 11 of them at the time of writing.
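To make the idea of a multi-parameter constraint inconsistency concrete, here is a minimal Python sketch. It does not reproduce MPChecker: exhaustive enumeration over a tiny domain stands in for symbolic execution, and a hand-written predicate stands in for the constraints an LLM would extract from documentation. The parameter names `solver` and `penalty`, their values, and both predicates are hypothetical and belong to no real library.

```python
from itertools import product

# Hypothetical parameter domains for an illustrative API (not a real library).
SOLVERS = ["fast", "exact"]
PENALTIES = ["l1", "l2", None]


def code_accepts(solver, penalty):
    """Constraint as the hypothetical implementation enforces it."""
    # The code rejects only the combination solver='fast' with penalty='l1'.
    return not (solver == "fast" and penalty == "l1")


def doc_accepts(solver, penalty):
    """Constraint as the hypothetical documentation states it."""
    # The docs claim that solver='fast' supports only penalty='l2'.
    return solver != "fast" or penalty == "l2"


def find_inconsistencies():
    """Return every parameter combination on which code and docs disagree."""
    return [
        (solver, penalty)
        for solver, penalty in product(SOLVERS, PENALTIES)
        if code_accepts(solver, penalty) != doc_accepts(solver, penalty)
    ]


if __name__ == "__main__":
    for combo in find_inconsistencies():
        print("code/doc mismatch:", combo)
```

In this toy setting the code accepts `solver='fast'` with `penalty=None` while the documentation forbids it, so that combination is the one reported. Enumeration only works because the domains are tiny; real APIs have parameter spaces far too large for it, which is presumably why the abstract describes exploring execution paths symbolically instead.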
Wed 25 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
16:00 - 17:30
16:00 | 25m | Talk | Model Checking Guided Incremental Testing for Distributed Systems | Research Papers | Yu Gao (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Dong Wang (Institute of software, Chinese academy of sciences), Wensheng Dou (Institute of Software Chinese Academy of Sciences), Wenhan Feng (Institute of Software, Chinese Academy of Sciences), Yu Liang (Institute of Software Chinese Academy of Sciences), Yuan Feng (Wuhan Dameng Database Co., Ltd), Jun Wei (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences)
16:25 | 25m | Talk | Identifying Multi-Parameter Constraint Errors in Python Data Science Library API Documentations | Research Papers | Xiufeng Xu (Nanyang Technological University), Fuman Xie (University of Queensland), Chenguang Zhu (Meta AI), Guangdong Bai (University of Queensland), Sarfraz Khurshid (University of Texas at Austin), Yi Li (Nanyang Technological University)
16:50 | 25m | Talk | Freesia: Verifying Correctness of TEE Communication with Concurrent Separation Logic | Research Papers | Fanlang Zeng (Zhejiang University), Rui Chang (Zhejiang University), Hongjian Liu (Zhejiang University, Hangzhou, China)
17:15 | 15m | Demonstration | TBFV4J: An Automated Testing-Based Formal Verification Tool for Java | Tool Demonstrations
Aurora C is the third room in the Aurora wing.
When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.