Improving API Knowledge Discovery with ML: A Case Study of Comparable API Methods
Fri 19 May 2023 11:30 - 11:45 at Meeting Room 103 - Program comprehension Chair(s): Oscar Chaparro
Developers constantly learn new APIs, but often lack necessary information from documentation, resorting instead to popular question-and-answer platforms such as Stack Overflow. In this paper, we investigate how to use recent machine-learning-based knowledge extraction techniques to automatically identify pairs of comparable API methods and the sentences describing the comparison from Stack Overflow answers. We first built a prototype that can be stocked with a dataset of comparable API methods and provides tool-tips to users in search results and in API documentation. We conducted a user study with this tool based on a dataset of TensorFlow comparable API methods spanning 198 hand-annotated facts from Stack Overflow posts. This study confirmed that providing comparable API methods is useful in API learning: developers using our tool were significantly more aware of the comparable API methods and better understood the differences between them. We then created SOREL, a comparable API methods knowledge extraction tool trained on our hand-annotated corpus, which achieves 71% precision and 55% recall at discovering our manually extracted facts and discovers 433 pairs of comparable API methods from thousands of unseen SO posts. This work highlights the merit of jointly studying programming assistance tools and constructing machine learning techniques to power them.
Fri 19 May (displayed time zone: Hobart)
11:00 - 12:30 | Program comprehension (Technical Track / Journal-First Papers) at Meeting Room 103. Chair(s): Oscar Chaparro (College of William and Mary)
11:00 | 15m | Talk | Code Comprehension Confounders: A Study of Intelligence and Personality | Journal-First Papers
11:15 | 15m | Talk | Identifying Key Classes for Initial Software Comprehension: Can We Do It Better? | Technical Track | Weifeng Pan (Zhejiang Gongshang University, China), Xin Du (Zhejiang Gongshang University, China), Hua Ming (Oakland University), Dae-Kyoo Kim (Oakland University), Zijiang Yang (Xi'an Jiaotong University and GuardStrike Inc)
11:30 | 15m | Talk | Improving API Knowledge Discovery with ML: A Case Study of Comparable API Methods | Technical Track | Daye Nam (Carnegie Mellon University), Brad A. Myers (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University), Vincent J. Hellendoorn (Carnegie Mellon University)
11:45 | 15m | Talk | Evidence Profiles for Validity Threats in Program Comprehension Experiments | Technical Track | Marvin Muñoz Barón (University of Stuttgart), Marvin Wyrich (Saarland University), Daniel Graziotin (University of Stuttgart), Stefan Wagner (University of Stuttgart)
12:00 | 15m | Talk | Developers’ Visuo-spatial Mental Model and Program Comprehension | Technical Track
12:15 | 15m | Talk | Two Sides of the Same Coin: Exploiting the Impact of Identifiers in Neural Code Comprehension | Technical Track | Shuzheng Gao (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Chaozheng Wang (Harbin Institute of Technology), Jun Sun (Singapore Management University), David Lo (Singapore Management University), Yue Yu (College of Computer, National University of Defense Technology, Changsha 410073, China)