ARCLIN: Automated API Mention Resolution for Unformatted Texts
Thu 12 May 2022 13:25 - 13:30 at ICSE room 2-odd hours - Tools and Environments 6 Chair(s): Domenico Bianculli
Fri 27 May 2022 09:40 - 09:45 at Room 306+307 - Papers 18: Recommender Systems, tools and environments Chair(s): Christian Bird
Online technical forums (e.g., StackOverflow) are popular platforms where developers discuss technical problems such as how to use a specific Application Programming Interface (API), how to solve a programming task, or how to fix bugs in their code. These discussions often provide auxiliary knowledge about how to use software that is not covered by the official documentation. Automatically extracting such knowledge would support downstream tasks such as API search and indexing. However, unlike official documentation written by experts, discussions in open forums are written by regular developers in short, informal text that includes spelling errors and abbreviations. There are three major challenges in accurately recognizing API mentions in unstructured natural language documents and linking them to entries in an API repository: (1) distinguishing API mentions from common words; (2) identifying API mentions that lack a fully qualified name; and (3) disambiguating API mentions that share similar method names but belong to different libraries.
In this paper, to tackle these challenges, we propose ARCLIN, a tool that effectively distinguishes and links APIs without using human annotations. Specifically, we first design an API recognizer that automatically extracts API mentions from natural language sentences using a Conditional Random Field (CRF) on top of a Bi-directional Long Short-Term Memory (Bi-LSTM) module. We then apply a context-aware scoring mechanism to compute the mention-entry similarity for each entry in an API repository. Compared to previous approaches based on heuristic rules, our tool, which requires no manual inspection, outperforms them by 8% on Py-mention, a high-quality dataset containing 558 mentions and 2,830 sentences from five popular Python libraries. To the best of our knowledge, ARCLIN is the first approach to achieve full automation of API mention resolution from unformatted text without manually collected labels.
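To make the linking step concrete, the following is a minimal sketch of what a context-aware mention-entry score could look like. It is not ARCLIN's actual scoring mechanism, which the abstract does not specify: the repository contents, the function names (`name_similarity`, `context_similarity`, `link_mention`), the weight `alpha`, and the choice of character-level similarity plus Jaccard token overlap are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical mini-repository: fully qualified name -> one-line description.
# Entries and descriptions are illustrative, not taken from the paper.
API_REPOSITORY = {
    "numpy.concatenate": "join a sequence of arrays along an existing axis",
    "pandas.concat": "concatenate pandas objects such as dataframe or series along an axis",
    "torch.cat": "concatenate a sequence of tensors in the given dimension",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def name_similarity(mention, fq_name):
    """Best character-level similarity between the mention and either the
    simple name (last component) or the fully qualified name of the entry."""
    simple_name = fq_name.rsplit(".", 1)[-1]
    return max(
        SequenceMatcher(None, mention.lower(), candidate.lower()).ratio()
        for candidate in (simple_name, fq_name)
    )


def context_similarity(sentence, fq_name, description):
    """Jaccard overlap between the sentence tokens and the entry tokens
    (name components plus description words)."""
    sentence_tokens = tokenize(sentence)
    entry_tokens = tokenize(description) | tokenize(fq_name)
    union = sentence_tokens | entry_tokens
    return len(sentence_tokens & entry_tokens) / len(union) if union else 0.0


def link_mention(mention, sentence, repository=API_REPOSITORY, alpha=0.5):
    """Score every repository entry for one mention and return
    (fully qualified name, score) pairs, best match first."""
    scored = [
        (
            fq_name,
            alpha * name_similarity(mention, fq_name)
            + (1 - alpha) * context_similarity(sentence, fq_name, description),
        )
        for fq_name, description in repository.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sentence = "I used concat to merge two dataframe objects in pandas"
    for fq_name, score in link_mention("concat", sentence):
        print(f"{fq_name}: {score:.3f}")
```

In this toy setup the mention `concat` in a sentence about pandas dataframes ranks `pandas.concat` first, because the surrounding words overlap with that entry even though the other two entries have similar method names. A learned scorer would replace both hand-written similarity functions.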
Wed 11 May. Displayed time zone: Eastern Time (US & Canada)
05:00 - 06:00 | Tools and Configurations | Technical Track / SEIP - Software Engineering in Practice | at ICSE room 3-odd hours | Chair(s): Robert Chatley (Imperial College London)
05:00 | 5m Talk | Challenges in Applying Continuous Experimentation: A Practitioners’ Perspective | SEIP - Software Engineering in Practice | Kevin Anderson (Delft University of Technology), Denise Visser (bol.com), Jan-Willem Mannen (ING), Yuxiang Jiang (Delft University of Technology), Arie van Deursen (Delft University of Technology, Netherlands)
05:05 | 5m Talk | Reflekt: a Library for Compile-Time Reflection in Kotlin | SEIP - Software Engineering in Practice | Anastasiia Birillo (JetBrains Research), Elena Lyulina (JetBrains Research), Maria Malysheva (JetBrains Research; Saint Petersburg State University), Vladislav Tankov (JetBrains, ITMO University), Timofey Bryksin (JetBrains Research; HSE University)
05:10 | 5m Talk | ARCLIN: Automated API Mention Resolution for Unformatted Texts | Technical Track | Yintong Huo (The Chinese University of Hong Kong), Yuxin Su (Sun Yat-sen University), Hongming Zhang (The Hong Kong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong)
05:15 | 5m Talk | On the Benefits and Limits of Incremental Build of Software Configurations: An Exploratory Study | Technical Track | Georges Aaron RANDRIANAINA (Université de Rennes 1, IRISA), Xhevahire Tërnava (Université de Rennes 1, INRIA/IRISA), Djamel Eddine Khelladi (CNRS, France), Mathieu Acher (Univ. Rennes 1, Inria, IRISA, Institut Universitaire de France (IUF))
05:20 | 5m Talk | Causality in Configurable Software Systems | Technical Track | Clemens Dubslaff (TU Dresden), Kallistos Weis (Saarland University), Christel Baier (TU Dresden, Germany), Sven Apel (Saarland University)
05:25 | 5m Talk | A Scalable t-wise Coverage Estimator | Technical Track | Eduard Baranov (Université Catholique de Louvain, Belgium), Sourav Chakraborty (Indian Statistical Institute (ISI), Kolkata, India), Axel Legay (Université Catholique de Louvain, Belgium), Kuldeep S. Meel (National University of Singapore), N. V. Vinodchandran (University of Nebraska-Lincoln)
Thu 12 May. Displayed time zone: Eastern Time (US & Canada)
Fri 27 May. Displayed time zone: Eastern Time (US & Canada)
09:00 - 10:30 | Papers 18: Recommender Systems, tools and environments | Technical Track / Journal-First Papers / NIER - New Ideas and Emerging Results / SEIP - Software Engineering in Practice | at Room 306+307 | Chair(s): Christian Bird (Microsoft Research)
09:00 | 5m Talk | Predicting the Objective and Priority of Issue Reports in Software Repositories | Journal-First Papers | Maliheh Izadi (Sharif University of Technology), Kiana Akbari (Sharif University of Technology), Abbas Heydarnoori (Sharif University of Technology)
09:05 | 5m Talk | Using Deep Learning to Generate Complete Log Statements | Technical Track | Antonio Mastropaolo (Università della Svizzera italiana), Luca Pascarella (Università della Svizzera italiana (USI)), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
09:10 | 5m Talk | Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning | NIER - New Ideas and Emerging Results | Martin Weyssow (DIRO, Université de Montréal), Houari Sahraoui (Université de Montréal), Bang Liu (DIRO & Mila, Université de Montréal)
09:15 | 5m Talk | "Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint | SEIP - Software Engineering in Practice | Bart van Oort (Delft University of Technology), Luís Cruz (Delft University of Technology), Babak Loni (ING Bank N.V.), Arie van Deursen (Delft University of Technology, Netherlands)
09:20 | 5m Talk | Discovering Repetitive Code Changes in Python ML Systems | Technical Track | Malinda Dilhara (University of Colorado Boulder, USA), Ameya Ketkar (Oregon State University, USA), Nikhith Sannidhi (University of Colorado Boulder), Danny Dig (University of Colorado Boulder, USA)
09:25 | 5m Talk | FlakiMe: Laboratory-Controlled Test Flakiness Impact Assessment | Technical Track | Maxime Cordy (University of Luxembourg, Luxembourg), Renaud Rwemalika (University of Luxembourg), Adriano Franci (University of Luxembourg), Mike Papadakis (University of Luxembourg, Luxembourg), Mark Harman (University College London)
09:30 | 5m Talk | Semantic Image Fuzzing of AI Perception Systems | Technical Track | Trey Woodlief (University of Virginia), Sebastian Elbaum (University of Virginia), Kevin Sullivan (University of Virginia)
09:35 | 5m Talk | Understanding and improving artifact sharing in software engineering research | Journal-First Papers | Christopher Steven Timperley (Carnegie Mellon University), Lauren Herckis (Carnegie Mellon University), Claire Le Goues (Carnegie Mellon University), Michael Hilton (Carnegie Mellon University, USA)
09:40 | 5m Talk | ARCLIN: Automated API Mention Resolution for Unformatted Texts | Technical Track | Yintong Huo (The Chinese University of Hong Kong), Yuxin Su (Sun Yat-sen University), Hongming Zhang (The Hong Kong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong)