Software developers often use social media (such as Twitter) to share programming knowledge, including new tools, sample code snippets, and programming tips. One topic they discuss is software libraries, and their tweets may contain useful information about a library. Understanding this information, e.g., developers' views on a library, can help in weighing the pros and cons of using the library and in gauging the general sentiment towards it. However, it is not trivial to distinguish the library sense of a word from its ordinary senses. For example, a tweet mentioning the word pandas may refer to the Python pandas library or to the animal. In this work, we created the first benchmark dataset for this problem and investigated the task of distinguishing whether a tweet actually refers to a programming library or to something else. Pre-trained Transformer models (PTMs) have recently achieved great success in natural language processing and computer vision. We therefore extensively evaluated a broad set of modern PTMs, both general-purpose and domain-specific, on this task of recognizing programming libraries in tweets. Experimental results show that PTMs can outperform the best-performing baseline methods by up to 5%-43% in terms of F1-score in the within-, cross-, and mixed-library settings.
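To make the task framing concrete, the following is a minimal sketch of library recognition as binary classification, scored with F1. It is not the method or any baseline from the paper: the cue-word list, the `predict_is_library` helper, and the four example tweets are all made up for illustration.

```python
from typing import List

# Hypothetical context cues (an assumption for illustration, not from the paper).
PROGRAMMING_CUES = {"python", "import", "dataframe", "code", "library", "pip", "csv", "api"}


def predict_is_library(tweet: str) -> bool:
    """Toy baseline: predict 'library' if any programming cue word appears."""
    tokens = {tok.strip(".,!?:;()#@\"'").lower() for tok in tweet.split()}
    return bool(tokens & PROGRAMMING_CUES)


def f1_score(y_true: List[bool], y_pred: List[bool]) -> float:
    """F1 for the positive ('library') class: harmonic mean of precision and recall."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up tweets with made-up labels (True = refers to the pandas library).
EXAMPLES = [
    ("Loading a CSV with pandas in Python is one line of code", True),
    ("The pandas at the zoo were eating bamboo all day", False),
    ("pandas groupby saved me hours today", True),
    ("Baby pandas learning to climb, so cute", False),
]

labels = [label for _, label in EXAMPLES]
predictions = [predict_is_library(text) for text, _ in EXAMPLES]
score = f1_score(labels, predictions)
print(f"F1 = {score:.3f}")
```

On these toy examples the cue-word rule misses the third tweet, which mentions `groupby` but none of the listed cues. This kind of context dependence is why a fixed keyword list is a weak approach to the ambiguity described above.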
Mon 16 May (displayed time zone: Eastern Time, US & Canada)
20:10 - 20:50 | Session 8: Search and Reuse: Libraries & APIs | Research / Replications and Negative Results (RENE) at ICPC room | Chair(s): Masud Rahman (Dalhousie University)
20:10 | 7m | Talk | On the Effectiveness of Pretrained Models for API Learning | Research | Mohammad Abdul Hadi (University of British Columbia), Imam Nur Bani Yusuf (Singapore Management University), Ferdian Thung (Singapore Management University), Kien Luong (School of Computing and Information Systems, Singapore Management University), Fatemeh Hendijani Fard (University of British Columbia), Lingxiao Jiang (Singapore Management University), David Lo (Singapore Management University)
20:17 | 7m | Talk | Deep API Learning Revisited | Replications and Negative Results (RENE)
20:24 | 7m | Talk | ARSeek: Identifying API Resource using Code and Discussion on Stack Overflow | Research | Kien Luong (School of Computing and Information Systems, Singapore Management University), Mohammad Abdul Hadi (University of British Columbia), Ferdian Thung (Singapore Management University), Fatemeh Hendijani Fard (University of British Columbia), David Lo (Singapore Management University)
20:31 | 7m | Talk | Benchmarking Library Recognition in Tweets | Research | Ting Zhang (Singapore Management University), Divya Prabha Chandrasekaran (Singapore Management University), Ferdian Thung (Singapore Management University), David Lo (Singapore Management University)
20:38 | 12m | Live Q&A | Q&A-Paper Session 8 | Research