Software Entity Recognition with Noise-Robust Learning (Recorded talk)
Recognizing software entities such as library names from free-form text is essential to enable many software engineering (SE) technologies, such as traceability link recovery, automated documentation, and API recommendation. While many approaches have been proposed to address this problem, they suffer from small entity vocabularies or noisy training data, hindering their ability to recognize software entities mentioned in sophisticated narratives. To address this challenge, we leverage the Wikipedia taxonomy to develop a comprehensive entity lexicon with 79K unique software entities in 12 fine-grained types, as well as a large labeled dataset of over 1.7M sentences. Then, we propose self-regularization, a noise-robust learning approach, for training our software entity recognition (SER) model by accounting for many dropouts. Results show that models trained with self-regularization outperform both their vanilla counterparts and state-of-the-art approaches on our Wikipedia benchmark and two Stack Overflow benchmarks. We release our models, data, and code for future research.
Slides: Software Entity Recognition with Noise-Robust Learning - Talk@ASE 2023.pptx (2.54 MiB)
Tue 12 Sep. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
15:30 - 17:00 | Testing Tools and Techniques | NIER Track / Research Papers / Tool Demonstrations, at Room E | Chair(s): Tim Menzies (North Carolina State University)
15:30 | 12m Talk | Modeling Programmer Attention as Scanpath Prediction | NIER Track | Aakash Bansal (University of Notre Dame), Chia-Yi Su (University of Notre Dame), Zachary Karas (Vanderbilt University), Yifan Zhang (Vanderbilt University), Yu Huang (Vanderbilt University), Toby Jia-Jun Li (University of Notre Dame), Collin McMillan (University of Notre Dame)
15:42 | 12m Talk | On Automated Assistants for Software Development: The Role of LLMs | NIER Track | Pre-print, File Attached
15:54 | 12m Talk | SmartBugs 2.0: An Execution Framework for Weakness Detection in Ethereum Smart Contracts | Tool Demonstrations | Monika di Angelo (TU Wien), Thomas Durieux (TU Delft), João F. Ferreira (INESC-ID and IST, University of Lisbon), Gernot Salzer (TU Wien) | Pre-print, File Attached
16:06 | 12m Talk | AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection | Research Papers | Yintong Huo (The Chinese University of Hong Kong), Yichen LI (The Chinese University of Hong Kong), Yuxin Su (Sun Yat-sen University), Pinjia He (Chinese University of Hong Kong, Shenzhen), Zifan Xie (Huazhong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong) | Pre-print
16:18 | 12m Talk | Aster: Automatic Speech Recognition System Accessibility Testing for Stutterers | Research Papers | Yi Liu (Nanyang Technological University), Yuekang Li (University of New South Wales), Gelei Deng (Nanyang Technological University), Felix Juefei-Xu (Meta AI), Yao Du (University of California, Irvine), Cen Zhang (Nanyang Technological University), Chengwei Liu (Nanyang Technological University), Yeting Li (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Lei Ma (University of Alberta), Yang Liu (Nanyang Technological University)
16:30 | 12m Talk | Software Entity Recognition with Noise-Robust Learning (Recorded talk) | Research Papers | Tai Nguyen (University of Pennsylvania), Yifeng Di (Purdue University), Joohan Lee (University of Southern California), Muhao Chen (University of Southern California), Tianyi Zhang (Purdue University) | Pre-print, Media Attached, File Attached