Negative Results of Image Processing for Identifying Duplicate Questions on Stack Overflow
In the rapidly evolving landscape of developer communities, Q&A platforms serve as crucial resources for crowdsourcing developers' knowledge. A notable trend is the increasing use of images to convey complex queries more effectively. However, the state of the art in duplicate question detection has not kept pace with this shift and concentrates predominantly on text-based analysis. Inspired by advances in image processing and by numerous software engineering studies illustrating the promise of image-based communication on social coding platforms, we investigated image-based techniques for identifying duplicate questions on Stack Overflow. Automated models that analyze only the text of Stack Overflow questions and omit the images overlook a significant aspect of the question, and previous research has demonstrated that images complement text. To address this, we implemented two methods of image analysis: first, integrating the text extracted from images into the question text, and second, evaluating the images based on their visual content using image captions. After a rigorous evaluation of our model, it became evident that the efficiency improvements achieved were modest, about 1% on average. This marginal gain falls short of a substantial impact. On the encouraging side, our work lays the foundation for easy replication and hypothesis validation, allowing future research to build on our approach and explore novel solutions for more effective image-driven duplicate question detection.
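The two image-analysis ideas named in the abstract (folding text found in images into the question text, and describing images with captions) can be illustrated with a toy sketch. This is not the authors' model: the `Question` fields, the `augmented_text` helper, and the Jaccard token overlap used as a similarity score are all assumptions chosen for illustration, and the image-derived strings are supplied by hand rather than produced by any real OCR or captioning tool.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Question:
    """Hypothetical container: question text plus strings derived from its images."""
    text: str
    image_texts: List[str] = field(default_factory=list)     # text read from images (assumed given)
    image_captions: List[str] = field(default_factory=list)  # captions describing images (assumed given)


def tokens(s: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def augmented_text(q: Question, use_image_text: bool = True, use_captions: bool = True) -> str:
    """Concatenate the question text with the selected image-derived strings."""
    parts = [q.text]
    if use_image_text:
        parts.extend(q.image_texts)
    if use_captions:
        parts.extend(q.image_captions)
    return " ".join(parts)


def jaccard(a: str, b: str) -> float:
    """Token-set overlap; a stand-in for whatever similarity model is actually used."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Toy pair: the first question hides the key error message inside a screenshot.
q1 = Question(
    text="app crashes on start",
    image_texts=["NullPointerException at MainActivity"],
)
q2 = Question(text="NullPointerException in MainActivity on launch")

text_only = jaccard(q1.text, q2.text)
with_images = jaccard(augmented_text(q1), augmented_text(q2))
print(f"text only: {text_only:.3f}, with image text: {with_images:.3f}")
```

In this hand-built pair the overlap rises once the image text is included, but that is a property of the toy data only; the abstract reports that the gain measured on real questions was about 1% on average.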
Thu 24 Oct (displayed time zone: Brussels, Copenhagen, Madrid, Paris)
14:00 - 15:30 | Repository mining | ESEM Journal-First Papers / ESEM IGC / ESEM Technical Papers at Multimedia (B3 Building - Hall) | Chair(s): Apostolos Ampatzoglou (University of Macedonia)
14:00 | 20m | Full-paper | Decoding Android Permissions: A Study of Developer Challenges and Solutions on Stack Overflow | ESEM Technical Papers | Sahrima Jannat Oishwee (University of Saskatchewan), Zadia Codabux (University of Saskatchewan), Natalia Stakhanova (University of Saskatchewan)
14:20 | 20m | Full-paper | Negative Results of Image Processing for Identifying Duplicate Questions on Stack Overflow | ESEM Technical Papers
14:40 | 20m | Full-paper | Understanding Fairness in Software Engineering: Insights from Stack Exchange Sites | ESEM Technical Papers | Emeralda Sesari (University of Groningen), Federica Sarro (University College London), Ayushi Rastogi (University of Groningen, The Netherlands)
15:00 | 15m | Industry talk | Reducing Events to Augment Log-based Anomaly Detection Models: An Empirical Study | ESEM IGC | Lingzhe Zhang (Peking University, China), Tong Jia (Institute for Artificial Intelligence, Peking University, Beijing, China), Kangjin Wang (Alibaba Group), Mengxi Jia (Peking University), Yong Yang, Ying Li (School of Software and Microelectronics, Peking University, Beijing, China)
15:15 | 15m | Journal Early-Feedback | The upper bound of information diffusion in code review | ESEM Journal-First Papers | Michael Dorner (Blekinge Institute of Technology), Daniel Mendez (Blekinge Institute of Technology and fortiss), Krzysztof Wnuk, Ehsan Zabardast (Blekinge Institute of Technology), Jacek Czerwonka (Developer Services, Microsoft)