Can ChatGPT emulate humans in software engineering surveys?
Context: There is a growing belief in the literature that large language models (LLMs), such as those underlying ChatGPT, can mimic human behavior in surveys. Gap: While the literature has shown promising results in the social sciences and market research, there is scant evidence of LLMs’ effectiveness in technical fields such as software engineering (SE). Objective: Inspired by previous work, this paper explores ChatGPT’s ability to replicate findings from prior software engineering research. Given the frequent use of surveys in this field, if LLMs can accurately emulate human responses, the technique could address common methodological challenges such as recruitment difficulties, representational shortcomings, and respondent fatigue. Method: We prompted ChatGPT to reflect the behavior of a ‘mega-persona’ representing the demographic distribution of interest. We replicated surveys published between 2019 and 2023 at leading SE conferences, examining ChatGPT’s proficiency in mimicking responses from diverse demographics. Results: Our findings reveal that ChatGPT can successfully replicate the outcomes of some studies, but in others its results were not significantly better than a random baseline. Conclusions: This reflection paper discusses the challenges and potential research opportunities in using LLMs to represent humans in software engineering surveys.
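The abstract does not reproduce the study’s prompts. As a rough illustration only, here is a minimal sketch of how such ‘mega-persona’ prompting might look with the OpenAI Python client; the model name, persona description, demographic fields, and prompt wording are all assumptions for illustration, not the authors’ actual setup.

```python
# Hypothetical sketch of "mega-persona" survey emulation. The persona text,
# demographic values, and model choice are illustrative assumptions, not the
# prompts or configuration used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A single "mega-persona" summarizing the demographic distribution of the
# original survey's respondent pool (the numbers below are made up).
MEGA_PERSONA = (
    "You represent 100 survey respondents: 70% professional developers and "
    "30% students; median 8 years of software engineering experience; "
    "respondents span North America, Europe, and South America."
)

def ask_survey_question(question: str, options: list[str]) -> str:
    """Ask the model to answer one survey item as the mega-persona,
    reporting a distribution of answers across the given options."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model; the paper's choice may differ
        messages=[
            {"role": "system", "content": MEGA_PERSONA},
            {
                "role": "user",
                "content": (
                    f"Survey question: {question}\n"
                    f"Options: {', '.join(options)}\n"
                    "Report what percentage of the respondents you represent "
                    "would choose each option."
                ),
            },
        ],
    )
    return response.choices[0].message.content

print(ask_survey_question(
    "How often do you write unit tests for new code?",
    ["Always", "Often", "Sometimes", "Rarely", "Never"],
))
```

Asking for a full answer distribution in a single call mirrors the ‘mega-persona’ idea described in the abstract; the obvious alternative design would be to sample one simulated respondent per call and aggregate the answers afterwards.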
Thu 24 Oct (displayed time zone: Brussels, Copenhagen, Madrid, Paris)
14:00 - 15:30 | Empirical research methods and applications | ESEM Technical Papers / ESEM Emerging Results, Vision and Reflection Papers Track | Telensenyament (B3 Building - 1st Floor) | Chair(s): Valentina Lenarduzzi (University of Oulu)

14:00 (20m, Full paper) | Game Software Engineering: A Controlled Experiment Comparing Automated Content Generation Techniques | ESEM Technical Papers | Mar Zamorano López (University College London), África Domingo (Universidad San Jorge), Carlos Cetina (Universitat Politècnica de València, Spain), Federica Sarro (University College London)

14:20 (20m, Full paper) | Evaluating Software Modelling Recommendations: Towards Systematic Guidelines for Modelling | ESEM Technical Papers

14:40 (20m, Full paper) | What do we know about Hugging Face? A systematic literature review and quantitative validation of qualitative claims | ESEM Technical Papers | Jason Jones (Purdue University), Wenxin Jiang (Purdue University), Nicholas Synovic (Loyola University Chicago), George K. Thiruvathukal (Loyola University Chicago and Argonne National Laboratory), James C. Davis (Purdue University) | DOI and pre-print available

15:00 (15m, Vision and Emerging Results) | On the Creation of Representative Samples of Software Repositories | ESEM Emerging Results, Vision and Reflection Papers Track | June Gorostidi (IN3 - UOC), Adem Ait (University of Luxembourg), Jordi Cabot (Luxembourg Institute of Science and Technology), Javier Luis Cánovas Izquierdo (IN3 - UOC) | Pre-print available

15:15 (15m, Vision and Emerging Results) | Can ChatGPT emulate humans in software engineering surveys? | ESEM Emerging Results, Vision and Reflection Papers Track | Igor Steinmacher (Northern Arizona University), Jacob Mcauley Penney (NAU), Katia Romero Felizardo (UTFPR-CP), Alessandro Garcia (Pontifical Catholic University of Rio de Janeiro, PUC-Rio), Marco Gerosa (Northern Arizona University)