Why AI Agents Still Need You: Findings from Developer-Agent Collaborations in the Wild
This program is tentative and subject to change.
Software Engineering Agents (SWE agents) can autonomously perform development tasks on benchmarks like SWE-bench, but still face challenges when tackling complex and ambiguous real-world tasks. Consequently, SWE agents are often designed to allow interactivity with developers, enabling collaborative problem-solving. To understand how developers collaborate with SWE agents and the communication challenges that arise in such interactions, we observed 19 developers using an in-IDE agent to resolve 33 open issues in repositories to which they had previously contributed. Participants successfully resolved about half of these issues, and those who solved issues incrementally had greater success than those who used a one-shot approach. Participants who actively collaborated with the agent and iterated on its outputs were also more successful, though they faced challenges in trusting the agent’s responses and in collaborating on debugging and testing. These results have implications for successful developer-agent collaborations and for the design of more effective SWE agents.
Mon 17 Nov (displayed time zone: Seoul)
14:00 - 15:30
14:00 | 10m Talk | Why AI Agents Still Need You: Findings from Developer-Agent Collaborations in the Wild | Research Papers | Aayush Kumar Microsoft, Yasharth Bajpai Microsoft, Sumit Gulwani Microsoft, Gustavo Soares Microsoft, Emerson Murphy-Hill Microsoft
14:10 | 10m Talk | The Cost of Downgrading Build Systems: A Case Study of Kubernetes | Research Papers | Gareema Ranjan University of Waterloo, Mahmoud Alfadel University of Calgary, Gengyi Sun University of Waterloo, Shane McIntosh University of Waterloo | Pre-print
14:20 | 10m Talk | Democratizing the Cryptocurrency Ecosystem by Just-In-Time Transformation of Mining Programs | Research Papers | Wei Liu Nanjing University, Zhenhua Li Tsinghua University, Feng Qian University of Southern California, Feiyu Jin Tsinghua University, Hao Lin Tsinghua University, Yannan Zheng Ant Group, Bo Xiao Ant Group, Xiaokang Qin Ant Group, Tianyin Xu University of Illinois at Urbana-Champaign
14:30 | 10m Talk | Advancing Automated Ethical Profiling in SE: a Zero-Shot Evaluation of LLM Reasoning | Research Papers | Patrizio Migliarini University of L'Aquila, Italy, Mashal Afzal Memon University of L'Aquila, Italy, Marco Autili University of L'Aquila, Italy, Paola Inverardi Gran Sasso Science Institute | Pre-print
14:40 | 10m Talk | The Impact of the COVID-19 Pandemic on Women’s Contribution to Public Code | Journal-First Track | Annalí Casanueva Ifo Institute, Big Data Junior Research Group, Munich, Germany, Davide Rossi University of Bologna, Théo Zimmermann Télécom Paris, Polytechnic Institute of Paris, Stefano Zacchiroli LTCI, Télécom Paris, Institut Polytechnique de Paris, Palaiseau, France
14:50 | 10m Talk | Understanding Feature Request Practice on GitHub via a Large-Scale Empirical Study | Research Papers | Jiajun Li Nanjing University of Aeronautics and Astronautics, Wenhua Yang Nanjing University of Aeronautics and Astronautics, Minxue Pan Nanjing University, Yu Zhou Nanjing University of Aeronautics and Astronautics
15:00 | 10m Talk | Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping | Research Papers | Jingyu Xiao The Chinese University of Hong Kong, Yuxuan Wan The Chinese University of Hong Kong, Yintong Huo Singapore Management University, Singapore, Zixin Wang The Chinese University of Hong Kong, Xinyi Xu The Chinese University of Hong Kong, Wenxuan Wang Hong Kong University of Science and Technology, Zhiyao Xu Tsinghua University, Yuhang Wang Southwest University, Michael Lyu The Chinese University of Hong Kong
15:10 | 10m Talk | Engineering Digital Systems for Humanity: a Research Roadmap | Journal-First Track | Marco Autili University of L'Aquila, Italy, Martina De Sanctis Gran Sasso Science Institute, Paola Inverardi Gran Sasso Science Institute, Patrizio Pelliccione Gran Sasso Science Institute, L'Aquila, Italy
15:20 | 10m Talk | Multi-dimensional Assessment of CrowdSourced Testing Reports via LLMs | Research Papers | Yue Wang Nanjing University, Yuan Zhao Laboratory of Data Intelligence and Interdisciplinary Innovation, Nanjing University, Shengcheng Yu Technical University of Munich, Zhenyu Chen Nanjing University