LLMs Still Can't Avoid Instanceof: An investigation Into GPT-3.5, GPT-4 and Bard's Capacity to Handle Object-Oriented Programming Assignments (ICSE 2024 - Software Engineering Education and Training)

Fri 12 - Sun 21 April 2024 Lisbon, Portugal

Who

Bruno Pereira Cipriano, Pedro Alves

Track

ICSE 2024 Software Engineering Education and Training

When

Thu 18 Apr 2024 12:00 - 12:15 at Pequeno Auditório - LLM, NN and other AI technologies 3 Chair(s): Tushar Sharma

Abstract

Large Language Models (LLMs) have emerged as promising tools to assist students while solving programming assignments. However, object-oriented programming (OOP), with its inherent complexity involving the identification of entities, relationships, and responsibilities, is not yet mastered by these tools. Unlike introductory programming exercises, OOP contexts remain under-researched with regard to the behavior of LLMs. In this study, we experimented with three prominent LLMs (GPT-3.5, GPT-4, and Bard) to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT). The findings revealed that while the models frequently achieved mostly working solutions to the exercises, they often overlooked the best practices of OOP. GPT-4 stood out as the most proficient, followed by GPT-3.5, with Bard trailing last. We advocate for a renewed emphasis on code quality when employing these models and explore the potential of pairing LLMs with AATs in pedagogical settings. In conclusion, while GPT-4 shows promise, the deployment of these models in OOP education still requires supervision.

Bruno Pereira Cipriano

Lusófona University, COPELABS

Pedro Alves

Lusófona University, COPELABS

Session Program

Thu 18 Apr
Displayed time zone: Lisbon (GMT+01:00)

	11:00 - 12:30	LLM, NN and other AI technologies 3 (New Ideas and Emerging Results / Research Track / Software Engineering Education and Training / Software Engineering in Practice) at Pequeno Auditório. Chair(s): Tushar Sharma (Dalhousie University)

	11:00 15m Talk	Xpert: Empowering Incident Management with Query Recommendations via Large Language Models
		Research Track
		Yuxuan Jiang (University of Michigan Ann-Arbor), Chaoyun Zhang (Microsoft), Shilin He (Microsoft Research), Zhihao Yang (Peking University), Minghua Ma (Microsoft Research), Si Qin (Microsoft Research), Yu Kang (Microsoft Research), Yingnong Dang (Microsoft Azure), Saravan Rajmohan (Microsoft 365), Qingwei Lin (Microsoft), Dongmei Zhang (Microsoft Research)
	11:15 15m Talk	Tensor-Aware Energy Accounting
		Research Track
		Timur Babakol (SUNY Binghamton, USA), Yu David Liu (SUNY Binghamton)
	11:30 15m Talk	LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems
		Software Engineering in Practice
		Mohamad Fakih (University of California, Irvine), Rahul Dharmaji (University of California, Irvine), Yasamin Moghaddas (University of California, Irvine), Gustavo Quiros (Siemens Technology), Tosin Ogundare (Siemens Technology), Mohammad Al Faruque (UCI)
	11:45 15m Talk	Resolving Code Review Comments with Machine Learning
		Software Engineering in Practice
		Alexander Frömmgen (Google), Jacob Austin (Google), Peter Choy (Google), Nimesh Ghelani (Google), Lera Kharatyan (Google), Gabriela Surita (Google), Elena Khrapko (Google), Pascal Lamblin (Google), Pierre-Antoine Manzagol (Google), Marcus Revaj (Google), Maxim Tabachnyk (Google), Danny Tarlow (Google), Kevin Villela (Google), Dan Zheng (Google DeepMind), Satish Chandra (Google, Inc), Petros Maniatis (Google DeepMind)
	12:00 15m Talk	LLMs Still Can't Avoid Instanceof: An investigation Into GPT-3.5, GPT-4 and Bard's Capacity to Handle Object-Oriented Programming Assignments
		Software Engineering Education and Training
		Bruno Pereira Cipriano (Lusófona University, COPELABS), Pedro Alves (Lusófona University, COPELABS)
	12:15 7m Talk	Leveraging Large Language Models to Improve REST API Testing
		New Ideas and Emerging Results
		Myeongsoo Kim (Georgia Institute of Technology), Tyler Stennett (Georgia Institute of Technology), Dhruv Shah (Georgia Institute of Technology), Saurabh Sinha (IBM Research), Alessandro Orso (Georgia Institute of Technology)
	12:22 7m Talk	LogExpert: Log-based Recommended Resolutions Generation using Large Language Model
		New Ideas and Emerging Results
		Jiabo Wang (Beijing University of Posts and Telecommunications), guojun chu (Beijing University of Posts and Telecommunications), Jingyu Wang, Haifeng Sun (Beijing University of Posts and Telecommunications), Qi Qi, Yuanyi Wang (Beijing University of Posts and Telecommunications), Ji Qi (China Mobile (Suzhou) Software Technology Co., Ltd.), Jianxin Liao (Beijing University of Posts and Telecommunications)
