An Analysis of Students' Program Comprehension Processes in a Large Code Base (ICPC 2025 - Research Track)

Who

Anshul Shah, Thanh Tong, Elena Tomson, Steven Shi, William G. Griswold, Gerald Soosairaj

Track

ICPC 2025 Research Track

When

Sun 27 Apr 2025 14:30 - 14:40 at 205 - Education, Debugging, Dynamic Analysis Chair(s): Simone Scalabrino, Coen De Roover, Gema Rodríguez-Pérez

Abstract

Program comprehension (PC) literature typically focuses on industry professionals comprehending large code bases or novice programmers comprehending short programs. As a result, limited work has aimed to understand how intermediate programmers comprehend large code bases, especially with the goal of supporting learners’ incremental development of program comprehension expertise. Through the lens of the Block Model, a theory to support research on and teaching of PC, we aim to uncover 1) the comprehension process that intermediate programmers follow in terms of the Block Model (e.g., top-down or bottom-up), and 2) common mappings between comprehension techniques used by intermediate programmers and comprehension blocks in the Block Model. We present a qualitative analysis of students’ “process journals” in which they described their PC process while modifying the open-source idlelib code base. Our results showed that students typically followed a top-down and Text-first approach to understand a feature in the idlelib code base. Our findings also reveal how students used various program comprehension techniques (such as code navigation, using the IDE-based debugger, and making experimental code changes) in terms of the Block Model. These findings make progress toward bridging our theoretical understanding of novices’ comprehension process in small programs and experts’ code comprehension process in large code bases by presenting a large-sample investigation of intermediate programmers’ PC processes in a large, existing code base. Finally, instructors can use our findings to understand which blocks in the Block Model are targeted by various PC techniques, which can enable intentional teaching activities to impart PC skills.

Anshul Shah

University of California, San Diego

United States

Thanh Tong

University of California, San Diego

Elena Tomson

University of California, San Diego

United States

Steven Shi

University of California, San Diego

William G. Griswold

University of California San Diego

United States

Gerald Soosairaj

University of California, San Diego

Session Program

Sun 27 Apr
Displayed time zone: (GMT-04:00) Eastern Time (US & Canada)

14:00 - 15:30	Education, Debugging, Dynamic Analysis - Research Track / Early Research Achievements (ERA) / Replications and Negative Results (RENE) / Tool Demonstration at 205 Chair(s): Simone Scalabrino University of Molise, Coen De Roover Vrije Universiteit Brussel, Gema Rodríguez-Pérez Department of Computer Science, Mathematics, Physics and Statistics, University of British Columbia, Okanagan Campus

14:00 10m Talk		JavaWiz: A Trace-Based Graphical Debugger for Software Development Education Research Track Markus Weninger JKU Linz, Simon Grünbacher Institute for System Software; Johannes Kepler University Linz, Austria, Herbert Prähofer Johannes Kepler University Linz Pre-print
14:10 10m Talk		Pinpointing the Learning Obstacles of an Interactive Theorem Prover Research Track Sára Juhošová Delft University of Technology, Andy Zaidman TU Delft, Jesper Cockx Delft University of Technology Pre-print
14:20 10m Talk		AI-based automated grading of source code of introductory programming assignments Research Track Jayant Havare Indian Institute of Technology - Bombay, Varsha Apte Indian Institute of Technology - Bombay, Kaushikraj Maharajan Indian Institute of Technology - Bombay, Nithin Chandra Gupta Samudrala Indian Institute of Technology - Bombay, Ganesh Ramakrishnan Indian Institute of Technology - Bombay, Srikanth Tamilselvam IBM Research, Sainath Vavilapalli Indian Institute of Technology - Bombay
14:30 10m Talk		An Analysis of Students' Program Comprehension Processes in a Large Code Base Research Track Anshul Shah University of California, San Diego, Thanh Tong University of California, San Diego, Elena Tomson University of California, San Diego, Steven Shi University of California, San Diego, William G. Griswold University of California San Diego, Gerald Soosairaj University of California, San Diego
14:40 6m Talk		OVERLORD: A C++ overloading inspector Tool Demonstration Botond Horváth ELTE Eötvös Loránd University, Budapest, Hungary, Richárd Szalay Eötvös Loránd University, Faculty of Informatics, Department of Programming Languages and Compilers, Zoltán Porkoláb ELTE Eötvös Loránd University, Budapest, Hungary Pre-print
14:46 6m Talk		Optimizing Code Runtime Performance through Context-Aware Retrieval-Augmented Generation Early Research Achievements (ERA) Manish Acharya Vanderbilt University, Yifan Zhang Vanderbilt University, Kevin Leach Vanderbilt University, Yu Huang Vanderbilt University
14:52 6m Talk		Investigating Execution-Aware Language Models for Code Optimization Replications and Negative Results (RENE) Federico Di Menna University of L'Aquila, Luca Traini University of L'Aquila, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Vittorio Cortellessa University of L'Aquila Pre-print
14:58 6m Talk		Understanding Data Access in Microservices Applications Using Interactive Treemaps Early Research Achievements (ERA) Maxime ANDRÉ Namur Digital Institute, University of Namur, Marco Raglianti Software Institute - USI, Lugano, Anthony Cleve University of Namur, Michele Lanza Software Institute - USI, Lugano Pre-print
15:04 6m Talk		Divergence-Driven Debugging: Understanding Behavioral Changes Between Two Program Versions Early Research Achievements (ERA) Rémi Dufloer Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France, Imen Sayar Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France, Anne Etien University of Lille, Lille, France, Steven Costiou INRIA Lille
15:10 10m Talk		Effectively Modeling UI Transition Graphs for Android Apps via Reinforcement Learning Research Track Wunan Guo School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Zhen Dong Fudan University, Liwei Shen Fudan University, Daihong Zhou School of Computer Science and Information Engineering, Shanghai Institute of Technology, Bin Hu Fudan University, Chen Zhang Fudan University, Hai Xue University of Shanghai for Science and Technology
15:20 10m Live Q&A		Session's Discussion: "Education, Debugging, Dynamic Analysis" Research Track