An Analysis of Students' Program Comprehension Processes in a Large Code Base
Program comprehension (PC) literature typically focuses on industry professionals comprehending large code bases or novice programmers comprehending short programs. As a result, limited work has aimed to understand how intermediate programmers comprehend large code bases, especially with the goal of supporting learners’ incremental development of PC expertise. Through the lens of the Block Model, a theory to support research on and teaching of PC, we aim to uncover 1) the comprehension process that intermediate programmers follow in terms of the Block Model (e.g., \textit{top-down} or \textit{bottom-up}), and 2) common mappings between comprehension techniques used by intermediate programmers and comprehension blocks in the Block Model. We present a qualitative analysis of students’ ``process journals,'' in which they described their PC process while modifying the open-source \texttt{idlelib} code base. Our results show that students typically followed a \textit{top-down} and \textit{Text-first} approach to understand a feature in the \texttt{idlelib} code base. Our findings also reveal \textit{how} students used various PC techniques (such as code navigation, using the IDE-based debugger, and making experimental code changes) in terms of the Block Model. These findings make progress toward bridging our theoretical understanding of novices’ comprehension processes in small programs and experts’ comprehension processes in large code bases by presenting a large-sample investigation of \textit{intermediate programmers’} PC processes in a large, existing code base. Finally, instructors can use our findings to understand which blocks in the Block Model are targeted by various PC techniques, which can inform teaching activities intentionally designed to impart PC skills.