CodeACT-R: A Cognitive Simulation Framework for Human Attention in Code Reading
Reading code is a fundamental activity in both software engineering and computer science education. Understanding the cognitive processes involved in reading code is crucial for identifying effective cognitive strategies, which can inform teaching methods and tooling support for developers. However, collecting large human-subject eye-tracking datasets, especially for programming tasks, is often costly and time-consuming, limiting the scalability and applicability of such studies. To address this issue, we present CodeACT-R, the first cognitive simulation framework tailored for code reading, based on the well-established Adaptive Control of Thought-Rational (ACT-R) architecture from cognitive science. CodeACT-R simulates how humans read code and requires only a small, manageable amount of human data to initialize the simulator design, offering a cost-effective and scalable alternative to traditional data collection methods such as eye tracking.
Specifically, we first collected real human visual attention data from 48 programmers reading code using eye tracking. These data were then used to develop CodeACT-R, enabling the simulation of human-like code reading behaviors. Our evaluation demonstrates that CodeACT-R can simulate visual attention patterns (i.e., scanpaths) that closely resemble real human attention patterns, accounting for up to 87% of the observed variation in those patterns.
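To make the evaluation concrete, a common way to compare a simulated scanpath against a human one is to encode each as a sequence of fixated areas of interest (e.g., code lines) and compute a normalized string-edit similarity. The sketch below is illustrative only; the paper's actual similarity metric and AOI encoding are not specified here, and the sequences shown are hypothetical.

```python
# Illustrative sketch (not necessarily the paper's metric): scanpaths as
# sequences of AOI labels, compared via normalized Levenshtein distance.

def levenshtein(a, b):
    """Classic edit distance between two sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]

def scanpath_similarity(human, simulated):
    """1 - normalized edit distance; 1.0 means identical scanpaths."""
    if not human and not simulated:
        return 1.0
    return 1.0 - levenshtein(human, simulated) / max(len(human), len(simulated))

# Hypothetical scanpaths over code lines L1..L4:
human = ["L1", "L2", "L3", "L2", "L4"]
simulated = ["L1", "L2", "L2", "L4"]
print(round(scanpath_similarity(human, simulated), 2))  # -> 0.8
```

A similarity near 1.0 would indicate that the simulator reproduces the order and dwell pattern of human fixations closely; richer metrics (e.g., ones weighting fixation durations or spatial proximity) follow the same comparison structure.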