ICSE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil

This program is tentative and subject to change.

Thu 16 Apr 2026 14:00 - 14:15 at Oceania IV - Human and Social Aspects 8 Chair(s): Ivan Beschastnikh

The ability to assess code proficiency is essential for understanding how developers comprehend and apply programming concepts effectively. With the rapid adoption of AI-assisted code generation tools, such as large language models (LLMs), this issue has become increasingly urgent. While these tools can produce functional code snippets on demand, developers must still comprehend, verify, and modify the generated code to ensure correctness and efficiency. The gap between code generation and code understanding exposes a critical challenge. A systematic and objective way to measure code proficiency is therefore vital not only for human learning and assessment but also for ensuring responsible and effective use of AI-generated code. Prior efforts to measure programming proficiency, such as pycefr, which adapts the Common European Framework of Reference for Languages (CEFR) to Python, remain subjective: they rely on expert opinion or informal surveys and lack empirical validation and a reproducible grounding in how developers actually comprehend programming constructs. This study addresses that gap by introducing an automated, data-driven framework that determines code proficiency levels using introductory programming textbooks as empirical ground truth. Our framework leverages the pedagogical sequencing of textbooks to derive proficiency levels for 138 Python code constructs. We propose two automated methods: (1) Übersequence, a consensus-based sequence that statistically models the progression of code constructs across textbooks, and (2) Clustering, a network-based grouping of constructs using the Louvain community detection algorithm. These methods reveal both the linear order of increasing conceptual difficulty and the relational grouping of constructs typically introduced together.
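The consensus-sequencing idea can be illustrated with a minimal sketch. The construct names and the Borda-style average-rank aggregation below are illustrative assumptions, not the paper's exact statistical model:

```python
from collections import defaultdict

# Each textbook is represented as the ordered list of constructs it
# introduces, in order of first appearance (hypothetical, simplified data).
textbooks = [
    ["assignment", "if", "for", "function", "class", "decorator"],
    ["assignment", "for", "if", "function", "decorator", "class"],
    ["assignment", "if", "function", "for", "class", "decorator"],
]

def consensus_sequence(orderings):
    """Borda-style rank aggregation: average each construct's
    introduction position across textbooks, then sort by that average."""
    ranks = defaultdict(list)
    for order in orderings:
        for pos, construct in enumerate(order):
            ranks[construct].append(pos)
    avg = {c: sum(r) / len(r) for c, r in ranks.items()}
    return sorted(avg, key=avg.get)

print(consensus_sequence(textbooks))
# → ['assignment', 'if', 'for', 'function', 'class', 'decorator']
```

Constructs that every textbook introduces early end up first in the consensus sequence, even when individual books disagree on the exact order.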
Our results show strong consistency among textbooks: construct sequencing is significantly positively correlated across them, supporting textbooks as an empirical basis for measuring code proficiency. The clustering analysis yields six stable construct groups; early groups represent basic constructs, while later ones capture advanced concepts. This work offers a robust alternative to subjective expert ratings and provides a methodological foundation for calibrating proficiency in the era of AI-generated code, enabling future systems to assess, align, and personalize developer learning and code comprehension. By integrating empirical pedagogy with automated analysis, this framework advances the pursuit of reliable, scalable, and AI-aware models of code proficiency.
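The network-based grouping can likewise be sketched with networkx's `louvain_communities`. The co-introduction counts and construct names below are hypothetical, and the paper's actual graph construction may differ:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical co-introduction data: edge weight = number of textbooks
# that introduce the two constructs in the same chapter.
edges = [
    ("assignment", "print", 3), ("assignment", "if", 2), ("print", "if", 2),
    ("for", "while", 3), ("for", "list", 2), ("while", "list", 2),
    ("class", "method", 3), ("class", "inheritance", 2), ("method", "inheritance", 2),
]

G = nx.Graph()
G.add_weighted_edges_from(edges)

# Louvain community detection groups constructs that tend to be
# introduced together; seed fixed for reproducibility.
groups = louvain_communities(G, weight="weight", seed=0)
print(sorted(sorted(g) for g in groups))
```

On this toy graph the three densely connected triangles come out as three communities, mirroring how the paper's analysis recovers groups of constructs that textbooks introduce together.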

Thu 16 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Human and Social Aspects 8
Research Track / Journal-first Papers / New Ideas and Emerging Results (NIER) at Oceania IV
Chair(s): Ivan Beschastnikh (The University of British Columbia)
14:00
15m
Talk
Determining Code Proficiency Levels from Python Textbooks
Journal-first Papers
Ruksit Rojpaisarnkit (Nara Institute of Science and Technology), Gregorio Robles (Universidad Rey Juan Carlos), Jesus M. Gonzalez-Barahona (Universidad Rey Juan Carlos), Kenichi Matsumoto (Nara Institute of Science and Technology), Raula Gaikovina Kula (The University of Osaka)
14:15
15m
Talk
Guiding principles for mixed methods research in software engineering
Journal-first Papers
Margaret-Anne Storey (University of Victoria), Rashina Hoda (Monash University), Alessandra Maciel Paz Milani (University of Victoria), Maria Teresa Baldassarre (Department of Computer Science, University of Bari)
14:30
15m
Talk
SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark Driven Embodiment
New Ideas and Emerging Results (NIER)
Priyavanshi Pathania (Accenture Labs), Rohit Mehra (Accenture Labs), Vibhu Saujanya Sharma (Accenture Labs), Vikrant Kaulgud (Accenture Labs, India), Tiffani Nevels (Accenture), Sanjay Podder (Accenture), Adam P. Burden (Accenture)
14:45
15m
Talk
Views on Internal and External Validity in Empirical Software Engineering: 10 Years Later and Beyond
Research Track
Alina Mailach (Leipzig University), Janet Siegmund (Chemnitz University of Technology), Sven Apel (Saarland University), Norbert Siegmund (Leipzig University)
15:00
15m
Talk
Weak Programmers Need Not Apply, LLMs Welcome! Survey Screening in the AI Era
Research Track
Ita Ryan (University College Cork), Utz Roedig (School of Computer Science and Information Technology, University College Cork), Klaas-Jan Stol (Lero; University College Cork; SINTEF Digital)
15:15
15m
Talk
Sapling: Quantifying and Measuring the Maturity of the RISC-V Software Ecosystem
Research Track
Yuhang Liu (Institute of Computing Technology, Chinese Academy of Sciences), Chenchen Ji (Institute of Software, Chinese Academy of Sciences), Haoquan Li (Institute of Computing Technology, Chinese Academy of Sciences), Jiageng Yu (Institute of Software, Chinese Academy of Sciences), Mingyu Chen (Institute of Computing Technology, Chinese Academy of Sciences), Yanjun Wu (Institute of Software, Chinese Academy of Sciences), Yungang Bao (State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences)