FSE 2026
Sun 5 - Thu 9 July 2026 Montreal, Canada

Large language models (LLMs) have achieved remarkable progress in code generation, yet the structural properties of programming languages introduce distinctive challenges. In particular, program correctness is disproportionately influenced by a subset of structurally critical tokens, such as API names, variable identifiers, and control-flow keywords, which we term influential tokens. Errors in predicting these tokens often propagate and accumulate through subsequent decoding steps, substantially degrading overall correctness. Addressing the heterogeneous difficulty of predicting such tokens is therefore crucial for improving the reliability of code generation. To this end, we introduce Influence-Aware Bayesian Code Generation (I-BAYGEN), a framework that explicitly handles influential tokens. The framework consists of two components. First, it identifies influential tokens using a loss-based detection mechanism and quantifies each token's degree of influence in three ways. Second, to handle the influential tokens, it explicitly steers additional reasoning paths that serve as evidence for obtaining the posterior token distribution during code generation, a process that can be treated as Bayesian inference. To optimize the posterior likelihood, we use the influence scores as weights in the self-rewarding signal, so that the LLM pays more attention to the identified influential tokens. Through this influence-based reweighting mechanism, the framework treats tokens differently according to their difficulty: influential tokens receive enhanced attention through refined reward structures and deeper reasoning processes. Comprehensive experiments on competition-level programming benchmarks demonstrate that I-BAYGEN achieves up to 47.2% relative improvement in accuracy over state-of-the-art approaches that employ non-weighted rewards. Moreover, qualitative analysis reveals that the framework produces reasoning paths that are more interpretable and logically coherent, effectively addressing heterogeneous token difficulty in code generation.
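
To make the idea of influence-based reweighting concrete, the following is a minimal illustrative sketch and not the I-BAYGEN implementation. It assumes, purely for illustration, that a token's influence score is its per-token negative log-likelihood normalized over the sequence, and that the reward is a weighted average of per-token rewards. The function names (token_losses, influence_weights, weighted_reward) and the toy probabilities and rewards are hypothetical; the abstract does not specify the three influence measures or the actual reward design.

```python
import math


def token_losses(token_probs):
    """Per-token negative log-likelihood, given the probability the model
    assigned to each reference token (assumed inputs, not real model output)."""
    return [-math.log(p) for p in token_probs]


def influence_weights(losses):
    """Normalize losses into weights that sum to 1.
    This normalization is an assumed scoring choice for illustration only."""
    if not losses:
        return []
    total = sum(losses)
    if total == 0:
        # All tokens were predicted with certainty: fall back to uniform weights.
        return [1.0 / len(losses)] * len(losses)
    return [loss / total for loss in losses]


def weighted_reward(token_rewards, weights):
    """Influence-weighted reward: high-loss tokens contribute more."""
    return sum(w * r for w, r in zip(weights, token_rewards))


# Toy sequence: the model is unsure about the API name "range" and gets it wrong.
tokens = ["for", "i", "in", "range", "(", "n", ")"]
probs = [0.9, 0.8, 0.95, 0.2, 0.99, 0.5, 0.99]
rewards = [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]

weights = influence_weights(token_losses(probs))
uniform = sum(rewards) / len(rewards)      # non-weighted baseline reward
weighted = weighted_reward(rewards, weights)

most_influential = tokens[weights.index(max(weights))]
print(most_influential, round(uniform, 3), round(weighted, 3))
```

In this toy setting the mistaken API name carries the largest weight, so the weighted reward penalizes the error far more strongly than the non-weighted average does, which is the qualitative behavior the reweighting mechanism is described as aiming for.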