Type annotations improve Python code quality by enabling better readability, static analysis, and developer productivity. However, manually annotating existing code is labor-intensive and error-prone. While recent learning-based models have advanced automatic type inference, they struggle with rare or complex types that are underrepresented in training data.
We present TypeCare, a model-agnostic post-processing technique that refines the outputs of existing type inference models using code context, without requiring retraining or fine-tuning. TypeCare combines two key components: (1) Re-Ranking, which prioritizes semantically and syntactically relevant types, and (2) Augmentation, which generates additional contextually plausible candidates. Applied to three state-of-the-art type inference models (TypeT5, Tiger, and TypeGen), TypeCare consistently improves top-1 accuracy, with gains of up to 40.1% on complex types that existing models often fail to predict correctly.
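To make the idea of post-processing a model's ranked candidates concrete, the following is a toy sketch only and is not TypeCare's actual algorithm. The function names (`names_in`, `augment`, `rerank`, `refine`), the name-overlap score, and the choice to treat "context" as a set of type names collected from the surrounding file are all illustrative assumptions.

```python
import re
from typing import List, Set


def names_in(type_str: str) -> Set[str]:
    """Extract identifier names from a type string, e.g. 'Dict[str, Any]'."""
    return set(re.findall(r"[A-Za-z_]\w*", type_str))


def augment(candidates: List[str], context_names: Set[str]) -> List[str]:
    """Hypothetical augmentation: append context names the model did not propose."""
    extras = [n for n in sorted(context_names) if n not in candidates]
    return list(candidates) + extras


def rerank(candidates: List[str], context_names: Set[str]) -> List[str]:
    """Hypothetical re-ranking: prefer candidates whose names occur in the context.

    sorted() is stable, so candidates with equal scores keep the model's order.
    """
    def overlap(candidate: str) -> float:
        names = names_in(candidate)
        return len(names & context_names) / len(names) if names else 0.0

    return sorted(candidates, key=lambda c: -overlap(c))


def refine(candidates: List[str], context_names: Set[str]) -> List[str]:
    """Augment first, then re-rank; the underlying model is never retrained."""
    return rerank(augment(candidates, context_names), context_names)


# Assumed inputs: a model's ranked guesses and type names seen in the same file.
model_output = ["dict", "Dict[str, Any]", "Config"]
context = {"Config", "Path"}
refined = refine(model_output, context)
print(refined[0])  # Config
```

In this toy setting the user-defined type `Config`, ranked last by the hypothetical model, moves to the top because it appears in the surrounding code. A real system would need much richer semantic and syntactic signals than name overlap.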
Hanmo You (Tianjin University), Zan Wang (Tianjin University), Zishuo Dong (College of Intelligence and Computing, Tianjin University), Luanqi Mo (College of Intelligence and Computing, Tianjin University), Jianjun Zhao (Kyushu University), Junjie Chen (Tianjin University)