Enterprise-Scale COBOL-to-Java Translation: LLMs Augmented with Program Analysis
Modern enterprise systems often rely on legacy COBOL applications whose rigid, monolithic structure and globally scoped variables impede maintainability and cloud-native adoption. In this paper, we present a hybrid COBOL-to-Java translation pipeline that combines static program analysis with large language models (LLMs) to deliver scalable, consistent, and idiomatic Java code at enterprise scale. This translation pipeline is a core component of IBM’s watsonx Code Assistant for Z (WCA for Z) product. Our Class Designer and Method Designer modules perform global analysis to infer Java classes, hierarchies, and method signatures from COBOL data divisions and control-flow graphs, generating metadata that guides the LLM-based translation of procedural logic.
We evaluate our pipeline on 20 GenAPP programs and 10 proprietary customer applications, comparing it against three standalone LLMs using an automated LLM-as-Judge framework. Our translation pipeline achieves a median structural quality score above 80% and a median functional score above 75%, while also reducing variability in results and enforcing uniform naming and design across all modules. Deployed across numerous client environments as part of IBM’s WCA for Z, the pipeline provides a cost-effective solution for application modernization, bridging the COBOL skills gap and accelerating enterprise-wide transformation.
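To make the idea of inferring Java classes from COBOL data divisions more concrete, the following is a minimal illustrative sketch, not the Class Designer itself. It assumes a hypothetical COBOL record `CUSTOMER-RECORD` and shows one plausible way a group item could map to a Java class with uniformly derived names. The helper names (`toJavaFieldName`, `toJavaClassName`), the record layout, and the chosen Java types are all assumptions made for illustration; the actual pipeline may derive names and types differently.

```java
import java.math.BigDecimal;
import java.util.Locale;

public class CobolNamingSketch {

    // Hypothetical helper: turns a COBOL data name such as CUSTOMER-ID
    // into a camelCase Java field name such as customerId.
    // Assumes the name starts with a letter; COBOL names that begin with
    // a digit would need extra handling to be valid Java identifiers.
    static String toJavaFieldName(String cobolName) {
        String[] parts = cobolName.trim().toLowerCase(Locale.ROOT).split("-");
        StringBuilder sb = new StringBuilder(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].isEmpty()) {
                continue; // skip empty segments caused by doubled hyphens
            }
            sb.append(Character.toUpperCase(parts[i].charAt(0)))
              .append(parts[i].substring(1));
        }
        return sb.toString();
    }

    // Hypothetical helper: turns a COBOL group name such as CUSTOMER-RECORD
    // into a PascalCase Java class name such as CustomerRecord.
    static String toJavaClassName(String cobolName) {
        String field = toJavaFieldName(cobolName);
        if (field.isEmpty()) {
            throw new IllegalArgumentException("empty COBOL name");
        }
        return Character.toUpperCase(field.charAt(0)) + field.substring(1);
    }

    // Hand-written example of a class that such a mapping might yield for:
    //   01 CUSTOMER-RECORD.
    //      05 CUSTOMER-ID    PIC 9(6).
    //      05 CUSTOMER-NAME  PIC X(30).
    //      05 BALANCE        PIC S9(7)V99 COMP-3.
    static class CustomerRecord {
        int customerId;        // PIC 9(6): up to six digits fits in an int
        String customerName;   // PIC X(30): alphanumeric text
        BigDecimal balance;    // PIC S9(7)V99: signed decimal, two fraction digits
    }

    public static void main(String[] args) {
        System.out.println(toJavaClassName("CUSTOMER-RECORD"));
        System.out.println(toJavaFieldName("CUSTOMER-ID"));
    }
}
```

Deriving every name through one deterministic rule, rather than leaving naming to the LLM on each call, is one simple way to obtain the kind of uniform naming across modules that the abstract describes.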