Human-Centered Agentic Framework for Machine Learning Modeling in Finance
The advent of large language models has ushered in a new era of agentic systems, in which artificial intelligence programs exhibit autonomous decision-making capabilities across diverse domains. This paper explores agentic system workflows in the financial services industry. In particular, we build agentic crews with a human-in-the-loop orchestrator that collaborate to perform complex machine learning modeling tasks. Each modeling crew consists of a judge agent and multiple agents that perform specific tasks such as exploratory data analysis, feature engineering, model selection and hyperparameter tuning, model training, model evaluation, and documentation writing. We demonstrate the effectiveness and robustness of modeling crews by presenting a comparative experiment on credit card fraud detection and card portfolio credit risk. In the fraud detection experiment, our approach achieved a recall of 81.6% and an F1-score of 88.9%, outperforming AutoML’s 70.4% and 82.1%, respectively.
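To relate the two fraud detection metrics reported above, the following minimal sketch inverts the standard definition F1 = 2PR / (P + R) to recover the precision P implied by a given recall R and F1-score. The function name `implied_precision` and the dictionary layout are illustrative assumptions and not part of the original work. Because the reported figures are rounded to one decimal place, the implied precision is only approximate.

```python
def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for precision P, given recall R and F1."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        # No finite positive precision is consistent with F1 >= 2 * recall.
        raise ValueError("F1 must be strictly less than 2 * recall")
    return f1 * recall / denom


# (recall, F1) pairs as reported in the abstract; the labels are illustrative.
reported = {
    "modeling crew": (0.816, 0.889),
    "AutoML": (0.704, 0.821),
}

implied = {name: implied_precision(r, f) for name, (r, f) in reported.items()}

for name, p in implied.items():
    print(f"{name}: implied precision = {p:.3f}")
```

Under these rounded inputs, both systems come out with an implied precision above 0.97, which suggests that the reported F1 gap is driven mainly by the difference in recall.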