In the fast-evolving field of artificial intelligence, Reinforcement Learning (RL) plays a crucial role in developing agents capable of sequential decision-making. As these systems grow in complexity, the need for standardized and automated training methods becomes apparent. This paper presents a rule-based framework that integrates Large Language Models (LLMs) and heuristic-based code detectors to verify compliance with best practices in RL training pipelines. We define a set of architectural rules targeting best practices in key areas of RL-based architectures, such as checkpointing, hyperparameter tuning, and agent configuration. We validated our approach on a large-scale industrial case study and ten open-source projects. The results show that LLM-based detectors generally outperform heuristic-based detectors, particularly on more complex code patterns. The approach identifies adherence to best practices with high precision and recall, demonstrating its practical applicability.