In the fast-evolving field of artificial intelligence, Reinforcement Learning (RL) plays a crucial role in developing agents capable of sequential decision-making. As these systems grow in complexity, the need for standardized and automated training methods becomes apparent. This paper presents a rule-based framework that integrates Large Language Models (LLMs) and heuristic-based code detectors to verify compliance with best practices in RL training pipelines. We define a set of architectural rules targeting best practices in key areas of RL-based architectures, such as checkpointing, hyperparameter tuning, and agent configuration. We validated our approach on a large-scale industrial case study and ten open-source projects. The results show that LLM-based detectors generally outperform heuristic-based detectors, particularly on more complex code patterns. The approach identifies adherence to best practices with high precision and recall, demonstrating its practical applicability.