Replacing Training with Reasoning: Reinterpreting Classic ML Pipelines with LLMs
Large Language Models (LLMs) are increasingly used in software engineering tasks due to their strong performance across diverse applications. In this paper, we ask a fundamental and novel question: "To what extent can LLMs replace traditional machine learning pipelines that rely on labeled data, feature engineering, and retraining?"
Our intuition is that many long-standing approaches in software engineering can be reimagined through the lens of reasoning rather than training. Unlike conventional pipelines that learn statistical patterns from data, LLMs can directly reason about contextual consistency using their pretrained knowledge. To illustrate this idea, we revisit a well-known anomaly detection pipeline (CHABADA for Android apps) and show how its clustering and retraining stages can be replaced with a simple prompting strategy. The result is a streamlined, zero-shot workflow that leverages semantic reasoning without labeled datasets, feature extraction, or retraining.
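A minimal sketch of what such a prompting strategy could look like, assuming the check is phrased as "does the app's API usage match its description?". The original CHABADA pipeline groups apps by description topic and flags API-usage outliers within each group; here that step is swapped for a single zero-shot question. The prompt wording, the function names (`build_prompt`, `is_anomalous`), and the `stub_llm` stand-in are all hypothetical and are not taken from the paper; a real setup would pass in a function that calls an actual model.

```python
from typing import Callable, List

# Hypothetical prompt wording (an assumption, not the authors' prompt).
PROMPT_TEMPLATE = (
    "You are auditing an Android app.\n"
    "Description:\n{description}\n\n"
    "Sensitive APIs used by the app:\n{apis}\n\n"
    "Is every listed API consistent with what the description advertises? "
    "Answer CONSISTENT or ANOMALOUS on the first line, "
    "then give a one-sentence justification."
)


def build_prompt(description: str, apis: List[str]) -> str:
    """Render the zero-shot prompt for one app."""
    api_lines = "\n".join(f"- {api}" for api in sorted(apis))
    return PROMPT_TEMPLATE.format(description=description.strip(), apis=api_lines)


def is_anomalous(description: str, apis: List[str], llm: Callable[[str], str]) -> bool:
    """Ask the model and read the verdict from the first line of its reply.

    `llm` is any function mapping a prompt string to a reply string.
    An empty or unparseable reply is treated as "not anomalous".
    """
    reply = llm(build_prompt(description, apis)).strip()
    first_line = reply.splitlines()[0].upper() if reply else ""
    return first_line.startswith("ANOMALOUS")


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the sketch."""
    if "sendTextMessage" in prompt:
        return "ANOMALOUS\nSending SMS is unrelated to the described purpose."
    return "CONSISTENT\nThe listed APIs match the description."


if __name__ == "__main__":
    flagged = is_anomalous(
        "A simple flashlight app.",
        ["android.telephony.SmsManager.sendTextMessage"],
        stub_llm,
    )
    print("flagged as anomalous:", flagged)
```

Keeping the model behind a plain callable means no labeled data, feature extraction, or retraining appears anywhere in the sketch; the only app-specific inputs are the description and the API list.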
Our goal is not to propose a new tool, but to highlight a broader paradigm: LLMs open the door to reinterpreting established ML-based workflows as reasoning pipelines. This perspective suggests a path toward lighter-weight, training-free alternatives for many specialized software engineering tasks.
Fri 17 Apr (displayed time zone: Brasilia, Distrito Federal, Brazil)
14:00 - 15:30 | AI for Software Engineering 25 | Journal-first Papers / Research Track / New Ideas and Emerging Results (NIER) / Demonstrations | Europa II | Chair(s): Daniel Feitosa (University of Groningen)
14:00 | 15m | Talk | ArtifactSync: Automated Repository Synchronization through Hierarchical Change Impact Analysis | Demonstrations | Ebube Alor (Concordia University); João Pedro de Souza Olivo Tardivo (Universidade Estadual do Paraná); SayedHassan Khatoonabadi (Concordia University); Emad Shihab (Concordia University)
14:15 | 15m | Talk | Introducing Phylogenetics in Search-based Software Engineering: Phylogenetics-aware SBSE | Journal-first Papers | Daniel Blasco (SVIT Research Group, Universidad San Jorge); Antonio Iglesias (Universidad San Jorge); Jorge Echeverria (Universidad San Jorge); Francisca Perez (Universitat Politècnica de València); Carlos Cetina
14:30 | 15m | Talk | Automating Terraform Code Migration through Provider Evolution Knowledge | New Ideas and Emerging Results (NIER) | Pranjal Gupta (IBM Research); Pooja Aggarwal (IBM Research); Brent Paulovicks (IBM Research); Prateeti Mohapatra (IBM Research); Rong Lee (IBM Research); Vadim Sheinin (IBM Research)
14:45 | 15m | Talk | Replacing Training with Reasoning: Reinterpreting Classic ML Pipelines with LLMs | New Ideas and Emerging Results (NIER) | Marco Alecci (University of Luxembourg); Jordan Samhi (University of Luxembourg, Luxembourg); Tegawendé F. Bissyandé (University of Luxembourg); Jacques Klein (University of Luxembourg)
15:00 | 15m | Talk | NB2P: Generating Data Science Pipelines from Computational Notebooks | Research Track | Haotian Gao (National University of Singapore, Singapore and NUSRI Chongqing, China); Quang Trung Ta (National University of Singapore); Tien Tuan Anh Dinh (Deakin University, Australia); Nhut Minh Ho (National University of Singapore); Zhiyong Huang (National University of Singapore); Beng Chin Ooi (National University of Singapore, Singapore)
15:15 | 15m | Talk | Multi-Location Software Model Completion | Research Track |