SEAMS 2026
Mon 13 - Tue 14 April 2026 Rio de Janeiro, Brazil
co-located with ICSE 2026

Reward design has been one of the central challenges for real-world reinforcement learning (RL) deployment, especially in settings with multiple objectives. Preference-based RL offers an appealing alternative by learning from human preferences over pairs of behavioural outcomes. More recently, RL from AI feedback (RLAIF) has demonstrated that large language models (LLMs) can generate preference labels at scale, reducing the reliance on human annotators. However, existing RLAIF work typically focuses only on single-objective tasks, leaving open the question of how RLAIF handles systems that involve multiple objectives. In such systems, trade-offs among conflicting objectives are difficult to specify, and policies risk collapsing into optimizing for a single dominant goal. In this paper, we explore the extension of the RLAIF paradigm to multi-objective self-adaptive systems. We show that multi-objective RLAIF can produce policies that yield balanced trade-offs reflecting different user priorities without laborious reward engineering. We argue that integrating RLAIF into multi-objective RL offers a scalable path toward user-aligned policy learning in domains with inherently conflicting objectives.

Mon 13 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

11:00 - 12:30
Learning-Based, Causality-Aware & Sustainable Adaptation
Research Track / Artifact Track / SEAMS Program at Oceania II
Chair(s): Sona Ghahremani Hasso Plattner Institute, University of Potsdam
11:00
15m
Talk
Ripple: A Long-Sighted Self-Adaptation Approach to Retrain Machine Learning-Enabled Systems (Best Student Paper Award, Full Paper)
Research Track
Maria Casimiro INESC-ID, IST, University of Lisbon & S3D, Carnegie Mellon University, Valentim Romão INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Paolo Romano University of Lisbon, Portugal, Luis Rodrigues INESC-ID, IST, ULisboa, David Garlan Carnegie Mellon University
11:15
10m
Talk
Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback (Short Paper)
Research Track
Chenyang Zhao Trinity College Dublin, Vinny Cahill Trinity College Dublin, Ivana Dusparic Trinity College Dublin, Ireland
11:25
15m
Talk
MAPER: Extending MAPE-K with LLM-Based Reasoning to Manage Unanticipated Situations in Self-Adaptive Systems (Full Paper)
Research Track
Paulo Maia State University of Ceará, Lucas Vieira State University of Ceará, Gabriel Luiz Barros De Oliveira State University of Ceará - UECE, Matheus Chagas State University of Ceará, Alan Bandeira State University of Ceará - UECE, Cleilton Rocha Atlantico Institute
11:40
10m
Talk
Robust Exploration in Directed Controller Synthesis via Mixture-of-Experts Reinforcement Learning (Extended Abstract)
Research Track
Toshihide Uubukata Waseda University, Mingyue Zhang Southwest University, Zhiyao Wang The University of Osaka, NIANYU LI ZGC Lab, China, Jialong Li Waseda University, Japan, Kenji Tei Institute of Science Tokyo
11:50
15m
Talk
RAMNA: A Resource-Aware Algorithm for Maximizing Availability in Flying Ad-Hoc Networks (Full Paper)
Research Track
Miguel Catarro Universidade de Lisboa, Luis Pinto Universidade de Lisboa, Alan Oliveira Universidade de Lisboa
12:05
10m
Talk
Harmonica: A Self-Adaptation Exemplar for Sustainable MLOps (Artifact)
Artifact Track
Ananya Vishal Halgatti IIIT-Hyderabad, Shaunak Biswas IIIT Hyderabad, Hiya Bhatt IIIT Hyderabad, Srinivasan Rakhunathan Microsoft, India, Karthik Vaidhyanathan IIIT Hyderabad
12:15
15m
Talk
CRAFTER: Causality-based Self-Adaptation for Autonomous IoT Systems (Full Paper, Virtual Attendance)
Research Track
Houssam Hajj Hassan Orange Innovation, Ajay Kattepur, Denis Conan SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Georgios Bouloukakis Department of Electrical and Computer Engineering, University of Patras, Greece