ASE 2025
Sun 16 - Thu 20 November 2025 Seoul, South Korea

Ensuring SRS completeness is critical to preventing costly downstream errors and rework. We introduce an automated tool that ensembles three complementary LLMs—DeepSeek Chat, GPT-4o Mini, and Claude Sonnet 4—to detect missing requirements and suggest remedies. The tool first generates a structured domain model from the SRS, then runs parallel external and internal completeness analyses using carefully crafted prompts. Users select which LLMs to invoke and choose among majority voting, weighted voting, or a Meta-LLM fusion to aggregate outputs. In experiments on four SRSs with seeded omissions, single models achieved only 0–52% recall, while our full ensemble consistently exceeded 75% (up to 100%) recall with 95–100% suggestion plausibility. These early findings highlight the potential of multi-LLM ensembles to dramatically outperform individual models and to support next-generation requirements analysis tools through effective human-in-the-loop refinement.
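The majority- and weighted-voting aggregation strategies mentioned above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the trust weights, and the representation of each model's findings as a set of strings are all assumptions made here for clarity.

```python
from collections import defaultdict

def majority_vote(model_outputs):
    """Keep findings flagged by a strict majority of the selected models."""
    counts = defaultdict(int)
    for findings in model_outputs.values():
        for finding in findings:
            counts[finding] += 1
    threshold = len(model_outputs) / 2
    return {f for f, c in counts.items() if c > threshold}

def weighted_vote(model_outputs, weights, cutoff=0.5):
    """Keep findings whose total normalized model weight exceeds the cutoff."""
    total = sum(weights.values())
    scores = defaultdict(float)
    for model, findings in model_outputs.items():
        for finding in findings:
            scores[finding] += weights[model] / total
    return {f for f, s in scores.items() if s > cutoff}

# Hypothetical example: each LLM proposes a set of suspected missing requirements.
outputs = {
    "deepseek-chat":   {"R1: missing error handling", "R2: no timeout specified"},
    "gpt-4o-mini":     {"R1: missing error handling"},
    "claude-sonnet-4": {"R1: missing error handling", "R2: no timeout specified"},
}
print(majority_vote(outputs))
print(weighted_vote(outputs, {"deepseek-chat": 0.5, "gpt-4o-mini": 0.3, "claude-sonnet-4": 0.2}))
```

In this toy example both findings survive: R1 is flagged by all three models, and R2 by two of three, passing both the majority threshold and a 0.5 weighted cutoff. The Meta-LLM fusion variant would instead pass all three raw outputs to a fourth model for reconciliation.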