Towards Reliable LLM-based Exam Generation: Lessons Learned and Open Challenges in an Industrial Project
Large Language Models (LLMs) have revolutionized the way natural language tasks are handled and offer significant potential for applications in education. LLMs can save educators time and effort, for instance in content creation and exam generation. Although promising, the integration of LLMs into educational products brings risks that companies must mitigate. In the context of an industrial project, we investigate the effectiveness of LLMs in generating educational multiple-choice questions. Our experiments cover 16 commercial and open-source LLMs, rely on standard metrics to assess the accuracy (F1 and BLEU) and linguistic quality (perplexity and diversity) of the generated questions, and compare the results with five specialized, fine-tuned models. The results suggest that recent LLMs can outperform the fine-tuned models for question generation, that open-source LLMs are highly competitive with commercial ones, with Meta Llama models performing best, and that DeepSeek performs on par with recent GPT-4 models. This promising empirical evidence encourages us to focus on advanced prompting strategies, for which we report relevant open challenges that we aim to address in the short term.
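As a point of reference for the linguistic-quality metric named above, perplexity is commonly computed as the exponentiated average negative log-likelihood of a generated question under a scoring language model; this is the standard formulation and is given here only as a sketch, not necessarily the exact variant used in the experiments:

\[
\mathrm{PPL}(q) \;=\; \exp\!\left(-\frac{1}{n}\sum_{i=1}^{n}\log p_{\theta}\!\left(w_i \mid w_{<i}\right)\right),
\qquad q = w_1 w_2 \dots w_n ,
\]

where \(p_{\theta}\) denotes the scoring model and \(w_1,\dots,w_n\) the tokens of the generated question; lower values indicate more fluent text.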