iKnow: an Intent-Guided Chatbot for Cloud Operations with Retrieval-Augmented Generation (ASE 2025 - Research Papers)

Who

Junjie Huang, Yuedong Zhong, Guangba Yu, Zhihan Jiang, Minzhi Yan, Wenfei Luan, Tianyu Yang, Rui Ren, Michael Lyu

Track

ASE 2025 Research Papers

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 18 Nov 2025 11:30 - 11:40 at Vista - SE4AI & AI4SE 2

Abstract

Managing complex cloud services requires standard operational documentation, but its sheer volume often hinders cloud engineers from efficient knowledge acquisition. Retrieval-Augmented Generation (RAG) can streamline this process by retrieving relevant knowledge and generating concise, referenced answers. However, deploying a reliable RAG-based chatbot for cloud operation remains a challenge. In this experience paper, we analyze the development and deployment of RAG-based chatbots for operational question answering (OpsQA) at a large-scale cloud vendor. Through an empirical study of 2,000 real-world queries across three operational teams, we identify five unique OpsQA intent types (e.g., symptom analysis and terminology explanation) and their corresponding requirements for a satisfactory answer, which differ from general software engineering queries. Our analysis further uncovers six root causes leading to chatbot failures—over half stem from query issues (i.e., incompleteness, out-of-scope, or invalid queries), while others are from retrieval or generation issues. To address these issues, we propose iKnow, an intent-guided RAG-based chatbot that integrates intent detection, query rewriting tailored to each intent, and missing knowledge detection to enhance answer quality. In internal evaluations, iKnow improves average answer accuracy from 65.8% to 81.3% with only a modest increase in latency. iKnow has been deployed for six months at CloudA, supporting thousands of cloud engineers in daily operations. We discuss lessons learned from real-world deployment, providing valuable insights for future research and practical implementations in similar domains.
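The pipeline named in the abstract (intent detection, intent-specific query rewriting, and missing knowledge detection) can be illustrated with a minimal sketch. This is not iKnow's implementation: the keyword rules, the two-document knowledge base, the "other" fallback label, and every function name below are hypothetical stand-ins chosen only to show how the three stages might fit together. The abstract names only two of the five intent types, so only those two appear here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels: the abstract names only these two of the five intents.
SYMPTOM_ANALYSIS = "symptom_analysis"
TERMINOLOGY_EXPLANATION = "terminology_explanation"
OTHER = "other"  # assumed fallback, not from the paper

# Toy knowledge base standing in for operational documentation (invented text).
DOCS = {
    "doc-1": "Error E1001 means the storage gateway timed out; restart the gateway agent.",
    "doc-2": "A VPC is a virtual private cloud, an isolated network for cloud resources.",
}


@dataclass
class Answer:
    intent: str
    source: Optional[str]  # document id the answer references, if any
    text: str


def detect_intent(query: str) -> str:
    """Keyword rules as a placeholder for a real intent classifier."""
    q = query.lower()
    if q.startswith("what is") or "meaning of" in q:
        return TERMINOLOGY_EXPLANATION
    if "error" in q or "fail" in q or "timeout" in q:
        return SYMPTOM_ANALYSIS
    return OTHER


def rewrite_query(query: str, intent: str) -> str:
    """Append intent-specific terms so retrieval favors the right kind of document."""
    if intent == SYMPTOM_ANALYSIS:
        return f"{query} cause fix"
    if intent == TERMINOLOGY_EXPLANATION:
        return f"{query} definition"
    return query


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, min_overlap: int = 2) -> List[str]:
    """Rank documents by shared tokens; a crude substitute for a real retriever."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(text)), doc_id) for doc_id, text in DOCS.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= min_overlap]


def answer(query: str) -> Answer:
    intent = detect_intent(query)
    hits = retrieve(rewrite_query(query, intent))
    if not hits:
        # Missing knowledge detection: decline instead of generating an unsupported answer.
        return Answer(intent, None, "No matching documentation found for this query.")
    return Answer(intent, hits[0], DOCS[hits[0]])


if __name__ == "__main__":
    for q in ("What is a VPC?", "How do I rotate billing certificates?"):
        print(answer(q))
```

In this sketch the final stage returns an explicit "no documentation" result when retrieval comes back empty, which is one simple way to surface out-of-scope queries rather than answer them.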

Junjie Huang

The Chinese University of Hong Kong

Hong Kong SAR China

Yuedong Zhong

Sun Yat-sen University

Guangba Yu

The Chinese University of Hong Kong

Hong Kong SAR China

Zhihan Jiang

The Chinese University of Hong Kong

Minzhi Yan

HCC Lab, Huawei Cloud Computing Technology Co., Ltd

Wenfei Luan

HCC Lab, Huawei Cloud Computing Technology Co., Ltd

Tianyu Yang

HCC Lab, Huawei Cloud Computing Technology Co., Ltd

Rui Ren

Computing and Networking Innovation Lab, Huawei Cloud Computing Technology Co., Ltd

Michael Lyu

The Chinese University of Hong Kong

Session Program

Tue 18 Nov
Displayed time zone: Seoul

11:00 - 12:30	SE4AI & AI4SE 2, Research Papers, at Vista

11:00 10m Talk		Learning Project-wise Subsequent Code Edits via Interleaving Neural-based Induction and Tool-based Deduction Research Papers Chenyan Liu Shanghai Jiao Tong University; National University of Singapore, Yun Lin Shanghai Jiao Tong University, Yuhuan Huang Shanghai Jiao Tong University, Jiaxin Chang Shanghai Jiao Tong University, Binhang Qi National University of Singapore, Bo Jiang Bytedance Network Technology, Zhiyong Huang National University of Singapore, Jin Song Dong National University of Singapore
11:10 10m Talk		Coding-Fuse: Efficient Fusion of Code Pre‑Trained Models for Classification Tasks Research Papers Yu Zhao, Lina Gong Nanjing University of Aeronautics and Astronautics, Zhiqiu Huang Nanjing University of Aeronautics and Astronautics, Yuchen Jin Nanjing University of Aeronautics and Astronautics, Mingqiang Wei Nanjing University of Aeronautics and Astronautics
11:20 10m Talk		SE-Jury: An LLM-as-Ensemble-Judge Metric for Narrowing the Gap with Human Evaluation in SE Research Papers Xin Zhou Singapore Management University, Singapore, Kisub Kim DGIST, Ting Zhang Monash University, Martin Weyssow Singapore Management University, Luis F. Gomes Carnegie Mellon University, Guang Yang, Kui Liu Huawei, Xin Xia Zhejiang University, David Lo Singapore Management University
11:30 10m Talk		iKnow: an Intent-Guided Chatbot for Cloud Operations with Retrieval-Augmented Generation Research Papers Junjie Huang The Chinese University of Hong Kong, Yuedong Zhong Sun Yat-sen University, Guangba Yu The Chinese University of Hong Kong, Zhihan Jiang The Chinese University of Hong Kong, Minzhi Yan HCC Lab, Huawei Cloud Computing Technology Co., Ltd, Wenfei Luan HCC Lab, Huawei Cloud Computing Technology Co., Ltd, Tianyu Yang HCC Lab, Huawei Cloud Computing Technology Co., Ltd, Rui Ren Computing and Networking Innovation Lab, Huawei Cloud Computing Technology Co., Ltd, Michael Lyu The Chinese University of Hong Kong
11:40 10m Talk		Aligning LLMs to Fully Utilize the Cross-file Context in Repository-level Code Completion Research Papers Jia Li Tsinghua University, Hao Zhu Peking University, Huanyu Liu, Xianjie Shi Peking University, He Zong aiXcoder, Yihong Dong Peking University, Kechi Zhang Peking University, China, Siyuan Jiang, Zhi Jin Peking University, Ge Li Peking University
11:50 10m Talk		From Sparse to Structured: A Diffusion-Enhanced and Feature-Aligned Framework for Coincidental Correctness Detection Research Papers Huan Xie Chongqing University, Chunyan Liu Chongqing University, Yan Lei Chongqing University, Zhenyu Wu School of Big Data & Software Engineering, Chongqing University, Jinping Wang Chongqing University
12:00 10m Talk		Watson: A Cognitive Observability Framework for the Reasoning of LLM-Powered Agents Research Papers Benjamin Rombaut Centre for Software Excellence, Huawei Canada, Sogol Masoumzadeh McGill University, Kirill Vasilevski Huawei Canada, Dayi Lin Centre for Software Excellence, Huawei Canada, Ahmed E. Hassan Queen’s University
12:10 10m Talk		Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories Research Papers Islem BOUZENIA University of Stuttgart, Michael Pradel CISPA Helmholtz Center for Information Security
12:20 10m Talk		Triangle: Empowering Incident Triage with Multi-Agent Research Papers Zhaoyang Yu Tsinghua University, Aoyang Fang Chinese University of Hong Kong, Shenzhen, Minghua Ma Microsoft, Jaskaran Singh Walia Microsoft, Chaoyun Zhang Microsoft, Shu Chi Tsinghua University, Ze Li Microsoft Azure, Murali Chintalapati Microsoft Azure, Xuchao Zhang Microsoft, Rujia Wang Microsoft, Chetan Bansal Microsoft Research, Saravan Rajmohan Microsoft, Qingwei Lin Microsoft, Shenglin Zhang Nankai University, Dan Pei Tsinghua University, Pinjia He Chinese University of Hong Kong, Shenzhen