From Task to Tutorial: An Automated GUI Framework for Excel Tutorial Document and Video Creation
This program is tentative and subject to change.
Excel is one of the most widely used productivity tools across domains, offering rich functionality but also overwhelming users with its complexity. This creates a persistent demand for tutorials to support effective usage. However, while building and maintaining the Microsoft tutorial corpus, we observed that existing tutorials are manually created by experts, need frequent updates with each software release, and involve substantial human labor. Moreover, prior work has not achieved fully automated tutorial generation. In this paper, we present the first framework for automatically generating Excel tutorials directly from natural language task descriptions. Our framework first instantiates the task. Then the Execution Agent, a central component of the framework, plans and executes the solution in Excel and collects the intermediate artifacts required for tutorial construction. These artifacts are then transformed into both structured Excel documents and video demonstrations. To build a comprehensive tutorial corpus, we collected 1,559 task descriptions from real-world scenarios. In addition, we designed a systematic evaluation framework that integrates assessments from both large language models (LLMs) and human reviewers. Experimental results show that our framework improves task execution success rates by 8.5% over state-of-the-art baselines. Moreover, the generated tutorials demonstrate superior readability and instructional effectiveness, often approaching or surpassing expert-authored materials. Importantly, the automated pipeline eliminates manual labor and reduces time costs to 1/20 of expert authoring, making scalable and high-quality tutorial generation practical for the first time.
Wed 8 Jul
Displayed time zone: Eastern Time (US & Canada)
16:00 - 17:00
16:00 | 20m | Talk | Look Before You Leap: Context-Sensitive GUI Grounding for Boosting Automated Extended Reality (XR) Testing | Research Papers | Shuqing Li (The Chinese University of Hong Kong), Binchang Li (Harbin Institute of Technology), Yepang Liu (Southern University of Science and Technology), Cuiyun Gao (Harbin Institute of Technology, Shenzhen), Jianping Zhang (The Chinese University of Hong Kong), Shing-Chi Cheung (Hong Kong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong)
16:20 | 20m | Talk | EfficientUICoder: A Bidirectional Token Compression Framework for Efficient MLLM-based UI Code Generation | Research Papers | Jingyu Xiao (The Chinese University of Hong Kong), Zhongyi Zhang (Huazhong University of Science and Technology, China), Yuxuan Wan (The Chinese University of Hong Kong), Yintong Huo (Singapore Management University, Singapore), Yang Liu (Nanyang Technological University), Michael Lyu (The Chinese University of Hong Kong)
16:40 | 20m | Talk | From Task to Tutorial: An Automated GUI Framework for Excel Tutorial Document and Video Creation | Industry Papers | Yuhang Xie (Peking University), Jian Mu (Nanjing University), Ma Xiaojun (Microsoft), Chaoyun Zhang (Microsoft), Lu Wang (Microsoft Research), Mengyu Zhou (Microsoft), Mugeng Liu (Peking University), Si Qin (Microsoft Research), Qingwei Lin (Microsoft), Saravan Rajmohan (Microsoft), Shi Han (Microsoft Research), Dongmei Zhang (Microsoft)