UI automation is a useful technique for UI testing, bug reproduction, and robotic process automation. Recording a user's actions with an application supports rapid development of UI automation scripts, but existing recording techniques are intrusive, rely on OS or GUI framework accessibility support, or assume specific app implementations. Reverse-engineering user actions from screencasts is non-intrusive, but a key reverse-engineering step is currently missing: recognizing human-understandable structured user actions ([command] [widget] [location]) from action screencasts. To fill this gap, we propose a deep-learning-based computer vision model that recognizes 11 commands and 11 widgets and generates location phrases from action screencasts through joint learning and multi-task learning. We label a large dataset of 7260 video-action pairs recording user interactions with Word, Zoom, Firefox, Photoshop, and Windows 10 Settings. Through extensive experiments, we confirm the effectiveness and generality of our model, and we demonstrate the usefulness of a screencast-to-action-script tool built upon our model for bug reproduction.
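To make the structured output concrete, below is a minimal sketch (not the paper's actual implementation) of a multi-task model: a shared video encoder feeds a command classification head, a widget classification head, and a location-phrase decoder, trained jointly. The encoder architecture, hidden sizes, vocabulary size, and teacher-forcing decoder are illustrative assumptions.

```python
# Sketch only: shared encoder + three task heads (command, widget, location phrase).
import torch
import torch.nn as nn


class ActionRecognizer(nn.Module):
    def __init__(self, num_commands=11, num_widgets=11, vocab_size=1000,
                 feat_dim=512, hidden_dim=256):
        super().__init__()
        # Shared spatio-temporal encoder: per-frame CNN features pooled over time.
        # A real system would likely use a pretrained video backbone instead.
        self.frame_cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )
        self.temporal = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Task heads share the encoder state (multi-task learning).
        self.command_head = nn.Linear(hidden_dim, num_commands)
        self.widget_head = nn.Linear(hidden_dim, num_widgets)
        # Location phrases are generated token by token from the shared state.
        self.token_embed = nn.Embedding(vocab_size, hidden_dim)
        self.decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.vocab_out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, frames, phrase_tokens):
        # frames: (B, T, 3, H, W); phrase_tokens: (B, L) teacher-forcing inputs
        b, t = frames.shape[:2]
        feats = self.frame_cnn(frames.flatten(0, 1)).view(b, t, -1)
        _, h = self.temporal(feats)                 # h: (1, B, hidden_dim)
        shared = h.squeeze(0)
        cmd_logits = self.command_head(shared)
        wid_logits = self.widget_head(shared)
        dec_out, _ = self.decoder(self.token_embed(phrase_tokens), h)
        phrase_logits = self.vocab_out(dec_out)     # (B, L, vocab_size)
        return cmd_logits, wid_logits, phrase_logits


# Joint loss: sum of the two classification losses and the phrase-generation loss.
def joint_loss(cmd_logits, wid_logits, phrase_logits, cmd_y, wid_y, phrase_y):
    ce = nn.CrossEntropyLoss()
    return (ce(cmd_logits, cmd_y) + ce(wid_logits, wid_y)
            + ce(phrase_logits.flatten(0, 1), phrase_y.flatten()))
```

The point of the sketch is the joint training signal: all three heads back-propagate through the same encoder, so command, widget, and location predictions regularize one another.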