EASE 2023
Tue 13 - Fri 16 June 2023 Oulu, Finland
Wed 14 Jun 2023 10:30 - 10:50 at Aurora Hall - AI and Software Engineering Chair(s): Valentina Lenarduzzi

Data quality assessment has become a prominent component in the successful execution of complex data-driven artificial intelligence (AI) software systems. In practice, real-world applications generate huge volumes of data at high speed. These data streams require analysis and preprocessing before being permanently stored or used in a learning task. Therefore, significant attention has been paid to the systematic management and construction of high-quality datasets. Nevertheless, managing high-volume, high-velocity data streams is usually performed manually (i.e., offline), making it an impractical strategy in production environments. To address this challenge, DataOps has emerged to achieve life-cycle automation of data processes using DevOps principles. However, determining data quality on a fitness scale constitutes a complex task within the framework of DataOps. This paper presents a novel Data Quality Scoring Operations (DQSOps) framework that yields a quality score for production data in DataOps workflows. The framework incorporates two scoring approaches: an ML prediction-based approach that predicts the data quality score, and a standard-based approach that periodically produces the ground-truth scores by assessing several data quality dimensions. We deploy the DQSOps framework in a real-world industrial use case. The results show that DQSOps achieves significant computational speedups compared to the conventional approach of data quality scoring while maintaining high prediction performance.
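The interplay of the two scoring approaches described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the quality dimensions, so completeness and uniqueness are assumed here purely for illustration, the `predict` callable is a hypothetical stand-in for the ML model, and `refresh_every` is an assumed parameter controlling how often the standard-based score is recomputed.

```python
from statistics import mean


def completeness(rows):
    """Fraction of cells that are not None (assumed dimension)."""
    cells = [value for row in rows for value in row.values()]
    return sum(v is not None for v in cells) / len(cells) if cells else 1.0


def uniqueness(rows):
    """Fraction of rows that are distinct (assumed dimension)."""
    if not rows:
        return 1.0
    distinct = {tuple(sorted(row.items())) for row in rows}
    return len(distinct) / len(rows)


def standard_score(rows):
    """Standard-based score: unweighted mean of the assumed dimensions."""
    return mean([completeness(rows), uniqueness(rows)])


def score_stream(windows, predict, refresh_every=3):
    """Score each data window in turn.

    Every `refresh_every`-th window gets the (more expensive) standard-based
    score; the windows in between use the hypothetical `predict` callable.
    """
    scores = []
    for i, rows in enumerate(windows):
        if i % refresh_every == 0:
            scores.append(("standard", standard_score(rows)))
        else:
            scores.append(("predicted", predict(rows)))
    return scores


if __name__ == "__main__":
    window = [{"a": 1, "b": None}, {"a": 1, "b": None}, {"a": 2, "b": 3}]
    # A constant predictor stands in for a trained model.
    for kind, score in score_stream([window] * 4, predict=lambda rows: 0.5):
        print(kind, round(score, 3))
```

In a real deployment the periodically computed standard-based scores would presumably also serve as training or validation targets for the predictor; that feedback loop is omitted here.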

Presentation (EASE_DQSOps_2023.pdf), 6.37 MiB

Wed 14 Jun

Displayed time zone: Athens

10:30 - 12:00
10:30
20m
Paper
DQSOps: Data Quality Scoring Operations Framework for Data-Driven Applications
Research (Full Papers)
Firas Bayram Karlstad University, Bestoun S. Ahmed Karlstad University, Erik Hallin Uddeholms AB, Sweden, Anton Engman Uddeholms AB, Sweden
10:50
10m
Paper
PAFL: Probabilistic Automaton-based Fault Localization for Recurrent Neural Networks
Journal First
Yuta Ishimoto Kyushu University, Masanari Kondo Kyushu University, Naoyasu Ubayashi Kyushu University, Yasutaka Kamei Kyushu University
11:00
20m
Paper
Implementing AI Ethics: Making Sense of the Ethical Requirements
Research (Full Papers)
Mamia Agbese University of Jyväskylä, Jyväskylä, Finland, Pekka Abrahamsson University of Jyväskylä, Rahul Mohanani University of Jyväskylä, Arif Ali Khan
11:20
10m
Short-paper
Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells
Short Papers and Posters
Anh Ho Hanoi University of Science and Technology, Anh M. T. Bui Hanoi University of Science and Technology, Phuong T. Nguyen University of L'Aquila, Amleto Di Salle European University of Rome
11:30
20m
Paper
Classification-based Static Collection Selection for Java: Effectiveness and Adaptability
Research (Full Papers)
Noric Couderc Lund University, Christoph Reichenbach Lund University, Emma Söderberg Lund University
11:50
10m
Paper
Too long; didn't read: Automatic summarization of GitHub README.MD with Transformers
Vision and Emerging Results
Thu T. H. Doan VNU University of Engineering and Technology, Phuong T. Nguyen University of L'Aquila, Juri Di Rocco University of L'Aquila, Davide Di Ruscio University of L'Aquila