Data Catalog Tools: A Systematic Multivocal Literature Review
This program is tentative and subject to change.
A data catalog enables an organization to maintain an inventory of its data assets by collecting and managing the relevant metadata. We conducted a systematic multivocal literature review on data catalogs to understand their features and usage. We systematically selected and analyzed 86 literature sources and 39 catalog tools. We first used the findings from the literature to develop a classification framework comprising 24 fine-grained and five high-level features, along with three maturity levels. Next, we analyzed the 39 tools against this framework. Organizations typically include a data catalog as a component of their big data platforms and use it to support the various phases of the metadata management lifecycle. Hence, we also mapped the catalog features to the requirements of metadata-driven big data architectures, namely data mesh, data lake, and data lakehouse, and to the phases of the metadata management lifecycle. Our findings should help organizations make informed decisions when choosing data catalog tools and help researchers identify the critical research issues in data cataloging and metadata management.
Fri 20 Mar (displayed time zone: Athens)
11:00 - 12:30 | Session 6A - Tools and Techniques for Effective Software Development | Industrial Track / Journal First Track / Tool Demo Track / Research Track | at Panorama
11:00 | 15m | Talk | How Natural Language Proficiency Shapes GenAI Code for Software Engineering Tasks | Journal First Track | Ruksit Rojpaisarnkit (Nara Institute of Science and Technology), Youmei Fan (Nara Institute of Science and Technology), Kenichi Matsumoto (Nara Institute of Science and Technology), Raula Gaikovina Kula (The University of Osaka)
11:15 | 15m | Talk | Data Catalog Tools: A Systematic Multivocal Literature Review | Journal First Track | Marco Tonnarelli (JADS - TU/e), Indika Kumara (Tilburg University), Stefan Driessen (JADS, Tilburg University), Damian Andrew Tamburri (University of Sannio - JADS/NXP Semiconductors), Willem-Jan van den Heuvel (JADS, Tilburg University), Patrick Oor (NXP Semiconductors)
11:30 | 15m | Talk | On the Practical Adoption of a Static Performance Anti-Pattern Detector: An Industrial Case Study | Industrial Track | Lizhi Liao (University of Guelph), Weiyi Shang (University of Waterloo), Catalin Sporea (ERA Environmental Management Solutions), Andrei Toma (ERA Environmental Management Solutions), Sarah Sajedi (ERA Environmental Management Solutions)
11:45 | 15m | Talk | Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases | Industrial Track | Indira Vats (University of Toronto; Advanced Micro Devices (AMD)), Sanjukta De (Advanced Micro Devices), Subhayan Roy, Saurabh Bodhe, Lejin Varghese, Max Kiehn, Yonas Bedasso (Advanced Micro Devices), Marsha Chechik (University of Toronto)
12:00 | 15m | Talk | Diagram-Aware Automatic Review of Software Design Documents Using Multimodal Large Language Models | Industrial Track
12:15 | 7m | Talk | Source Code-Driven GDPR Documentation: Supporting RoPA with Assessor View | Tool Demo Track | Mugdha Khedkar (Heinz Nixdorf Institute, Paderborn University), Michael Schlichtig (Heinz Nixdorf Institut, Paderborn University), Eric Bodden (Heinz Nixdorf Institute at Paderborn University & Fraunhofer IEM)
12:22 | 7m | Talk | RefineID: A Developer-Centric IDE Assistant for Better Identifiers | Tool Demo Track