ICPC 2024
Sun 14 - Sat 20 April 2024 Lisbon, Portugal
co-located with ICSE 2024

Pre-trained Machine Learning (ML) models help to create ML-intensive systems without having to spend considerable resources on training a new model from the ground up. However, the lack of transparency of such models could lead to undesired consequences in terms of bias, fairness, trustworthiness of the underlying data, and potentially even legal implications. Taking as a case study the transformer models hosted by Hugging Face, a popular hub for pre-trained ML models, this paper empirically investigates the transparency of pre-trained transformer models. We look at the extent to which model descriptions (i) specify the datasets used for their pre-training, (ii) discuss their possible training bias, and (iii) declare their license, and at whether projects using such models take these licenses into account. Results indicate that pre-trained models still provide limited disclosure of their training datasets, possible biases, and adopted licenses. We also found several cases of possible licensing violations by client projects. Our findings motivate further research to improve the transparency of ML models, which may result in the definition, generation, and adoption of Artificial Intelligence Bills of Materials.

Tue 16 Apr

Displayed time zone: Lisbon

14:00 - 15:30
New Frontiers - Virtual Reality, Mobile Apps, Smart Contracts, and LLMs
Early Research Achievements (ERA) / Tool Demonstration / Research Track at Sophia de Mello Breyner Andresen
Chair(s): Sonia Haiduc Florida State University
14:00
10m
Talk
The Sword of Damocles: Upgradeable Smart Contract in Ethereum (ICPC Full paper, Virtual-Talk)
Research Track
Yuan Huang School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China, Xiaoyuan Wu Sun Yat-sen University, Quanqi Wang Sun Yat-sen University, Ziang Qian Sun Yat-sen University, Xiangping Chen Sun Yat-sen University, Mingdong Tang Guangdong University of Foreign Studies, Zibin Zheng Sun Yat-sen University
14:10
10m
Talk
Collaborative Software Exploration with Multimedia Note Taking in Virtual Reality (ICPC Full paper)
Research Track
Adrian Hoff IT University of Copenhagen, Mircea Lungu IT University, Copenhagen, Christoph Seidl IT University of Copenhagen, Michele Lanza Software Institute - USI, Lugano
14:20
10m
Talk
No Source Code? No Problem! Demystifying and Detecting Mask Apps in iOS (ICPC Full paper)
Research Track
Yijun Zhao Institute of Information Engineering, Chinese Academy of Sciences, Lingjing Yu Institute of Information Engineering, Chinese Academy of Sciences, Yong Sun Institute of Information Engineering, Chinese Academy of Sciences, Qingyun Liu Institute of Information Engineering, Chinese Academy of Sciences, Bo Luo The University of Kansas
14:30
10m
Talk
How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study (ICPC Full paper)
Research Track
Federica Pepe University of Sannio, Vittoria Nardone University of Molise, Antonio Mastropaolo Università della Svizzera italiana, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Gerardo Canfora University of Sannio, Massimiliano Di Penta University of Sannio, Italy
14:40
8m
Talk
Capturing and Understanding the Drift Between Design, Implementation, and Documentation (ICPC ERA Paper)
Early Research Achievements (ERA)
Joseph Romeo Software Institute - USI, Lugano, Switzerland, Marco Raglianti Software Institute - USI, Lugano, Csaba Nagy Software Institute - USI, Lugano, Michele Lanza Software Institute - USI, Lugano
14:48
8m
Talk
Immersive Software Archaeology: Collaborative Exploration and Note Taking in Virtual Reality (ICPC Tools)
Tool Demonstration
Adrian Hoff IT University of Copenhagen, Mircea F. Lungu University of Groningen, Christoph Seidl IT University of Copenhagen, Michele Lanza Software Institute - USI, Lugano
14:56
34m
Talk
New Frontiers - Virtual Reality, Mobile Apps, Smart Contracts, and LLMs: Panel with Speakers (ICPC)
Discussion