Documenting Ethical Considerations in Open Source AI Models
Background: The development of AI-enabled software heavily depends on AI model documentation, such as model cards, due to different domain expertise between software engineers and model developers. From an ethical standpoint, AI model documentation conveys critical information on ethical considerations along with mitigation strategies for downstream developers to ensure the delivery of ethically compliant software. However, knowledge on such documentation practice remains scarce. Aims: The objective of our study is to investigate how developers document ethical aspects of open source AI models in practice, aiming at providing recommendations for future documentation endeavours. Method: We selected three sources of documentation on GitHub and Hugging Face, and developed a keyword set to identify ethics-related documents systematically. After filtering an initial set of 2,347 documents, we identified 265 relevant ones and performed thematic analysis to derive the themes of ethical considerations. Results: Six themes emerge, with the three largest ones being model behavioural risks, model use cases, and model risk mitigation. Conclusions: Our findings reveal that open source AI model documentation focuses on articulating ethical problem statements and use case restrictions. We further provide suggestions to various stakeholders for improving documentation practice regarding ethical considerations.
Thu 24 Oct. Displayed time zone: Brussels, Copenhagen, Madrid, Paris
11:00 - 12:35 | Open source software and repository mining | ESEM Technical Papers / ESEM Emerging Results, Vision and Reflection Papers Track, at Multimedia (B3 Building - Hall) | Chair(s): Davide Taibi (University of Oulu)
11:00 (20m, Full-paper) | Sustaining Maintenance Labor for Healthy Open Source Software Projects through Human Infrastructure: A Maintainer Perspective | ESEM Technical Papers | Johan Linåker (RISE Research Institutes of Sweden), Georg Link (Bitergia), Kevin Lumbard (Creighton University)
11:20 (20m, Full-paper) | Documenting Ethical Considerations in Open Source AI Models | ESEM Technical Papers | Haoyu Gao (The University of Melbourne), Mansooreh Zahedi (The University of Melbourne), Christoph Treude (Singapore Management University), Sarita Rosenstock (The University of Melbourne), Marc Cheong (The University of Melbourne) | Pre-print
11:40 (20m, Full-paper) | An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software | ESEM Technical Papers | Lucas Franke (Virginia Tech), Huayu Liang (Virginia Tech), Sahar Farzanehpour (Virginia Tech), Aaron Brantly (Virginia Tech), James C. Davis (Purdue University), Chris Brown (Virginia Tech) | Pre-print
12:00 (20m, Full-paper) | An Empirical Study of API Misuses of Data-Centric Libraries | ESEM Technical Papers | Akalanka Galappaththi (University of Alberta), Sarah Nadi (New York University Abu Dhabi, University of Alberta), Christoph Treude (Singapore Management University) | Pre-print
12:20 (15m, Vision and Emerging Results) | Automatic Categorization of GitHub Actions with Transformers and Few-shot Learning | ESEM Emerging Results, Vision and Reflection Papers Track | Phuong T. Nguyen (University of L'Aquila), Juri Di Rocco (University of L'Aquila), Claudio Di Sipio (University of L'Aquila), Mudita Shakya (University of L'Aquila), Davide Di Ruscio (University of L'Aquila), Massimiliano Di Penta (University of Sannio, Italy) | Pre-print