Documenting Ethical Considerations in Open Source AI Models (ESEIW 2024 - ESEM Technical Papers Track)

Who

Haoyu Gao, Mansooreh Zahedi, Christoph Treude, Sarita Rosenstock, Marc Cheong

Track

ESEIW 2024 ESEM Technical Papers

Time Zone

The program is currently displayed in (GMT+02:00) Brussels, Copenhagen, Madrid, Paris.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 24 Oct 2024 11:20 - 11:40 at Multimedia (B3 Building - Hall) - Open source software and repository mining Chair(s): Davide Taibi

Abstract

Background: The development of AI-enabled software heavily depends on AI model documentation, such as model cards, due to different domain expertise between software engineers and model developers. From an ethical standpoint, AI model documentation conveys critical information on ethical considerations along with mitigation strategies for downstream developers to ensure the delivery of ethically compliant software. However, knowledge on such documentation practice remains scarce.

Aims: The objective of our study is to investigate how developers document ethical aspects of open source AI models in practice, aiming at providing recommendations for future documentation endeavours.

Method: We selected three sources of documentation on GitHub and Hugging Face, and developed a keyword set to identify ethics-related documents systematically. After filtering an initial set of 2,347 documents, we identified 265 relevant ones and performed thematic analysis to derive the themes of ethical considerations.

Results: Six themes emerge, with the three largest ones being model behavioural risks, model use cases, and model risk mitigation.

Conclusions: Our findings reveal that open source AI model documentation focuses on articulating ethical problem statements and use case restrictions. We further provide suggestions to various stakeholders for improving documentation practice regarding ethical considerations.
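The keyword-based identification step mentioned under Method can be illustrated with a minimal sketch. The keyword list, file names, and function names below are hypothetical, since the abstract does not list the study's actual keywords or describe how the authors implemented the filtering; this only shows the general idea of flagging candidate documents for later manual review.

```python
import re

# Hypothetical keyword set; the study's actual keywords are not given in the abstract.
ETHICS_KEYWORDS = ["ethic", "bias", "fairness", "misuse", "harm"]

# Match a keyword at the start of a word, so "ethic" also catches "ethical" and "ethics".
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, ETHICS_KEYWORDS)) + r")",
    re.IGNORECASE,
)


def is_candidate(text: str) -> bool:
    """Return True if the text contains at least one keyword."""
    return _PATTERN.search(text) is not None


def filter_candidates(documents: dict[str, str]) -> list[str]:
    """Return the names of documents flagged by the keyword match, in input order."""
    return [name for name, text in documents.items() if is_candidate(text)]


# Made-up example documents, for illustration only.
docs = {
    "model_card_a.md": "Ethical considerations: the model may reproduce social bias.",
    "readme_b.md": "Install with pip and run the training script.",
    "model_card_c.md": "Out-of-scope use: do not deploy where misuse could cause harm.",
}

print(filter_candidates(docs))  # ['model_card_a.md', 'model_card_c.md']
```

A keyword match of this kind only produces candidates; judging whether a flagged document is actually relevant still requires reading it.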

Link to Preprint

https://arxiv.org/pdf/2406.18071

Haoyu Gao

The University of Melbourne

Australia

Mansooreh Zahedi

The University of Melbourne

Australia

Christoph Treude

Singapore Management University

Singapore

Sarita Rosenstock

The University of Melbourne

Australia

Marc Cheong

The University of Melbourne

Australia

Session Program

Thu 24 Oct
Displayed time zone: Brussels, Copenhagen, Madrid, Paris

11:00 - 12:35	Open source software and repository mining. ESEM Technical Papers / ESEM Emerging Results, Vision and Reflection Papers Track, at Multimedia (B3 Building - Hall). Chair(s): Davide Taibi (University of Oulu)

11:00 20m Full-paper	Sustaining Maintenance Labor for Healthy Open Source Software Projects through Human Infrastructure: A Maintainer Perspective. ESEM Technical Papers. Johan Linåker (RISE Research Institutes of Sweden), Georg Link (Bitergia), Kevin Lumbard (Creighton University)
11:20 20m Full-paper	Documenting Ethical Considerations in Open Source AI Models. ESEM Technical Papers. Haoyu Gao (The University of Melbourne), Mansooreh Zahedi (The University of Melbourne), Christoph Treude (Singapore Management University), Sarita Rosenstock (The University of Melbourne), Marc Cheong (The University of Melbourne)
11:40 20m Full-paper	An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software. ESEM Technical Papers. Lucas Franke (Virginia Tech), Huayu Liang (Virginia Tech), Sahar Farzanehpour (Virginia Tech), Aaron Brantly (Virginia Tech), James C. Davis (Purdue University), Chris Brown (Virginia Tech)
12:00 20m Full-paper	An Empirical Study of API Misuses of Data-Centric Libraries. ESEM Technical Papers. Akalanka Galappaththi (University of Alberta), Sarah Nadi (New York University Abu Dhabi; University of Alberta), Christoph Treude (Singapore Management University)
12:20 15m Vision and Emerging Results	Automatic Categorization of GitHub Actions with Transformers and Few-shot Learning. ESEM Emerging Results, Vision and Reflection Papers Track. Phuong T. Nguyen (University of L'Aquila), Juri Di Rocco (University of L'Aquila), Claudio Di Sipio (University of L'Aquila), Mudita Shakya (University of L'Aquila), Davide Di Ruscio (University of L'Aquila), Massimiliano Di Penta (University of Sannio, Italy)