ProML: A Decentralised Architecture for Provenance Management of Machine Learning Software Systems (ECSA 2022 - Research Papers)

Who

Nguyen Khoi Tran, Bushra Sabir, Muhammad Ali Babar, Nini Cui, Mehran Abolhasan, Justin Lipman

Track

ECSA 2022 Research Papers

When

Fri 23 Sep 2022 09:00 - 09:05 at S4 - Architecting for data-driven systems Chair(s): Jan Carlson, Anne Koziolek

Abstract

Large-scale Machine Learning (ML) based Software Systems are increasingly developed by distributed teams situated in different trust domains. Insider threats can launch attacks from any domain to compromise ML assets (models and datasets). Therefore, practitioners require information about how and by whom ML assets were developed to assess their quality attributes such as security, safety, and fairness. Unfortunately, it is challenging for ML teams to access and reconstruct such historical information about ML assets (ML provenance) because it is generally fragmented across distributed ML teams and threatened by the same adversaries that attack ML assets. This paper proposes ProML, a decentralised platform that leverages blockchain and smart contracts to empower distributed ML teams to jointly manage a single source of truth about circulated ML assets’ provenance without relying on a third party, which is vulnerable to insider threats and presents a single point of failure. We propose a novel architectural approach called Artefact-as-a-State-Machine to leverage blockchain transactions and smart contracts for managing ML provenance information. We also introduce a user-driven provenance capturing mechanism to integrate existing scripts and tools into ProML without compromising participants’ control over their assets and toolchains. We evaluate the performance and overheads of ProML by benchmarking a proof-of-concept system on a global blockchain. Furthermore, we assess ProML’s security against a threat model of a distributed ML workflow.
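
The abstract names the Artefact-as-a-State-Machine approach but does not describe its design. The following is only an illustrative sketch of the general idea, not the authors' implementation: an ML artefact is modelled as a state machine, and every state transition is appended to a hash-chained log that stands in for blockchain transactions. The state names, the actions, the `Artefact` class, and the `verify` helper are all hypothetical and chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical lifecycle states and allowed transitions (not taken from the paper).
TRANSITIONS = {
    ("registered", "train"): "trained",
    ("trained", "evaluate"): "evaluated",
    ("evaluated", "publish"): "published",
}

GENESIS = "0" * 64  # placeholder hash preceding the first record


def _digest(record: dict) -> str:
    """SHA-256 over every field of a record except its own hash."""
    body = {k: v for k, v in record.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class Artefact:
    artefact_id: str
    state: str = "registered"
    log: list = field(default_factory=list)  # hash-chained provenance records

    def apply(self, action: str, actor: str) -> dict:
        """Perform a transition and append a provenance record for it."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"action {action!r} not allowed in state {self.state!r}")
        record = {
            "artefact": self.artefact_id,
            "actor": actor,
            "action": action,
            "from": self.state,
            "to": TRANSITIONS[key],
            "prev": self.log[-1]["hash"] if self.log else GENESIS,
        }
        record["hash"] = _digest(record)
        self.log.append(record)
        self.state = record["to"]
        return record


def verify(log: list) -> bool:
    """Check that each record is unmodified and links to its predecessor."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev or record["hash"] != _digest(record):
            return False
        prev = record["hash"]
    return True
```

In this sketch, calling `apply("train", "alice")` on a newly registered artefact records who performed the transition, and `verify` detects any later modification of a record. A real deployment would rely on smart contracts and blockchain consensus rather than a local list, which is what removes the need for a trusted third party.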

Authors

Nguyen Khoi Tran, The University of Adelaide
Bushra Sabir
Muhammad Ali Babar, The University of Adelaide
Nini Cui
Mehran Abolhasan
Justin Lipman

Session Program

Fri 23 Sep
Displayed time zone: (GMT+02:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague

09:00 - 10:30	Architecting for data-driven systems (Research Papers / Tools & Demos / Industry Program) at S4. Chair(s): Jan Carlson (Malardalen University), Anne Koziolek (Karlsruhe Institute of Technology)

09:00 5m Full-paper		ProML: A Decentralised Architecture for Provenance Management of Machine Learning Software Systems (Best paper candidate). Research Papers. A: Nguyen Khoi Tran (The University of Adelaide), A: Bushra Sabir, A: Muhammad Ali Babar (The University of Adelaide), A: Nini Cui, A: Mehran Abolhasan, A: Justin Lipman
09:05 5m Full-paper		A systematic survey of architectural approaches and trade-offs in data de-identification. Research Papers. A: Dimitri Van Landuyt (KU Leuven, Belgium), A: Wouter Joosen (Katholieke Universiteit Leuven)
09:10 5m Full-paper		Accurate Performance Predictions with Component-based Models of Data Streaming Applications (Best paper candidate). Research Papers. A: Dominik Werle (Karlsruhe Institute of Technology), A: Stephan Seifermann (Karlsruhe Institute of Technology), A: Anne Koziolek (Karlsruhe Institute of Technology)
09:15 5m Demonstration		DAT: A Tool for Data Architecture for IoT. Tools & Demos. A: Moamin Abughazala (University of L'Aquila), A: Henry Muccini (University of L'Aquila, Italy), A: Mohammad Sharaf
09:20 5m Short-paper		Blockchain-based Architecture of Immutable Document Repository. Industry Program. A: Szymon Kijas, A: Andrzej Zalewski
09:25 65m Other		Discussion. Research Papers

Information for Participants

Fri 23 Sep 2022 09:00 - 10:30 at S4 - Architecting for data-driven systems Chair(s): Jan Carlson, Anne Koziolek

Info for session

Each paper is presented as a 5-minute pitch talk at the beginning. The rest of the session is a discussion.

Info for room S4:

After reaching the 3rd floor (either by elevator or the main staircase), turn right.