CAIN 2023
Mon 15 - Sat 20 May 2023 Melbourne, Australia
co-located with ICSE 2023
Mon 15 May 2023 18:20 - 18:40 at Virtual - Zoom for CAIN - Data & Model Optimization Chair(s): Justus Bogner

Software that contains machine learning algorithms is an integral part of automotive perception, for example, in driving automation systems. The development of such software, specifically the training and validation of the machine learning components, requires large annotated datasets. An industry of data and annotation services has emerged to serve the development of such data-intensive automotive software components. Widespread difficulties in specifying data and annotation needs challenge collaborations between OEMs (Original Equipment Manufacturers) and their suppliers of software components, data, and annotations. This paper investigates why practitioners in the Swedish automotive industry find it difficult to arrive at clear specifications for data and annotations. The results from an interview study show that a lack of effective metrics for data quality aspects, ambiguities in the way of working, unclear definitions of annotation quality, and deficits in the business ecosystems are causes for the difficulty in deriving the specifications. We provide a list of recommendations that can mitigate challenges when deriving specifications, and we propose future research opportunities to overcome these challenges. Our work contributes to the ongoing research on accountability of machine learning as applied to complex software systems, especially for high-stakes applications such as automated driving.

Mon 15 May

Displayed time zone: Hobart

17:15 - 18:45
Data & Model Optimization
Papers / Posters / Industrial Talks at Virtual - Zoom for CAIN
Chair(s): Justus Bogner University of Stuttgart


17:15
15m
Short-paper
Automatically Resolving Data Source Dependency Hell in Large Scale Data Science Projects
Papers
Laurent Boué Microsoft, Pratap Kunireddy Microsoft, Pavle Subotic Microsoft Azure
Pre-print
17:30
15m
Short-paper
Dataflow graphs as complete causal graphs
Papers
Andrei Paleyes Department of Computer Science and Technology, University of Cambridge, Siyuan Guo Max Planck Institute for Intelligent Systems, Bernhard Schölkopf MPI Tuebingen, Neil D. Lawrence Department of Computer Science and Technology, University of Cambridge
Pre-print
17:45
20m
Long-paper
Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI
Distinguished Paper Award Candidate
Papers
Tim Yarally Delft University of Technology, Luís Cruz Delft University of Technology, Daniel Feitosa University of Groningen, June Sallou Delft University of Technology, Arie van Deursen Delft University of Technology
Pre-print
18:05
15m
Short-paper
Prevalence of Code Smells in Reinforcement Learning Projects
Papers
Nicolás Cardozo Universidad de los Andes, Ivana Dusparic Trinity College Dublin, Ireland, Christian Cabrera Department of Computer Science and Technology, University of Cambridge
Pre-print Media Attached
18:20
20m
Long-paper
Automotive Perception Software Development: An Empirical Investigation into Data, Annotation, and Ecosystem Challenges
Papers
Hans-Martin Heyn University of Gothenburg & Chalmers University of Technology, Khan Mohammad Habibullah University of Gothenburg, Eric Knauss Chalmers | University of Gothenburg, Jennifer Horkoff Chalmers and the University of Gothenburg, Markus Borg CodeScene, Alessia Knauss Zenseact AB, Polly Jing Li Kognic AB
Pre-print