CAIN 2024
Sun 14 - Mon 15 April 2024 Lisbon, Portugal
co-located with ICSE 2024
Mon 15 Apr 2024 09:12 - 09:15 at Pequeno Auditório - Keynote and Posters Chair(s): Jan Bosch, Henry Muccini

Foundation models shift the interest toward adapting existing models instead of creating proprietary models from scratch. Despite this change, hyperparameter optimization (HPO) is still needed. Users who adapt systems powered by those models to proprietary data should not considerably increase the overall resource footprint through extensive hyperparameter search. Given that this footprint is also proportional to the amount of data used in HPO, we investigate how a user can effectively reduce that amount by leveraging how readily the deep learning model, empirically, outputs the expected correct result for an item in the dataset.

In this work, we describe a methodology for accomplishing this data reduction by estimating a measure of each item’s difficulty. The method keeps only a portion of the data that preserves the overall proportions of item difficulty found in the full dataset, while also helping to order the items meaningfully. The rationale derives from results in curriculum learning research: we ask whether the adapted models can help organize and select subsets of data that are representative of the whole. Preliminary results of evaluating the method are provided for image recognition and scientific named entity recognition (NER). We observe that the amount of data for HPO can be reduced as far as 60% while still pointing to the same choice of hyperparameters as using the whole training set.
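The selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the difficulty measure, so the sketch assumes difficulty is the share of evaluation passes in which the model gave the wrong answer for an item, and it assumes equal-width difficulty bins. The names `estimate_difficulty`, `select_subset`, `keep_fraction`, and `n_bins` are hypothetical.

```python
import random
from collections import defaultdict


def estimate_difficulty(correct_history):
    """Assumed measure: share of evaluation passes where the model was wrong.

    `correct_history` is a list of booleans, one per pass (hypothetical input).
    """
    return 1.0 - sum(correct_history) / len(correct_history)


def select_subset(difficulties, keep_fraction, n_bins=5, seed=0):
    """Return indices of a subset that keeps each difficulty bin's share."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for idx, d in enumerate(difficulties):
        # Equal-width bins over [0, 1]; d == 1.0 falls in the last bin.
        bins[min(int(d * n_bins), n_bins - 1)].append(idx)
    kept = []
    for members in bins.values():
        # Keep the same fraction of every non-empty bin (at least one item).
        k = max(1, round(len(members) * keep_fraction))
        kept.extend(rng.sample(members, k))
    # Order the kept items from easiest to hardest, curriculum style.
    return sorted(kept, key=lambda i: difficulties[i])


# Toy data: 50 easy, 30 medium, and 20 hard items; keep 40% of them.
difficulties = [0.0] * 50 + [0.5] * 30 + [1.0] * 20
subset = select_subset(difficulties, keep_fraction=0.4)
```

In this toy run the subset keeps the 50/30/20 proportions of the full set, so a hyperparameter search over `subset` sees the same mix of easy and hard items as a search over all the data. Whether that preserves the hyperparameter ranking on a real task is exactly what the work evaluates.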

Mon 15 Apr

Displayed time zone: Lisbon

09:00 - 10:30
Keynote and Posters: Posters / Research and Experience Papers at Pequeno Auditório
Chair(s): Jan Bosch Chalmers University of Technology, Henry Muccini University of L'Aquila, Italy
09:00
3m
Talk
A Domain Specific Language for Specification of Risk-oriented Object Detection Requirements
Posters
Junji Hashimoto GREE, Inc., Nobukazu Yoshioka Waseda University
09:03
3m
Talk
AI Security Continuum: Concept and Challenges
Posters
Hironori Washizaki Waseda University, Nobukazu Yoshioka Waseda University
09:06
3m
Talk
A Roadmap for Enriching Jupyter Notebooks Documentation with Kaggle Data
Posters
Mojtaba Mostafavi Department of Computer Engineering of Sharif University of Technology, Hamed Jahantigh Department of Computer Engineering of Sharif University of Technology, Alireza Asadi Department of Computer Engineering of Sharif University of Technology, Sepehr Kianian Department of Computer Engineering of Sharif University of Technology, Ashkan Khademian Department of Computer Engineering of Sharif University of Technology, Abbas Heydarnoori Bowling Green State University
09:09
3m
Talk
Automating Patch Set Generation from Code Reviews Using Large Language Models
Posters
Md Tajmilur Rahman Gannon University
09:12
3m
Talk
Data Selection Driven by Item Difficulty: On Investigating Data Efficient Practice for Hyperparameter Search
Posters
Gustavo Rodrigues dos Reis NAVER LABS Europe/LIG - UGA, Adrian Mos NAVER LABS Europe, Mario Cortes Cornax LIG - UGA, Cyril Labbé LIG - UGA
09:15
3m
Talk
Beyond Syntax: Unleashing the Power of Computational Notebooks Code Metrics in Documentation Generation
Posters
Mojtaba Mostafavi Department of Computer Engineering of Sharif University of Technology, Ashkan Khademian Department of Computer Engineering of Sharif University of Technology, Sepehr Kianian Department of Computer Engineering of Sharif University of Technology, Alireza Asadi Department of Computer Engineering of Sharif University of Technology, Hamed Jahantigh Department of Computer Engineering of Sharif University of Technology, Abbas Heydarnoori Bowling Green State University
09:18
3m
Talk
Can causality accelerate experimentation in software systems?
Posters
Andrei Paleyes Department of Computer Science and Technology, University of Cambridge, Han-Bo Li Department of Computer Science and Technology, University of Cambridge, Neil D. Lawrence Department of Computer Science and Technology, University of Cambridge
09:21
3m
Talk
Custom Developer GPT for Ethical AI Solutions
Posters
Lauren Olson Vrije Universiteit Amsterdam
Pre-print
09:24
3m
Talk
Evaluation of The Generality of Multi-view Modeling Framework for ML Systems
Posters
Jati H. Husen Waseda University, Japan, Jomphon Runpakprakun Waseda University, Japan, Sun Chang Waseda University, Japan, Hironori Washizaki Waseda University, Hnin Thandar Tun Waseda University, Japan, Nobukazu Yoshioka Waseda University, Japan, Yoshiaki Fukazawa Waseda University
09:27
3m
Talk
Prompt Smells: An Omen for Undesirable Generative AI Outputs
Posters
Krishna Ronanki University of Gothenburg, Beatriz Cabrero-Daniel University of Gothenburg, Christian Berger Chalmers University of Technology, Sweden
09:30
3m
Talk
Taxonomy of Generative AI Applications for Risk Assessment
Posters
Hiroshi Tanaka Fujitsu Limited, Tokyo, Japan, Masaru Ide Fujitsu Limited, Jun Yajima Fujitsu Limited, Sachiko Onodera Fujitsu Limited, Kazuki Munakata Fujitsu Limited, Tokyo, Japan, Nobukazu Yoshioka Waseda University, Japan
09:35
55m
Keynote
Keynote by Christian Kästner - From Models to Systems: On the Role of Software Engineering for Machine Learning
Research and Experience Papers
Christian Kästner Carnegie Mellon University