ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Fri 19 Apr 2024 14:15 - 14:30 at Almada Negreiros - Language Models and Generated Code 3 Chair(s): Jie M. Zhang

Deep learning models are trained with certain assumptions about the data during the development stage and then used for prediction in the deployment stage. It is important to reason about the trustworthiness of the model’s predictions on unseen data during deployment. Existing methods for specifying and verifying traditional software are insufficient for this task, as they cannot handle the complexity of DNN model architectures and expected outcomes. In this work, we propose a novel technique that uses rules derived from neural network computations to infer data preconditions for a DNN model and thereby determine the trustworthiness of its predictions. Our approach, DeepInfer, introduces a novel abstraction for a trained DNN model that enables weakest precondition reasoning using Dijkstra’s Predicate Transformer Semantics. By deriving rules over the inductive type of the neural network abstract representation, we can overcome the matrix dimensionality issues that arise from the backward non-linear computation from the output layer to the input layer. We apply weakest precondition rules for each kind of activation function to compute layer-wise preconditions from a given postcondition on the final output of a deep neural network. We extensively evaluated DeepInfer on 29 real-world DNN models using four different datasets collected from five different sources and demonstrated its utility, effectiveness, and performance improvement over closely related work. DeepInfer efficiently detects correct and incorrect predictions of high-accuracy models with high recall (0.98) and a high F1 score (0.84), a significant improvement over the prior technique, SelfChecker. The average runtime overhead of DeepInfer is low: 0.22 sec for all the unseen datasets. We also compared runtime overhead using the same hardware settings and found that DeepInfer is 3.27 times faster than SelfChecker, the state of the art in this area.
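The backward, layer-wise reasoning described in the abstract can be illustrated with a minimal sketch. It is not DeepInfer's implementation: the paper derives symbolic rules per activation function, whereas this sketch only uses the textbook assignment rule, wp(y := f(x), Q) = Q[f(x)/y], for total, deterministic layers and evaluates the resulting predicate on concrete inputs. All names (`dense`, `relu`, `wp`, `infer_precondition`), the tiny network, and the postcondition are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: hypothetical names, not DeepInfer's rules or API.
from typing import Callable, List

Vector = List[float]
Predicate = Callable[[Vector], bool]
Layer = Callable[[Vector], Vector]


def dense(weights: List[Vector], bias: Vector) -> Layer:
    """Affine layer y = W x + b, with W given as a list of rows."""
    def apply(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def relu(x: Vector) -> Vector:
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]


def wp(layer: Layer, post: Predicate) -> Predicate:
    """Weakest precondition of y := layer(x) with respect to post.

    For a total, deterministic layer this is post with layer(x)
    substituted for y, i.e. the composition post(layer(x)).
    """
    return lambda x: post(layer(x))


def infer_precondition(layers: List[Layer], post: Predicate) -> Predicate:
    """Push the postcondition backward from the output layer to the input."""
    pre = post
    for layer in reversed(layers):
        pre = wp(layer, pre)
    return pre


# Hypothetical 2-2-1 network and postcondition "the output is at least 1.0".
layers = [
    dense([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    relu,
    dense([[1.0, 1.0]], [0.0]),
]
post = lambda y: y[0] >= 1.0
pre = infer_precondition(layers, post)

print(pre([3.0, 1.0]))  # input satisfies the inferred precondition
print(pre([0.0, 0.0]))  # input violates it
```

In this sketch, an unseen input that violates `pre` is one for which the postcondition on the output cannot hold, which mirrors the idea of using inferred data preconditions to flag predictions that should not be trusted.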

Fri 19 Apr

Displayed time zone: Lisbon

14:00 - 15:30
Language Models and Generated Code 3 (Research Track / Demonstrations) at Almada Negreiros
Chair(s): Jie M. Zhang King's College London
14:00
15m
Talk
CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models
Research Track
Hao Yu Peking University, Bo Shen Huawei Cloud Computing Technologies Co., Ltd., Dezhi Ran Peking University, Jiaxin Zhang Huawei Cloud Computing Technologies Co., Ltd., Qi Zhang Huawei Cloud Computing Technologies Co., Ltd., Yuchi Ma Huawei Cloud Computing Technologies CO., LTD., Guangtai Liang Huawei Cloud Computing Technologies, Ying Li School of Software and Microelectronics, Peking University, Beijing, China, Qianxiang Wang Huawei Technologies Co., Ltd, Tao Xie Peking University
14:15
15m
Talk
Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment
Research Track
Shibbir Ahmed Iowa State University, Hongyang Gao Dept. of Computer Science, Iowa State University, Hridesh Rajan Iowa State University
14:30
15m
Talk
GrammarT5: Grammar-Integrated Pretrained Encoder-Decoder Neural Model for Code
Research Track
Qihao Zhu Peking University, Qingyuan Liang Peking University, Zeyu Sun Institute of Software, Chinese Academy of Sciences, Yingfei Xiong Peking University, Lu Zhang Peking University, Shengyu Cheng ZTE Corporation
14:45
15m
Talk
On Calibration of Pre-trained Code Models
Research Track
Zhenhao Zhou Fudan University, Chaofeng Sha Fudan University, Xin Peng Fudan University
15:00
15m
Talk
Learning in the Wild: Towards Leveraging Unlabeled Data for Effectively Tuning Pre-trained Code Models
Research Track
Shuzheng Gao, Wenxin Mao Harbin Institute of Technology, Cuiyun Gao Harbin Institute of Technology, Li Li Beihang University, Xing Hu Zhejiang University, Xin Xia Huawei Technologies, Michael Lyu The Chinese University of Hong Kong
15:15
7m
Talk
GitHubInclusifier: Finding and fixing non-inclusive language in GitHub Repositories
Demonstrations
Liam Todd Monash University, John Grundy Monash University, Christoph Treude Singapore Management University