ICSE 2023
Sun 14 - Sat 20 May 2023 Melbourne, Australia
Fri 19 May 2023 11:00 - 12:30 at Meeting Room 112 - Technical Briefing 6

Appropriate representation of source code and its relevant properties forms the backbone of Artificial Intelligence (AI) / Machine Learning (ML) pipelines for software engineering (SE) tasks such as code classification, bug prediction, code clone detection, and code summarization. Researchers have experimented extensively with different kinds of source code representations (syntactic, semantic, integrated, customized) and properties, ranging from tree and graph representations such as Abstract Syntax Trees (ASTs) to pre-trained transformer models like CodeBERT. It is also common for researchers to hand-craft customized source code representations for a specific SE task. In a 2018 survey, Allamanis et al. listed about 35 different source code representations used across SE tasks, including ASTs, customized ASTs, Control Flow Graphs (CFGs), and Data Flow Graphs (DFGs). The goal of this tutorial is two-fold: (i) to present an overview of the state of the art in source code representations and the corresponding ML pipelines, with an explicit focus on the pros and cons of each representation; and (ii) to discuss the practical challenges of infusing different code views into state-of-the-art ML models.
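As a small illustration of the syntactic representations mentioned above, the following sketch extracts an AST from a Python snippet using the standard-library `ast` module and flattens it into a node-type sequence and a list of parent-child edges. The snippet `SOURCE` and the helper names `node_type_sequence` and `parent_child_edges` are assumptions made for this example; they are not taken from the tutorial material, and real pipelines typically use richer encodings.

```python
import ast

# Hypothetical snippet to represent; any valid Python source would work here.
SOURCE = "def add(a, b):\n    total = a + b\n    return total\n"


def node_type_sequence(source):
    """Return the AST node type names in the order ast.walk visits them."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def parent_child_edges(source):
    """Return (parent_type, child_type) pairs, one per edge of the AST."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


if __name__ == "__main__":
    print(node_type_sequence(SOURCE))
    for edge in parent_child_edges(SOURCE):
        print(edge)
```

A token-like sequence of node types could feed a sequence model, while the edge list is the kind of structure a tree- or graph-based model would consume. Control-flow and data-flow views would need additional analysis beyond what `ast` provides.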

The Landscape of Source Code Representation Learning in AI-Driven Software Engineering Tasks
Sridhar Chimalakonda (IIT Tirupati), Debeshee Das (Indian Institute of Technology Tirupati), Alex Mathai (IBM India Research Labs), Srikanth Tamilselvam (IBM Research), Atul Kumar (IBM India Research Labs)