Using Pre-trained Language Models to Resolve Textual and Semantic Merge Conflicts (Experience Paper)
Wed 20 Jul 2022 09:20 - 09:40 at ISSTA 1 - Session 2-3: Oracles, Models, and Measurement B
Program merging is standard practice when developers integrate their individual changes into a common code base. When the merge algorithm fails, the result is a merge conflict. The conflict manifests either as a textual merge conflict, where the merge fails to produce code, or as a semantic merge conflict, where the merged code breaks the compiler or the tests. Resolving these conflicts in large code projects is expensive because it requires developers to manually identify the sources of conflict and correct them.
In this paper, we explore the feasibility of automatically repairing merge conflicts (both textual and semantic) using k-shot learning with large neural language models (LMs) such as GPT-3. One challenge in leveraging such language models is fitting the examples and the query within a small prompt (2048 tokens). We evaluate LMs and k-shot learning on both textual and semantic merge conflicts from a divergent fork of Microsoft Edge. Our results are mixed: on the one hand, LMs provide state-of-the-art (SOTA) performance on semantic merge conflict resolution for Edge compared to earlier symbolic approaches; on the other hand, LMs do not yet obviate the benefits of special-purpose domain-specific languages (DSLs) tailored to restricted patterns for program synthesis.
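The prompt-budget constraint described above can be made concrete with a small sketch: a k-shot prompt packs as many resolved-conflict examples as fit within the token budget, then appends the unresolved query. This is an illustrative assumption, not the paper's actual implementation; the names (`format_example`, `build_kshot_prompt`) and the whitespace-based token estimate are hypothetical, and a real system would use the model's own tokenizer.

```python
# Hypothetical sketch of k-shot prompt packing under a fixed token budget,
# in the spirit of the approach described in the abstract. All identifiers
# here are illustrative, not from the paper.

MAX_PROMPT_TOKENS = 2048  # GPT-3 prompt budget cited in the abstract


def estimate_tokens(text: str) -> int:
    # Crude proxy: whitespace-separated words. A real system would use the
    # model's tokenizer (e.g. byte-pair encoding) for an exact count.
    return len(text.split())


def format_example(base: str, left: str, right: str, resolution: str) -> str:
    # One resolved conflict rendered as a prompt segment.
    return (f"<base>{base}</base>\n<left>{left}</left>\n"
            f"<right>{right}</right>\n<resolution>{resolution}</resolution>\n")


def build_kshot_prompt(examples, query, budget=MAX_PROMPT_TOKENS):
    """Greedily pack as many examples as fit, then append the query."""
    base, left, right = query
    query_text = (f"<base>{base}</base>\n<left>{left}</left>\n"
                  f"<right>{right}</right>\n<resolution>")
    remaining = budget - estimate_tokens(query_text)
    parts = []
    for ex in examples:
        segment = format_example(*ex)
        cost = estimate_tokens(segment)
        if cost > remaining:
            break  # k is bounded by the token budget, not fixed a priori
        parts.append(segment)
        remaining -= cost
    return "".join(parts) + query_text
```

Note that under this scheme k is chosen dynamically per query: longer conflicts leave room for fewer examples, which is one reason fitting examples and query into 2048 tokens is a nontrivial constraint.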
Authors: Jialu Zhang (Yale University), Todd Mytkowicz (Microsoft Research), Mike Kaufman (Microsoft Corporation), Ruzica Piskac (Yale University), Shuvendu K. Lahiri (Microsoft Research)