Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion
This paper investigates the ability of transformer-based models to learn structural recursion from examples. Recursion is a universal concept in both natural and formal languages. Structural recursion is central to the programming-language and formal-mathematics tasks where symbolic tools currently excel beyond neural models, such as inferring semantic relations between datatypes and emulating program behavior. We introduce a general framework that connects the abstract concepts of structural recursion in the programming-language domain to concrete sequence-modeling problems and the behavior of learned models. The framework includes a representation that captures the general syntax of structural recursion, coupled with two complementary frameworks for understanding its semantics: one that is more natural from a programming-languages perspective, and one that helps bridge that perspective with a mechanistic understanding of the underlying transformer architecture. Using this framework as a conceptual tool, we identify distinct failure modes across a range of setups. Models trained to emulate recursive computations do not fully capture the recursion; instead, they fit shortcut algorithms and therefore fail on edge cases that are under-represented in the training distribution. In addition, state-of-the-art large language models (LLMs) struggle to mine recursive rules from in-context demonstrations. Finally, these LLMs fail in interesting ways when emulating reduction (step-wise computation) of recursive functions.
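To make the setting concrete, here is a minimal sketch of the kind of task involved, written in Haskell. The `Nat` datatype, the `add` function, and the reduction trace below are illustrative assumptions for exposition, not the paper's actual benchmark.

```haskell
-- A Peano-style natural number datatype; structural recursion
-- proceeds by case analysis on the constructors Z and S.
data Nat = Z | S Nat deriving Show

-- Structurally recursive addition: the recursive call is made on a
-- strict subterm (m of S m), which guarantees termination.
add :: Nat -> Nat -> Nat
add Z     n = n
add (S m) n = S (add m n)

-- Emulating reduction means producing a trace like this, one
-- rewrite step at a time:
--   add (S (S Z)) (S Z)
--   = S (add (S Z) (S Z))
--   = S (S (add Z (S Z)))
--   = S (S (S Z))
main :: IO ()
main = print (add (S (S Z)) (S Z))  -- prints S (S (S Z))
```

A model that genuinely captures the recursion should be able to produce each intermediate line of such a trace, whereas a shortcut learner may jump directly to plausible answers and break on inputs deeper than those seen in training.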
Mon 15 Apr (displayed time zone: Lisbon)
16:00 - 17:30 | Late Afternoon Session (InteNSE) at Daciano da Costa
Chair(s): Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign) and Saeid Tizpaz-Niari (University of Texas at El Paso)

16:00 (30m) Talk: Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion. Shizhuo Zhang, University of Illinois Urbana-Champaign. (Pre-print)
16:30 (30m) Talk: SWE-bench: Can Language Models Resolve Real-World GitHub Issues? John Yang, Princeton. (Pre-print)
17:00 (30m) Day closing: InteNSE 2024 Closing Remarks. Reyhaneh Jabbarvand, University of Illinois at Urbana-Champaign.