Securing LLM-based Software Supply Chains (ASE 2023 - SATE - Software Engineering at the Era of LLMs)

Track

ASE 2023 SATE - Software Engineering at the Era of LLMs

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 14 Sep 2023 14:00 - 14:40 at Room FR - SATE - Software Engineering at the Era of LLMs Chair(s): Xin Xia

Abstract

LLMs are increasingly used not just for autocompletion but also for generating code from natural language and APIs, among other tasks. Their output, however, is based on input data that is nominally permissively licensed but is not curated for quality, security, performance, or other factors, such as whether the code’s license is authentic. This leads to buggy, insecure, poorly performing, or inappropriately licensed output that is already poisoning the rapidly growing OSS codebase. Problematic inputs will produce problematic outputs even if all LLM hallucinations were removed; hence, stronger provenance tracking and quality assurance for LLM training and fine-tuning inputs are essential to improving the quality of the generated code. We suggest approaches that use the World of Code research infrastructure to curate LLM training data by de-duplicating and automatically curating source code based on OSS-wide software supply chain properties derived from the nearly complete collection of OSS source code.

Audris Mockus is the Ericsson-Harlan D. Mills Chair Professor of Digital Archeology and Evidence Engineering in the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville, and a Senior Scientist at Vilnius University. He studies software developers’ culture and behavior through the recovery, documentation, and analysis of digital remains, in other words, Digital Archaeology. These digital traces reflect projections of collective and individual activity. He reconstructs reality from these projections by designing data mining methods to summarize and augment the digital traces, interactive visualization techniques to inspect, present, and control the behavior of teams and individuals, and statistical models and optimization techniques to understand the nature of individual and collective behavior.

File attachments

presentation (wocllm.pptx (1).pdf)	322KiB

Session Program

Thu 14 Sep
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

13:20 - 15:20	SATE - Software Engineering at the Era of LLMs at Room FR Chair(s): Xin Xia (Huawei Technologies)

13:20	40m	Talk	Towards Better Software Quality in the Era of Large Language Models	Lingming Zhang (University of Illinois at Urbana-Champaign)
14:00	40m	Talk	Securing LLM-based Software Supply Chains	Audris Mockus (Vilnius University & The University of Tennessee)	File attached
14:40	40m	Talk	BEWARE: some of the deep learning rhetoric is misleading	Tim Menzies (North Carolina State University)	Pre-print
