Automating the Generation of Hardware Component Knowledge Bases (LCTES 2019 - Languages, Compilers, Tools and Theory of Embedded Systems)

Who

Luke Hsiao, Sen Wu, Nicholas Chiang, Christopher Ré, Philip Levis

Track

LCTES 2019

When

Sun 23 Jun 2019 14:45 - 15:00 at 105A - Session 3: Applications. Chair(s): Wanli Chang

Abstract

Hardware component databases are critical resources in designing embedded systems. Since generating these databases requires hundreds of thousands of hours of manual data entry, they are proprietary, limited in the data they provide, and have many random data entry errors.

We present a machine-learning-based approach for automating the generation of component databases directly from datasheets. Extracting data directly from datasheets is challenging because (1) the data is relational in nature and relies on non-local context, (2) the documents are filled with technical jargon, and (3) the datasheets are PDFs, a format that decouples visual locality from locality in the document. The proposed approach uses a rich data model and weak supervision to address these challenges.
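To make the idea of weak supervision concrete, here is a minimal sketch in which several noisy heuristics vote on whether a candidate value extracted from a datasheet table is correct. Everything in it is an assumption for illustration: the candidate fields (`row_header`, `row_text`, `value`), the three heuristics, and the simple majority vote are hypothetical and are not taken from the paper, which may combine its supervision sources quite differently.

```python
from collections import Counter

# Label values a heuristic may return (hypothetical convention).
ABSTAIN, FALSE, TRUE = -1, 0, 1

def lf_ends_with_volt_unit(candidate):
    """Vote TRUE if the table row text ends with the unit 'V'."""
    return TRUE if candidate["row_text"].rstrip().endswith("V") else ABSTAIN

def lf_storage_row(candidate):
    """Vote FALSE if the row header mentions storage conditions."""
    return FALSE if "storage" in candidate["row_header"].lower() else ABSTAIN

def lf_value_in_range(candidate):
    """Vote TRUE if the value lies in an assumed plausible range, else FALSE."""
    return TRUE if 0 < candidate["value"] <= 100 else FALSE

def majority_vote(candidate, labeling_functions):
    """Combine heuristic votes; abstain when there are no votes or a tie."""
    votes = [lf(candidate) for lf in labeling_functions]
    counts = Counter(v for v in votes if v != ABSTAIN)
    if counts[TRUE] == counts[FALSE]:
        return ABSTAIN
    return TRUE if counts[TRUE] > counts[FALSE] else FALSE

LFS = [lf_ends_with_volt_unit, lf_storage_row, lf_value_in_range]

# Made-up candidates standing in for rows of a datasheet table.
voltage_row = {"row_header": "Collector-emitter voltage",
               "row_text": "VCEO 45 V", "value": 45.0}
storage_row = {"row_header": "Storage temperature",
               "row_text": "Tstg 150 °C", "value": 150.0}

print(majority_vote(voltage_row, LFS))  # two TRUE votes, one abstain
print(majority_vote(storage_row, LFS))  # two FALSE votes, one abstain
```

Labels produced this way are noisy, so in a weak-supervision setting they would typically serve as training data for a model rather than as the final output.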

We evaluate the approach on datasheets of three classes of hardware components and achieve an average quality of 75 F1 points, which is comparable to existing human-curated knowledge bases. We perform two application studies that demonstrate the extraction of multiple data modalities, such as numerical properties and images. We show how different sources of supervision, such as heuristics and human labels, have distinct advantages that can be utilized together within a single methodology to automatically generate hardware component knowledge bases.
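For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall, and it is commonly reported on a 0 to 100 scale, so a score of 0.75 corresponds to 75 F1 points. The sketch below computes it for a set of extracted (part, value) pairs against a gold set. The part names and values are made up for illustration and are not results from the paper.

```python
def f1_score(predicted, gold):
    """Return F1 for a collection of predicted entries against gold entries."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # fraction of predictions that are correct
    recall = true_positives / len(gold)          # fraction of gold entries that were found
    return 2 * precision * recall / (precision + recall)

# Hypothetical gold knowledge base entries: (part name, maximum voltage).
gold = {("PART-A", 45), ("PART-B", 30), ("PART-C", 30), ("PART-D", 40)}

# Hypothetical extraction output: three entries correct, one wrong value.
predicted = {("PART-A", 45), ("PART-B", 30), ("PART-C", 30), ("PART-D", 60)}

score = f1_score(predicted, gold)
print(f"{100 * score:.0f} F1 points")
```

Here precision and recall are both 3/4, so the harmonic mean is also 3/4.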

Luke Hsiao

Stanford University

United States

Sen Wu

Stanford University

Nicholas Chiang

Gunn High School

Christopher Ré

Philip Levis

Stanford University

Session Program

Sun 23 Jun
Displayed time zone: (GMT-07:00) Tijuana, Baja California

14:45 - 15:30	Session 3: Applications, LCTES 2019 at 105A. Chair(s): Wanli Chang (University of York)

14:45 15m Full-paper		Automating the Generation of Hardware Component Knowledge Bases. LCTES 2019. Luke Hsiao (Stanford University), Sen Wu (Stanford University), Nicholas Chiang (Gunn High School), Christopher Ré, Philip Levis (Stanford University)
15:00 15m Full-paper		IA-Graph Based Inter-App Conflicts Detection in Open IoT Systems. LCTES 2019. Xinyi Li (Chang'an University), Lei Zhang (North Carolina State University), Xipeng Shen (North Carolina State University)
15:15 15m Full-paper		ApproxSymate: Path Sensitive Program Approximation using Symbolic Execution. LCTES 2019. Himeshi Praveeni De Silva, Andrew Santosa (National University of Singapore), Nhut Minh Ho (National University of Singapore), Weng-Fai Wong (National University of Singapore)