InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript (ICSE 2022 - SEIP - Software Engineering in Practice)

Sun 8 - Fri 27 May 2022

Who

Saikat Dutta, Diego Garbervetsky, Shuvendu K. Lahiri, Max Schaefer

Track

ICSE 2022 SEIP - Software Engineering in Practice

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 9 May 2022 21:05 - 21:10 at ICSE room 5 - Program Analysis 3 Chair(s): Travis Breaux
Tue 10 May 2022 13:10 - 13:15 at ICSE room 3 - Program Analysis 4 Chair(s): Miguel Goulao
Thu 26 May 2022 11:10 - 11:15 at Room 306+307 - Papers 14: Program Analysis Chair(s): Frank Tip

Abstract

Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in integrity policies) or from sensitive sources to untrusted sinks (in confidentiality policies). The appeal of this approach is that the taint-tracking mechanism has to be implemented only once, and can then be parameterized with different taint specifications (that is, sets of sources and sinks, as well as any sanitizers that render otherwise problematic flows innocuous) to detect many different kinds of vulnerabilities.
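As a minimal sketch of that parameterization (not CodeQL's implementation, and not code from the paper), the toy tracker below walks a hand-written data-flow graph and reports every source-to-sink path that avoids all sanitizers. The names `TaintSpec`, `find_forbidden_flows`, and every node label in `EDGES` are hypothetical, chosen only for illustration. The same function is run with an integrity specification and with a confidentiality specification.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TaintSpec:
    """A taint specification: where taint starts, where it must not arrive, what cleans it."""
    sources: frozenset
    sinks: frozenset
    sanitizers: frozenset = frozenset()


def find_forbidden_flows(edges, spec):
    """Return sorted (source, sink) pairs joined by a path that avoids every sanitizer."""
    flows = set()
    for source in spec.sources:
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node in spec.sinks:
                flows.add((source, node))
            for succ in edges.get(node, ()):
                # Taint does not propagate through a sanitizer node.
                if succ not in seen and succ not in spec.sanitizers:
                    seen.add(succ)
                    queue.append(succ)
    return sorted(flows)


# Hypothetical data-flow graph: each node maps to the nodes its value flows into.
EDGES = {
    "req.query.name": ["rawHtml", "escapeHtml(name)"],
    "rawHtml": ["res.send(rawHtml)"],
    "escapeHtml(name)": ["safeHtml"],
    "safeHtml": ["res.send(safeHtml)"],
    "process.env.API_KEY": ["logLine"],
    "logLine": ["console.log(logLine)"],
}

# Integrity policy: untrusted input must not reach an HTML-emitting sink unsanitized.
INTEGRITY = TaintSpec(
    sources=frozenset({"req.query.name"}),
    sinks=frozenset({"res.send(rawHtml)", "res.send(safeHtml)"}),
    sanitizers=frozenset({"escapeHtml(name)"}),
)

# Confidentiality policy: a secret must not reach an untrusted output.
CONFIDENTIALITY = TaintSpec(
    sources=frozenset({"process.env.API_KEY"}),
    sinks=frozenset({"console.log(logLine)"}),
)

integrity_flows = find_forbidden_flows(EDGES, INTEGRITY)
confidentiality_flows = find_forbidden_flows(EDGES, CONFIDENTIALITY)
```

Only the specification changes between the two runs; the traversal is shared. A real analysis would derive the graph from program code and handle inter-procedural flow, which this sketch omits.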

But while techniques for implementing scalable inter-procedural static taint tracking are fairly well established, crafting taint specifications is still more of an art than a science, and in practice tends to involve a lot of manual effort. Past work has focussed on automated techniques for inferring taint specifications for libraries either from their implementation or from the way they tend to be used in client code. Among the latter, machine learning-based approaches have shown great promise.

In this work we present our experience combining an existing machine-learning approach to mining sink specifications for JavaScript libraries with manual taint modelling in the context of GitHub’s CodeQL analysis framework. We show that the machine-learning component can successfully infer many new taint sinks that either are not part of the manual modelling or are not detected due to analysis incompleteness. Moreover, we present techniques for organizing sink predictions using automated ranking and code-similarity metrics that allow an analysis engineer to efficiently sift through large numbers of predictions to identify true positives.
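The abstract does not say which ranking or similarity metrics are used, so the following is only a sketch of the general idea under stated assumptions: predictions are ordered by a model score, and each one is grouped with the first higher-ranked prediction whose token set is similar enough. Jaccard similarity over token sets, the 0.5 threshold, the function names, and the sample predictions are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rank_and_group(predictions, threshold=0.5):
    """Order predictions by score, then greedily group similar ones.

    predictions: iterable of (label, score, tokens) tuples.
    Returns a list of groups; each group is a list of labels, best score first.
    """
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    groups = []  # each entry: (tokens of the group's top-ranked member, member labels)
    for label, _score, tokens in ranked:
        for rep_tokens, members in groups:
            if jaccard(tokens, rep_tokens) >= threshold:
                members.append(label)
                break
        else:
            groups.append((tokens, [label]))
    return [members for _, members in groups]


# Hypothetical sink predictions: (call site, model score, tokens describing it).
PREDICTIONS = [
    ("conn.query(sql)", 0.81, {"conn", "query", "sql"}),
    ("logger.debug(msg)", 0.12, {"logger", "debug", "msg"}),
    ("db.query(sql)", 0.92, {"db", "query", "sql"}),
    ("el.innerHTML = html", 0.77, {"el", "innerHTML", "html"}),
]

groups = rank_and_group(PREDICTIONS)
```

With grouping of this kind, a reviewer who accepts or rejects the top member of a group can judge its near-duplicates at the same time instead of meeting them scattered through a flat list.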

Link to Preprint

https://arxiv.org/pdf/2111.09625.pdf

DOI

https://doi.org/10.1145/3510457.3513048

Saikat Dutta

University of Illinois at Urbana-Champaign

United States

Diego Garbervetsky

University of Buenos Aires and CONICET, Argentina

Argentina

Shuvendu K. Lahiri

Microsoft Research

United States

Max Schaefer

GitHub, Inc.

United Kingdom

Session Program

Mon 9 May
Displayed time zone: Eastern Time (US & Canada)

21:00 - 22:00	Program Analysis 3. Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers, at ICSE room 5. Chair(s): Travis Breaux (Carnegie Mellon University)

5m Talk		Learning to Find Usages of Library Functions in Optimized Binaries. Journal-First Papers. Toufique Ahmed (University of California at Davis), Prem Devanbu (Department of Computer Science, University of California, Davis), Anand Ashok Sawant (University of California, Davis)
5m Talk		InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript. SEIP - Software Engineering in Practice. Saikat Dutta (University of Illinois at Urbana-Champaign), Diego Garbervetsky (University of Buenos Aires and CONICET, Argentina), Shuvendu K. Lahiri (Microsoft Research), Max Schaefer (GitHub, Inc.)
5m Talk		Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python (Nominated for Distinguished Paper). Technical Track. Yun Peng (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Zongjie Li (The Hong Kong University of Science and Technology), Bowei Gao (Harbin Institute of Technology, Shenzhen), David Lo (Singapore Management University), Qirun Zhang (Georgia Institute of Technology, USA), Michael Lyu (The Chinese University of Hong Kong)
5m Talk		DeepDiagnosis: Automatically Diagnosing Faults and Recommending Actionable Fixes in Deep Learning Programs. Technical Track. Mohammad Wardat (Dept. of Computer Science, Iowa State University), Breno Dantas Cruz (Dept. of Computer Science, Iowa State University), Wei Le (Iowa State University), Hridesh Rajan (Iowa State University)
5m Talk		Striking a Balance: Pruning False-Positives from Static Call Graphs (Nominated for Distinguished Paper). Technical Track. Akshay Utture (University of California, Los Angeles (UCLA)), Shuyang Liu (University of California, Los Angeles), Christian Gram Kalhauge (Technical University of Denmark), Jens Palsberg (University of California at Los Angeles)

Tue 10 May
Displayed time zone: Eastern Time (US & Canada)

13:00 - 14:00	Program Analysis 4. SEIP - Software Engineering in Practice / Technical Track / Journal-First Papers / NIER - New Ideas and Emerging Results, at ICSE room 3. Chair(s): Miguel Goulao (NOVA-LINCS, FCT/UNL)

5m Talk		TaintBench: Automatic Real-World Malware Benchmarking of Android Taint Analyses. Journal-First Papers. Linghui Luo (Amazon Web Services), Felix Pauck (Paderborn University, Germany), Goran Piskachev (Fraunhofer IEM), Manuel Benz (Paderborn University), Ivan Pashchenko (University of Trento), Martin Mory (Paderborn University), Eric Bodden, Ben Hermann (Technical University Dortmund), Fabio Massacci (University of Trento; Vrije Universiteit Amsterdam)
5m Talk		Statistical Reasoning About Programs. NIER - New Ideas and Emerging Results. Marcel Böhme (MPI-SP, Germany and Monash University, Australia)
5m Talk		InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript. SEIP - Software Engineering in Practice. Saikat Dutta (University of Illinois at Urbana-Champaign), Diego Garbervetsky (University of Buenos Aires and CONICET, Argentina), Shuvendu K. Lahiri (Microsoft Research), Max Schaefer (GitHub, Inc.)
5m Talk		Striking a Balance: Pruning False-Positives from Static Call Graphs (Nominated for Distinguished Paper). Technical Track. Akshay Utture (University of California, Los Angeles (UCLA)), Shuyang Liu (University of California, Los Angeles), Christian Gram Kalhauge (Technical University of Denmark), Jens Palsberg (University of California at Los Angeles)
5m Talk		DeepDiagnosis: Automatically Diagnosing Faults and Recommending Actionable Fixes in Deep Learning Programs. Technical Track. Mohammad Wardat (Dept. of Computer Science, Iowa State University), Breno Dantas Cruz (Dept. of Computer Science, Iowa State University), Wei Le (Iowa State University), Hridesh Rajan (Iowa State University)

Thu 26 May
Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30	Papers 14: Program Analysis. Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers, at Room 306+307. Chair(s): Frank Tip (Northeastern University)

11:00 5m Talk		Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python (Nominated for Distinguished Paper). Technical Track. Yun Peng (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Zongjie Li (The Hong Kong University of Science and Technology), Bowei Gao (Harbin Institute of Technology, Shenzhen), David Lo (Singapore Management University), Qirun Zhang (Georgia Institute of Technology, USA), Michael Lyu (The Chinese University of Hong Kong)
11:05 5m Talk		TaintBench: Automatic Real-World Malware Benchmarking of Android Taint Analyses. Journal-First Papers. Linghui Luo (Amazon Web Services), Felix Pauck (Paderborn University, Germany), Goran Piskachev (Fraunhofer IEM), Manuel Benz (Paderborn University), Ivan Pashchenko (University of Trento), Martin Mory (Paderborn University), Eric Bodden, Ben Hermann (Technical University Dortmund), Fabio Massacci (University of Trento; Vrije Universiteit Amsterdam)
11:10 5m Talk		InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript. SEIP - Software Engineering in Practice. Saikat Dutta (University of Illinois at Urbana-Champaign), Diego Garbervetsky (University of Buenos Aires and CONICET, Argentina), Shuvendu K. Lahiri (Microsoft Research), Max Schaefer (GitHub, Inc.)
11:15 5m Talk		DeepDiagnosis: Automatically Diagnosing Faults and Recommending Actionable Fixes in Deep Learning Programs. Technical Track. Mohammad Wardat (Dept. of Computer Science, Iowa State University), Breno Dantas Cruz (Dept. of Computer Science, Iowa State University), Wei Le (Iowa State University), Hridesh Rajan (Iowa State University)
11:20 5m Talk		Inference and Test Generation Using Program Invariants in Chemical Reaction Networks. Technical Track. Michael C. Gerten (Iowa State University), Alexis L. Marsh (Iowa State University), James I. Lathrop (Iowa State University), Myra Cohen (Iowa State University), Andrew S. Miner (Iowa State University), Titus H. Klinge (Drake University)
11:25 5m Talk		PUS: A Fast and Highly Efficient Solver for Inclusion-based Pointer Analysis (Distinguished Paper Award). Technical Track. Peiming Liu (Texas A&M University), Yanze Li (University of British Columbia), Bradley Swain (Texas A&M University), Jeff Huang (Texas A&M University)
11:30 5m Talk		Fast and Precise Application Code Analysis using a Partial Library. Technical Track. Akshay Utture (University of California, Los Angeles (UCLA)), Jens Palsberg (University of California at Los Angeles)
