InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript
Tue 10 May 2022 13:10 - 13:15 at ICSE room 3-odd hours - Program Analysis 4 Chair(s): Miguel Goulao
Thu 26 May 2022 11:10 - 11:15 at Room 306+307 - Papers 14: Program Analysis Chair(s): Frank Tip
Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in integrity policies) or from sensitive sources to untrusted sinks (in confidentiality policies). The appeal of this approach is that the taint-tracking mechanism has to be implemented only once, and can then be parameterized with different taint specifications (that is, sets of sources and sinks, as well as any sanitizers that render otherwise problematic flows innocuous) to detect many different kinds of vulnerabilities.
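The idea of one taint-tracking mechanism parameterized by different specifications can be pictured with a small sketch. This is a toy model rather than how CodeQL or any production analysis works: the data-flow graph is a hand-written edge list, and the names `TaintSpec`, `findTaintedFlows`, and all node labels are hypothetical, chosen only for illustration.

```typescript
// Toy data-flow graph: each edge says "data flows from the first node to the second".
// The node labels are made-up program points, not output of a real analysis.
type Edge = [string, string];

// A taint specification: the only part that changes between vulnerability kinds.
interface TaintSpec {
  sources: string[];
  sinks: string[];
  sanitizers: string[];
}

// Generic tracker: reports every [source, sink] pair connected by a path
// that does not pass through a sanitizer node.
function findTaintedFlows(edges: Edge[], spec: TaintSpec): Array<[string, string]> {
  const flows: Array<[string, string]> = [];
  for (const source of spec.sources) {
    const seen: string[] = [source];
    const worklist: string[] = [source];
    while (worklist.length > 0) {
      const node = worklist.pop() as string;
      if (spec.sinks.indexOf(node) !== -1) flows.push([source, node]);
      for (const [from, to] of edges) {
        if (from !== node) continue;
        // Sanitizers cut the flow; already-visited nodes are skipped.
        if (seen.indexOf(to) !== -1 || spec.sanitizers.indexOf(to) !== -1) continue;
        seen.push(to);
        worklist.push(to);
      }
    }
  }
  return flows;
}

const edges: Edge[] = [
  ["req.query.id", "userId"],
  ["userId", "db.query(arg0)"],
  ["userId", "escape(userId)"],
  ["escape(userId)", "res.send(arg0)"],
  ["process.env.SECRET", "res.send(arg0)"],
];

// Integrity policy: untrusted input must not reach sensitive sinks.
const integritySpec: TaintSpec = {
  sources: ["req.query.id"],
  sinks: ["db.query(arg0)", "res.send(arg0)"],
  sanitizers: ["escape(userId)"],
};

// Confidentiality policy: sensitive data must not reach untrusted sinks.
const confidentialitySpec: TaintSpec = {
  sources: ["process.env.SECRET"],
  sinks: ["res.send(arg0)"],
  sanitizers: [],
};

const integrityFlows = findTaintedFlows(edges, integritySpec);
const confidentialityFlows = findTaintedFlows(edges, confidentialitySpec);
```

Under the integrity specification the only reported flow is from `req.query.id` to `db.query(arg0)`, because the path to `res.send(arg0)` passes through the sanitizer. Swapping in the confidentiality specification reuses the same function unchanged and reports the flow from `process.env.SECRET` to `res.send(arg0)`.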
But while techniques for implementing scalable inter-procedural static taint tracking are fairly well established, crafting taint specifications is still more of an art than a science, and in practice tends to involve a lot of manual effort. Past work has focussed on automated techniques for inferring taint specifications for libraries either from their implementation or from the way they tend to be used in client code. Among the latter, machine learning-based approaches have shown great promise.
In this work we present our experience combining an existing machine-learning approach to mining sink specifications for JavaScript libraries with manual taint modelling in the context of GitHub’s CodeQL analysis framework. We show that the machine-learning component can successfully infer many new taint sinks that either are not part of the manual modelling or are not detected due to analysis incompleteness. Moreover, we present techniques for organizing sink predictions using automated ranking and code-similarity metrics that allow an analysis engineer to efficiently sift through large numbers of predictions to identify true positives.
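As a rough illustration of what organizing predictions by ranking and code similarity could look like, the sketch below sorts predicted sinks by a confidence score and then greedily groups predictions whose surrounding code shares enough tokens. The abstract does not say which ranking or similarity metric the authors use, so the Jaccard metric, the threshold, and every name here (`SinkPrediction`, `groupPredictions`, the sample locations) are assumptions made for the example.

```typescript
// Hypothetical shape of one predicted sink.
interface SinkPrediction {
  location: string; // where the candidate sink occurs
  score: number;    // assumed classifier confidence in [0, 1]
  tokens: string[]; // tokens of the surrounding code
}

interface PredictionGroup {
  representative: SinkPrediction; // highest-scoring member, reviewed first
  members: SinkPrediction[];
}

function unique(xs: string[]): string[] {
  return xs.filter((x, i) => xs.indexOf(x) === i);
}

// Jaccard similarity of two token lists treated as sets (an assumed metric).
function jaccard(a: string[], b: string[]): number {
  const ua = unique(a);
  const ub = unique(b);
  const shared = ua.filter((t) => ub.indexOf(t) !== -1).length;
  const union = ua.length + ub.length - shared;
  return union === 0 ? 1 : shared / union;
}

// Rank by score (descending), then attach each prediction to the first group
// whose representative is similar enough; otherwise start a new group.
function groupPredictions(preds: SinkPrediction[], threshold: number): PredictionGroup[] {
  const ranked = preds.slice().sort((p, q) => q.score - p.score);
  const groups: PredictionGroup[] = [];
  for (const pred of ranked) {
    let placed = false;
    for (const group of groups) {
      if (jaccard(group.representative.tokens, pred.tokens) >= threshold) {
        group.members.push(pred);
        placed = true;
        break;
      }
    }
    if (!placed) groups.push({ representative: pred, members: [pred] });
  }
  return groups;
}

const samplePredictions: SinkPrediction[] = [
  { location: "c.js:5", score: 0.4, tokens: ["fs", "writeFile", "path"] },
  { location: "b.js:22", score: 0.8, tokens: ["db", "query", "stmt"] },
  { location: "a.js:10", score: 0.9, tokens: ["db", "query", "sql"] },
];

const sampleGroups = groupPredictions(samplePredictions, 0.5);
```

With a threshold of 0.5 the two database-related predictions end up in one group led by `a.js:10`, and the file-system prediction forms its own group, so a reviewer could judge one representative per group instead of every prediction individually.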
Mon 9 May. Displayed time zone: Eastern Time (US & Canada)
21:00 - 22:00 | Program Analysis 3 (Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers) at ICSE room 5-odd hours | Chair(s): Travis Breaux (Carnegie Mellon University)
21:00 | 5m Talk | Learning to Find Usages of Library Functions in Optimized Binaries | Journal-First Papers | Toufique Ahmed (University of California at Davis), Prem Devanbu (Department of Computer Science, University of California, Davis), Anand Ashok Sawant (University of California, Davis)
21:05 | 5m Talk | InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript | SEIP - Software Engineering in Practice | Saikat Dutta (University of Illinois at Urbana-Champaign), Diego Garbervetsky (University of Buenos Aires and CONICET, Argentina), Shuvendu K. Lahiri (Microsoft Research), Max Schaefer (GitHub, Inc.)
21:10 | 5m Talk | Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python (Nominated for Distinguished Paper) | Technical Track | Yun Peng (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Zongjie Li (The Hong Kong University of Science and Technology), Bowei Gao (Harbin Institute of Technology, Shenzhen), David Lo (Singapore Management University), Qirun Zhang (Georgia Institute of Technology, USA), Michael Lyu (The Chinese University of Hong Kong)
21:15 | 5m Talk | DeepDiagnosis: Automatically Diagnosing Faults and Recommending Actionable Fixes in Deep Learning Programs | Technical Track | Mohammad Wardat (Dept. of Computer Science, Iowa State University), Breno Dantas Cruz (Dept. of Computer Science, Iowa State University), Wei Le (Iowa State University), Hridesh Rajan (Iowa State University)
21:20 | 5m Talk | Striking a Balance: Pruning False-Positives from Static Call Graphs (Nominated for Distinguished Paper) | Technical Track | Akshay Utture (University of California, Los Angeles (UCLA)), Shuyang Liu (University of California, Los Angeles), Christian Gram Kalhauge (Technical University of Denmark), Jens Palsberg (University of California at Los Angeles)
Tue 10 May. Displayed time zone: Eastern Time (US & Canada)
Thu 26 May. Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30 | Papers 14: Program Analysis (Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers) at Room 306+307 | Chair(s): Frank Tip (Northeastern University)
11:00 | 5m Talk | Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python (Nominated for Distinguished Paper) | Technical Track | Yun Peng (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Zongjie Li (The Hong Kong University of Science and Technology), Bowei Gao (Harbin Institute of Technology, Shenzhen), David Lo (Singapore Management University), Qirun Zhang (Georgia Institute of Technology, USA), Michael Lyu (The Chinese University of Hong Kong)
11:05 | 5m Talk | TaintBench: Automatic Real-World Malware Benchmarking of Android Taint Analyses | Journal-First Papers | Linghui Luo (Amazon Web Services), Felix Pauck (Paderborn University, Germany), Goran Piskachev (Fraunhofer IEM), Manuel Benz (Paderborn University), Ivan Pashchenko (University of Trento), Martin Mory (Paderborn University), Eric Bodden, Ben Hermann (Technical University Dortmund), Fabio Massacci (University of Trento; Vrije Universiteit Amsterdam)
11:10 | 5m Talk | InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript | SEIP - Software Engineering in Practice | Saikat Dutta (University of Illinois at Urbana-Champaign), Diego Garbervetsky (University of Buenos Aires and CONICET, Argentina), Shuvendu K. Lahiri (Microsoft Research), Max Schaefer (GitHub, Inc.)
11:15 | 5m Talk | DeepDiagnosis: Automatically Diagnosing Faults and Recommending Actionable Fixes in Deep Learning Programs | Technical Track | Mohammad Wardat (Dept. of Computer Science, Iowa State University), Breno Dantas Cruz (Dept. of Computer Science, Iowa State University), Wei Le (Iowa State University), Hridesh Rajan (Iowa State University)
11:20 | 5m Talk | Inference and Test Generation Using Program Invariants in Chemical Reaction Networks | Technical Track | Michael C. Gerten (Iowa State University), Alexis L. Marsh (Iowa State University), James I. Lathrop (Iowa State University), Myra Cohen (Iowa State University), Andrew S. Miner (Iowa State University), Titus H. Klinge (Drake University)
11:25 | 5m Talk | PUS: A Fast and Highly Efficient Solver for Inclusion-based Pointer Analysis (Distinguished Paper Award) | Technical Track | Peiming Liu (Texas A&M University), Yanze Li (University of British Columbia), Bradley Swain (Texas A&M University), Jeff Huang (Texas A&M University)
11:30 | 5m Talk | Fast and Precise Application Code Analysis using a Partial Library | Technical Track | Akshay Utture (University of California, Los Angeles (UCLA)), Jens Palsberg (University of California at Los Angeles)