Poster: Comprehensive Comparisons of Embedding Approaches for Cryptographic API Completion (ICSE 2022 - Posters)

Who

Ya Xiao, Salman Ahmed, Xinyang Ge, Bimal Viswanath, Na Meng, Daphne Yao

Track

ICSE 2022 Posters

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

When

Thu 12 May 2022 11:10 - 11:15 at ICSE Poster room - Poster Session 3 Chair(s): Jin L.C. Guo
Fri 27 May 2022 13:30 - 15:00 at Ballroom Gallery - Posters 3

Abstract

In this paper, we conduct a measurement study that comprehensively compares the accuracy of cryptographic API completion models trained with multiple API embedding options. Embedding is the process of automatically learning to represent program elements as low-dimensional vectors. Our measurement aims to uncover how program analysis, token-level embedding, and sequence-level embedding each affect cryptographic API completion accuracy. Our findings show that program analysis is necessary even under advanced embedding. Accuracy improves by 36.10% on average when program analysis preprocessing is applied to transform bytecode sequences into API dependence paths. The best accuracy (93.52%) is achieved on API dependence paths with embedding techniques. In contrast, the purely data-driven approach without program analysis achieves only a low accuracy (around 57.60%), even after the powerful sequence-level embedding is applied. Although sequence-level embedding shows a slight accuracy advantage (0.55% on average) over token-level embedding in our basic data split setting, it is not recommended there because of its expensive training cost. A more pronounced accuracy improvement (5.10%) from sequence-level embedding is observed in the cross-project learning scenario, when task data is insufficient. Hence, we recommend applying sequence-level embedding for cross-project learning with limited task-specific data.
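
To make the reported metric concrete, here is a minimal sketch of how top-1 accuracy for next-API completion could be computed. It is not the authors' method: a toy bigram frequency counter stands in for the embedding-based models, the training and test sequences are invented for illustration, and the helper names (`train_bigram`, `predict_next`, `top1_accuracy`) are hypothetical. The API names are real Java cryptography calls, used here only as string labels.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count which API call follows each API call in the training sequences."""
    follows = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            follows[prev][nxt] += 1
    return follows


def predict_next(follows, prev):
    """Return the most frequent successor of `prev`, or None if `prev` is unseen."""
    if prev not in follows:
        return None
    return follows[prev].most_common(1)[0][0]


def top1_accuracy(follows, sequences):
    """Fraction of next-call positions where the top prediction is correct."""
    correct = 0
    total = 0
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            total += 1
            if predict_next(follows, prev) == nxt:
                correct += 1
    return correct / total if total else 0.0


# Invented API call sequences (assumption: illustrative data only).
train = [
    ["KeyGenerator.getInstance", "KeyGenerator.generateKey",
     "Cipher.getInstance", "Cipher.init", "Cipher.doFinal"],
    ["Cipher.getInstance", "Cipher.init", "Cipher.doFinal"],
    ["MessageDigest.getInstance", "MessageDigest.update", "MessageDigest.digest"],
]
test = [
    ["Cipher.getInstance", "Cipher.init", "Cipher.update"],
    ["MessageDigest.getInstance", "MessageDigest.update", "MessageDigest.digest"],
]

model = train_bigram(train)
accuracy = top1_accuracy(model, test)
print(f"top-1 accuracy: {accuracy:.2f}")
```

In this toy setup the input is a flat call sequence. The abstract's finding is that replacing such raw sequences with API dependence paths obtained through program analysis is what drives most of the accuracy gain, regardless of which embedding is used.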

Ya Xiao

Virginia Tech

United States

Salman Ahmed

Virginia Polytechnic Institute and State University

Xinyang Ge

Microsoft Research

Bimal Viswanath

Virginia Tech

Na Meng

Virginia Tech

United States

Daphne Yao

Virginia Tech

United States

Session Program

Thu 12 May
Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:00	Poster Session 3 (Posters) at ICSE Poster room Chair(s): Jin L.C. Guo, McGill University

11:00 5m Poster		Enabling End-Users to Implement Larger Block-Based Programs Posters Nico Ritschel The University of British Columbia, Felipe Fronchetti Virginia Commonwealth University, Reid Holmes University of British Columbia, Ronald Garcia University of British Columbia, David C. Shepherd Virginia Commonwealth University
11:05 5m Poster		Mutation Testing of Quantum Programs written in QISKit Posters Daniel Fortunato INESC-ID, University of Porto, José Campos University of Lisbon, Portugal, Rui Abreu Faculty of Engineering, University of Porto, Portugal
11:10 5m Poster		Poster: Comprehensive Comparisons of Embedding Approaches for Cryptographic API Completion Posters Ya Xiao Virginia Tech, Salman Ahmed Virginia Polytechnic Institute and State University, Xinyang Ge Microsoft Research, Bimal Viswanath Virginia Tech, Na Meng Virginia Tech, Daphne Yao Virginia Tech
11:15 5m Poster		Improving Responsiveness of Android Activity Navigation via Genetic Improvement Posters James Callan UCL, Justyna Petke University College London
11:20 5m Poster		A Quick Repair Facility for Debugging Posters Steven P. Reiss Brown University, USA, Qi Xin Brown University, USA
11:25 5m Poster		Flexible Model-Driven Runtime Monitoring Support for Cyber-Physical Systems Posters Marco Stadler Johannes Kepler University Linz, Michael Vierhauser Johannes Kepler University Linz, Antonio Garmendia Johannes Kepler University Linz, Manuel Wimmer JKU Linz, Jane Cleland-Huang University of Notre Dame Pre-print

Fri 27 May
Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:00	Posters 3 (NIER - New Ideas and Emerging Results / Technical Track / Journal-First Papers / SEIS - Software Engineering in Society / Posters / SRC - ACM Student Research Competition / SEIP - Software Engineering in Practice / DEMO - Demonstrations) at Ballroom Gallery

13:30 90m Talk		Investigating User Perceptions of Conversational Agents for Software-related Exploratory Web Search NIER - New Ideas and Emerging Results Matthew Frazier University of Delaware, Shaayal Kumar University of Delaware, Kostadin Damevski Virginia Commonwealth University, Lori Pollock University of Delaware DOI Pre-print Media Attached
13:30 90m Talk		Bots for Pull Requests: The Good, the Bad, and the Promising Technical Track Mairieli Wessel Delft University of Technology, Ahmad Abdellatif Concordia University, Igor Wiese Federal University of Technology - Paraná (UTFPR), Tayana Conte Universidade Federal do Amazonas, Emad Shihab Concordia University, Marco Gerosa Northern Arizona University, USA, Igor Steinmacher Northern Arizona University Pre-print
13:30 90m Talk		Post2Vec: Learning Distributed Representations of Stack Overflow Posts Journal-First Papers Bowen Xu Singapore Management University, Thong Hoang Singapore Management University, Singapore, Abhishek Sharma Veracode, Inc., Chengran Yang Singapore Management University, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University Link to publication DOI Pre-print
13:30 90m Talk		Detecting Interpersonal Conflict in Issues and Code Review: Cross Pollinating Open- and Closed-Source Approaches SEIS - Software Engineering in Society Huilian Sophie Qiu Carnegie Mellon University, USA, Bogdan Vasilescu Carnegie Mellon University, USA, Christian Kästner Carnegie Mellon University, Carolyn Egelman Google, Ciera Jaspan Google, Emerson Murphy-Hill Google Pre-print Media Attached
13:30 90m Poster		Poster: Comprehensive Comparisons of Embedding Approaches for Cryptographic API Completion Posters Ya Xiao Virginia Tech, Salman Ahmed Virginia Polytechnic Institute and State University, Xinyang Ge Microsoft Research, Bimal Viswanath Virginia Tech, Na Meng Virginia Tech, Daphne Yao Virginia Tech
13:30 90m Talk		Semantic Image Fuzzing of AI Perception Systems Technical Track Trey Woodlief University of Virginia, Sebastian Elbaum University of Virginia, Kevin Sullivan University of Virginia DOI Pre-print Media Attached
13:30 90m		To Disengage or Not to Disengage: A Look at Contributor Disengagement in Open Source Software SRC - ACM Student Research Competition Philip Gray New College of Florida
13:30 90m Talk		Hashing It Out: A Survey of Programmers’ Cannabis Usage, Perception, and Motivation Technical Track Madeline Endres University of Michigan, Kevin Boehnke University of Michigan, Westley Weimer University of Michigan DOI Pre-print Media Attached
13:30 90m Talk		Bus Factor In Practice SEIP - Software Engineering in Practice Elgun Jabrayilzade Bilkent University, Mikhail Evtikhiev JetBrains Research, Eray Tüzün Bilkent University, Vladimir Kovalenko JetBrains Research Pre-print Media Attached
13:30 90m Talk		Garbage Collection Makes Rust Easier to Use: A Randomized Controlled Trial of the Bronze Garbage Collector (Nominated for Distinguished Paper) Technical Track Michael Coblenz University of Maryland at College Park, Michelle Mazurek University of Maryland, Michael Hicks University of Maryland at College Park DOI Pre-print Media Attached
13:30 90m Talk		Learning and Programming Challenges of Rust: A Mixed-Methods Study Technical Track Shuofei Zhu The Pennsylvania State University, Ziyi Zhang University of Wisconsin–Madison, Boqin Qin China Telecom Cloud Computing Corporation, Aiping Xiong The Pennsylvania State University, Linhai Song Pennsylvania State University, USA DOI Pre-print Media Attached
13:30 90m Talk		Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning NIER - New Ideas and Emerging Results Martin Weyssow DIRO, Université de Montréal, Houari Sahraoui Université de Montréal, Bang Liu DIRO & Mila, Université de Montréal Pre-print Media Attached
13:30 90m Talk		Defect Reduction Planning (using TimeLIME) Journal-First Papers Kewen Peng North Carolina State University, Tim Menzies North Carolina State University Authorizer link Pre-print Media Attached
13:30 90m Demonstration		Gamekins: Gamifying Software Testing in Jenkins DEMO - Demonstrations Philipp Straubinger University of Passau, Gordon Fraser University of Passau DOI Pre-print Media Attached
13:30 90m Talk		How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics in Stack Overflow Journal-First Papers Anthony Peruma Rochester Institute of Technology, Steven Simmons Rochester Institute of Technology, Eman Abdullah AlOmar Stevens Institute of Technology, Christian D. Newman Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec Link to publication DOI Pre-print Media Attached
13:30 90m Talk		Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection Journal-First Papers Nadia Daoudi SnT, University of Luxembourg, Kevin Allix University of Luxembourg, Tegawendé F. Bissyandé SnT, University of Luxembourg, Jacques Klein University of Luxembourg Link to publication Pre-print Media Attached
13:30 90m		Mu2: Using Mutation Analysis to Guide Mutation-Based Fuzzing SRC - ACM Student Research Competition Isabella Laybourn Carnegie Mellon Silicon Valley
13:30 90m Talk		Emotions and Perceived Productivity of Software Developers at the Workplace Journal-First Papers Daniela Girardi University of Bari, Filippo Lanubile University of Bari, Nicole Novielli University of Bari, Alexander Serebrenik Eindhoven University of Technology Link to publication DOI Pre-print Media Attached
13:30 90m Poster		CRustS: A Transpiler from Unsafe C to Safer Rust Posters Michael Ling Huawei Technologies Canada, Yijun Yu The Open University, UK, Haitao Wu Huawei Technologies Canada, Yuan Wang Huawei Sweden Research Center, James R. Cordy Queen's University, Ahmed E. Hassan Queen's University
13:30 90m Talk		Multilingual training for Software Engineering Technical Track Toufique Ahmed University of California at Davis, Prem Devanbu Department of Computer Science, University of California, Davis DOI Pre-print Media Attached
13:30 90m Talk		An Empirical Investigation on the Challenges Faced by Women in the Software Industry: A Case Study (SEIS-track Award) SEIS - Software Engineering in Society Bianca Trinkenreich Northern Arizona University, Ricardo Britto Ericsson / Blekinge Institute of Technology, Marco Gerosa Northern Arizona University, USA, Igor Steinmacher Northern Arizona University Pre-print Media Attached
13:30 90m Talk		Using Deep Learning to Generate Complete Log Statements Technical Track Antonio Mastropaolo Università della Svizzera italiana, Luca Pascarella Università della Svizzera italiana (USI), Gabriele Bavota Software Institute, USI Università della Svizzera italiana Pre-print Media Attached
13:30 90m Talk		Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process (Distinguished Paper Award) Technical Track Nadia Nahar Carnegie Mellon University, Shurui Zhou University of Toronto, Grace Lewis Carnegie Mellon Software Engineering Institute, Christian Kästner Carnegie Mellon University Pre-print Media Attached
13:30 90m Talk		Discovering Repetitive Code Changes in Python ML Systems Technical Track Malinda Dilhara University of Colorado Boulder, USA, Ameya Ketkar Oregon State University, USA, Nikhith Sannidhi University of Colorado Boulder, Danny Dig University of Colorado Boulder, USA DOI Pre-print Media Attached
13:30 90m Talk		Towards Mining OSS Skills from GitHub Activity NIER - New Ideas and Emerging Results Jenny T. Liang University of Washington, Thomas Zimmermann Microsoft Research, Denae Ford Microsoft Research DOI Pre-print Media Attached
13:30 90m Talk		EREBA: Black-box Energy Testing of Adaptive Neural Networks Technical Track Mirazul Haque UT Dallas, Yaswanth Yadlapalli University of Texas at Dallas, Wei Yang University of Texas at Dallas, Cong Liu University of Texas at Dallas, USA Pre-print Media Attached
13:30 90m Talk		"Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint SEIP - Software Engineering in Practice Bart van Oort Delft University of Technology, Luís Cruz Delft University of Technology, Babak Loni ING Bank N.V., Arie van Deursen Delft University of Technology, Netherlands Pre-print Media Attached
13:30 90m Poster		Improving Responsiveness of Android Activity Navigation via Genetic Improvement Posters James Callan UCL, Justyna Petke University College London

Information for Participants

Thu 12 May 2022 11:00 - 12:00 at ICSE Poster room - Poster Session 3 Chair(s): Jin L.C. Guo
