Efficiently and Precisely Searching for Code Changes with DiffSearch
Thu 26 May 2022 13:30 - 15:00 at Ballroom Gallery - Posters 2
Version histories of code contain a wealth of useful information, and thanks to open-source software much of this data is public. However, searching through large repository histories is difficult because no dedicated tool exists for searching code changes. This paper presents DiffSearch, the first efficient and scalable search engine for code changes. Given a list of repositories and a query, DiffSearch retrieves matching code changes within a few seconds. We design a language-agnostic approach that we test on three popular programming languages (Java, JavaScript, and Python), together with a query language that extends the supported languages. We evaluate DiffSearch in three steps. First, we measure a recall of 81.8%, 89.6%, and 90.4% for Java, Python, and JavaScript, respectively, and an average response time below five seconds. Second, we demonstrate its scalability on a large dataset of one million code changes. Last, we perform a case study to show one possible application of our tool, in which DiffSearch gathers a dataset of 74,903 Java bug fixes.
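To make the idea of querying code changes concrete, the following is a minimal toy sketch in Python. It is not DiffSearch's implementation or its query language: the placeholder names `ID` and `NUM`, the functions `tokenize`, `matches`, and `search`, and the sample changes are all hypothetical, and the linear scan shown here says nothing about how the real engine achieves its response times.

```python
import re

# Toy tokenizer: identifiers, integer literals, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def tokenize(code):
    """Split a code fragment into a flat list of tokens."""
    return TOKEN.findall(code)


def matches(pattern, code):
    """Check whether `code` fits `pattern`, token by token.

    Hypothetical placeholders: ID matches any identifier, NUM any integer
    literal. Every other pattern token must match literally.
    """
    pattern_tokens, code_tokens = tokenize(pattern), tokenize(code)
    if len(pattern_tokens) != len(code_tokens):
        return False
    for p, c in zip(pattern_tokens, code_tokens):
        if p == "ID":
            if not IDENTIFIER.fullmatch(c):
                return False
        elif p == "NUM":
            if not c.isdigit():
                return False
        elif p != c:
            return False
    return True


def search(changes, old_pattern, new_pattern):
    """Return the (old, new) pairs whose two sides match the two patterns."""
    return [
        (old, new)
        for old, new in changes
        if matches(old_pattern, old) and matches(new_pattern, new)
    ]


# A tiny, made-up set of code changes as (old line, new line) pairs.
changes = [
    ("x = foo(1)", "x = foo(2)"),
    ("if (a < b)", "if (a <= b)"),
    ("y = bar(10)", "y = bar(20)"),
]

# Query: a call with a numeric argument whose line is changed to the same shape.
hits = search(changes, "ID = ID(NUM)", "ID = ID(NUM)")
for old, new in hits:
    print(f"{old}  -->  {new}")
```

The query is written as a pair of patterns, one for the old code and one for the new code, so a change is reported only when both sides fit. This mirrors the general notion of a code-change query under the assumptions stated above.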
Poster (LucaDiGrazia - Diffsearch - SRC ICSE 2022.pdf) | 1.33MiB |
I am a PhD student at the University of Stuttgart, part of TU9 (the alliance of leading Technical Universities in Germany), advised by Prof. Dr. Michael Pradel. While studying for my Master's degree at the Polytechnic of Turin, I developed a strong interest in classification and machine learning. I studied computer engineering from the hardware level, designing a microprocessor in VHDL, up to high-level software programming, such as deep learning applied to visual recognition using convolutional neural networks with Python and Matlab. My current research interests are code evolution, mining repositories, software analysis, and machine learning.
Tue 24 May | Displayed time zone: Eastern Time (US & Canada)
13:00 - 15:00 | Poster round: Graduates | SRC - ACM Student Research Competition at Student Research Competition room | Judges
14:00 2h | Woodpecker: Identifying and Fixing Android UI Display Issues SRC - ACM Student Research Competition Zhe Liu Institute of Software, Chinese Academy of Sciences | ||
14:00 2h | Static Test Flakiness Prediction SRC - ACM Student Research Competition Valeria Pontillo University of Salerno | ||
14:00 2h | Finding Appropriate User Feedback Analysis Techniques for Multiple Data Domains SRC - ACM Student Research Competition Peter Devine The University of Auckland | ||
14:00 2hShort-paper | Efficiently and Precisely Searching for Code Changes with DiffSearch SRC - ACM Student Research Competition Luca Di Grazia University of Stuttgart Link to publication DOI File Attached | ||
14:00 2h | An Empirical Study on the Current Adoption of Quantum Programming SRC - ACM Student Research Competition Manuel De Stefano Università di Salerno |
Thu 26 May | Displayed time zone: Eastern Time (US & Canada)
13:30 - 15:00 | Posters 2 at Ballroom Gallery
13:30 90mTalk | "Did You Miss My Comment or What?" Understanding Toxicity in Open Source Discussions (Distinguished Paper Award) Technical Track Courtney Miller Carnegie Mellon University, Sophie Cohen Wesleyan University, Daniel Klug Carnegie Mellon University, Bogdan Vasilescu Carnegie Mellon University, USA, Christian Kästner Carnegie Mellon University Pre-print Media Attached | ||
13:30 90mTalk | On Debugging the Performance of Configurable Software Systems: Developer Needs and Tailored Tool Support Technical Track Miguel Velez Carnegie Mellon University, Pooyan Jamshidi University of South Carolina, Norbert Siegmund Leipzig University, Sven Apel Saarland University, Christian Kästner Carnegie Mellon University Pre-print Media Attached | ||
13:30 90m | Let's Talk Open-Source - An Analysis of Conference Talks and Community Dynamics SRC - ACM Student Research Competition Kimberly Truong Oregon State University | ||
13:30 90mTalk | The Case for Adaptive Security Interventions Journal-First Papers Irum Rauf The Open University, UK, Marian Petre The Open University, Thein Tun , Tamara Lopez The Open University, Paul Lunn The University of Manchester, UK, Dirk van der Linden Northumbria University, John Towse Department of Psychology, University of Lancaster, UK, Helen Sharp The Open University, Mark Levine Lancaster University, Awais Rashid University of Bristol, UK, Bashar Nuseibeh The Open University (UK) & Lero (Ireland) Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | TaintBench: Automatic Real-World Malware Benchmarking of Android Taint Analyses Journal-First Papers Linghui Luo Amazon Web Services, Felix Pauck Paderborn University, Germany, Goran Piskachev Fraunhofer IEM, Manuel Benz Paderborn University, Ivan Pashchenko University of Trento, Martin Mory Paderborn University, Eric Bodden , Ben Hermann Technical University Dortmund, Fabio Massacci University of Trento; Vrije Universiteit Amsterdam Link to publication DOI Pre-print Media Attached File Attached | ||
13:30 90mTalk | Change Is the Only Constant: Dynamic Updates for Workflows (Best Artifact Award) Technical Track Daniel Sokolowski University of St. Gallen, Pascal Weisenburger University of St. Gallen, Guido Salvaneschi University of St. Gallen DOI Pre-print Media Attached | ||
13:30 90mTalk | FeatCompare: Feature Comparison for Competing Mobile Apps Leveraging User Reviews Journal-First Papers Maram Assi Queen's University, Safwat Hassan Thompson Rivers University, Yuan Tian Queens University, Kingston, Canada, Ying Zou Queen's University, Kingston, Ontario Link to publication Pre-print Media Attached | ||
13:30 90mTalk | Scratch as Social Network: Topic Modeling and Sentiment Analysis in Scratch Projects SEIS - Software Engineering in Society Pre-print Media Attached | ||
13:30 90mTalk | Deep Learning based Vulnerability Detection: Are We There Yet? Journal-First Papers Saikat Chakraborty Columbia University, Rahul Krishna IBM Research, Yangruibo Ding Columbia University, Baishakhi Ray Columbia University Link to publication DOI Media Attached | ||
13:30 90mTalk | Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python (Nominated for Distinguished Paper) Technical Track Yun Peng The Chinese University of Hong Kong, Cuiyun Gao Harbin Institute of Technology, Zongjie Li The Hong Kong University of Science and Technology, Bowei Gao Harbin Institute of Technology, Shenzhen, David Lo Singapore Management University, Qirun Zhang Georgia Institute of Technology, USA, Michael Lyu The Chinese University of Hong Kong DOI Pre-print Media Attached | ||
13:30 90mTalk | Preempting Flaky Tests via Non-Idempotent-Outcome Tests Technical Track Anjiang Wei Stanford University, Pu Yi Peking University, Zhengxi Li University of Illinois Urbana-Champaign, Tao Xie Peking University, Darko Marinov University of Illinois at Urbana-Champaign, Wing Lam University of Illinois at Urbana-Champaign Pre-print Media Attached | ||
13:30 90mTalk | A Tale of Two Cities: Software Developers Working from Home During the COVID-19 Pandemic Journal-First Papers Denae Ford Microsoft Research, Margaret-Anne Storey University of Victoria, Thomas Zimmermann Microsoft Research, Christian Bird Microsoft Research, Sonia Jaffe Microsoft, Chandra Sekhar Maddila Microsoft Research, Jenna L. Butler Microsoft Research, Brian Houck Microsoft Research, Nachiappan Nagappan Microsoft Research Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | A Grounded Theory Based Approach to Characterize Software Attack Surfaces Technical Track Sara Moshtari Rochester Institute of Technology, Ahmet Okutan Rochester Institute of Technology, Mehdi Mirakhorli Rochester Institute of Technology Pre-print Media Attached | ||
13:30 90mTalk | Out of Sight, Out of Mind? How Vulnerable Dependencies Affect Open-Source Projects Journal-First Papers Gede Artha Azriadi Prana Singapore Management University, Abhishek Sharma Veracode, Inc., Lwin Khin Shar Singapore Management University, Darius Foo National University of Singapore, Andrew Santosa Veracode, Inc., Asankhaya Sharma Veracode, Inc., David Lo Singapore Management University Pre-print Media Attached | ||
13:30 90mTalk | Towards Property-Based Tests in Natural Language NIER - New Ideas and Emerging Results Colin Gordon Drexel University Pre-print Media Attached | ||
13:30 90mTalk | How Templated Requirements Specifications Inhibit Creativity in Software Engineering Journal-First Papers Rahul Mohanani University of Jyväskylä, Paul Ralph Dalhousie University, Burak Turhan University of Oulu, Vladimir Mandić Faculty of Technical Sciences, University of Novi Sad Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | Using Reinforcement Learning for Load Testing of Video Games Technical Track Rosalia Tufano Università della Svizzera Italiana, Simone Scalabrino University of Molise, Luca Pascarella Università della Svizzera italiana (USI), Emad Aghajani Software Institute, USI Università della Svizzera italiana, Rocco Oliveto University of Molise, Gabriele Bavota Software Institute, USI Università della Svizzera italiana Pre-print Media Attached | ||
13:30 90mTalk | Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source Technical Track Anjiang Wei Stanford University, Yinlin Deng University of Illinois at Urbana-Champaign, Chenyuan Yang Nanjing University, Lingming Zhang University of Illinois at Urbana-Champaign Pre-print Media Attached | ||
13:30 90mTalk | Trust Enhancement Issues in Program Repair Technical Track Yannic Noller National University of Singapore, Ridwan Salihin Shariffdeen National University of Singapore, Xiang Gao Beihang University, China, Abhik Roychoudhury National University of Singapore Pre-print Media Attached | ||
13:30 90mTalk | An Empirical Study on Release Notes Patterns of Popular Apps in the Google Play Store Journal-First Papers Aidan Z.H. Yang Carnegie Mellon University, Safwat Hassan Thompson Rivers University, Ying Zou Queen's University, Kingston, Ontario, Ahmed E. Hassan Queen's University Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | Learning Lenient Parsing & Typing via Indirect Supervision Journal-First Papers Toufique Ahmed University of California at Davis, Prem Devanbu Department of Computer Science, University of California, Davis, Vincent J. Hellendoorn Carnegie Mellon University Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | CONFETTI: Amplifying Concolic Guidance for Fuzzers Technical Track James Kukucka George Mason University, Luís Pina University of Illinois at Chicago, Paul Ammann George Mason University, USA, Jonathan Bell Northeastern University Pre-print Media Attached | ||
13:30 90mTalk | Natural Attack for Pre-trained Models of Code Technical Track Zhou Yang Singapore Management University, Jieke Shi Singapore Management University, Junda He Singapore Management University, David Lo Singapore Management University DOI Pre-print Media Attached | ||
13:30 90mTalk | A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits Journal-First Papers Steffen Herbold TU Clausthal, Alexander Trautsch University of Göttingen, Benjamin Ledel TU Clausthal, Alireza Aghamohammadi Sharif University of Technology, Taher A Ghaleb University of Ottawa, Kuljit Kaur Chahal Guru Nanak Dev University, Tim Bossenmaier Karlsruhe Institute of Technology (KIT), Bhaveet Nagaria Brunel University London, Philip Makedonski University of Goettingen, Matin Nili Ahmadabadi University of Tehran, Kristof Szabados Ericsson Hungary ltd., Helge Spieker Simula Research Laboratory, Norway, Matej Madeja Technical University of Košice, Nathaniel G. Hoy Brunel University London, Valentina Lenarduzzi University of Oulu, Shangwen Wang National University of Defense Technology, Gema Rodríguez-Pérez University of British Columbia (UBC), Ricardo Colomo-Palacios Østfold University College, Roberto Verdecchia Vrije Universiteit Amsterdam, Paramvir Singh The University of Auckland, Yihao Qin , Debasish Chakroborti University of Saskatchewan, Willard Davis IBM, Vijay Walunj University of Missouri-Kansas City, Hongjun Wu National University of Defense Technology, Diego Marcilio USI Università della Svizzera italiana, Omar Alam Trent University, Abdullah Aldaeej Imam Abdulrahman Bin Faisal University, Idan Amit The Hebrew University, Burak Turhan University of Oulu, Simon Eismann University of Würzburg, Anna-Katharina Wickert TU Darmstadt, Germany, Ivano Malavolta Vrije Universiteit Amsterdam, Matúš Sulír Technical University of Košice, Fatemeh Hendijani Fard University of British Columbia, Austin Henley University of Tennessee, Efstratios Kourtzanidis University Of Macedonia, Eray Tüzün Bilkent University, Christoph Treude University of Melbourne, Simin Maleki Shamasbi Independent Researcher, Ivan Pashchenko University of Trento, Marvin Wyrich University of Stuttgart, James C. Davis Purdue University, USA, Alexander Serebrenik Eindhoven University of Technology, Ella Albrecht University of Goettingen, Ethem Utku Aktas Softtech Inc., Daniel Strüber Chalmers | University of Gothenburg / Radboud University, Johannes Erbel University of Goettingen Pre-print Media Attached | ||
13:30 90mTalk | A Family of Experiments on Test-Driven Development Journal-First Papers Adrian Santos Parrilla University of Oulu, Sira Vegas Universidad Politecnica de Madrid, Oscar Dieste Universidad Politécnica de Madrid, Fernando Uyaguari ETAPA Telecommunications Company, Ayse Tosun Istanbul Technical University, Davide Fucci Blekinge Institute of Technology, Burak Turhan University of Oulu, Giuseppe Scanniello University of Basilicata, Simone Romano University of Bari, Itir Karac University of Oulu, Marco Kuhrmann Reutlingen University, Vladimir Mandić Faculty of Technical Sciences, University of Novi Sad, Robert Ramač Faculty of Technical Sciences, University of Novi Sad, Dietmar Pfahl University of Tartu, Christian Engblom Ericsson, Jarno Kyykka Ericsson, Kerli Rungi Testlio, Carolina Palomeque ETAPA Telecommunications Company, Jaroslav Spisak PAF, Markku Oivo University of Oulu, Natalia Juristo Universidad Politecnica de Madrid Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | SugarC: Scalable Desugaring of Real-World Preprocessor Usage into Pure C Technical Track Zachary Patterson University of Texas at Dallas, Zenong Zhang The University of Texas at Dallas, Brent Pappas University of Central Florida, Shiyi Wei University of Texas at Dallas, Paul Gazzillo University of Central Florida Pre-print Media Attached | ||
13:30 90mTalk | Within-project Defect Prediction of Infrastructure-as-Code Using Product and Process Metrics Journal-First Papers Stefano Dalla Palma Tilburg University, Dario Di Nucci University of Salerno, Fabio Palomba University of Salerno, Damian Andrew Tamburri TU/e Link to publication DOI Authorizer link Pre-print Media Attached | ||
13:30 90mPoster | Enabling End-Users to Implement Larger Block-Based Programs Posters Nico Ritschel The University of British Columbia, Felipe Fronchetti Virginia Commonwealth University, Reid Holmes University of British Columbia, Ronald Garcia University of British Columbia, David C. Shepherd Virginia Commonwealth University | ||
13:30 90mTalk | FADATest: Fast and Adaptive Performance Regression Testing of Dynamic Binary Translation Systems Technical Track Jin Wu Harbin Institute of Technology, Jian Dong Harbin Institute Of Technology, Ruili Fang University of Georgia, Wen Zhang University of Georgia, Wenwen Wang University of Georgia, Decheng Zuo Harbin Institute of Technology Pre-print Media Attached | ||
13:30 90mTalk | PUS: A Fast and Highly Efficient Solver for Inclusion-based Pointer Analysis (Distinguished Paper Award) Technical Track Peiming Liu Texas A&M University, Yanze Li University of British Columbia, Bradley Swain Texas A&M University, Jeff Huang Texas A&M University Pre-print Media Attached | ||
13:30 90mTalk | Adaptive Performance Anomaly Detection for Online Service Systems via Pattern Sketching Technical Track Zhuangbin Chen Chinese University of Hong Kong, China, Jinyang Liu , Yuxin Su Sun Yat-sen University, Hongyu Zhang University of Newcastle, Xiao Ling Huawei Technologies, Yongqiang Yang Huawei Technologies, Michael Lyu The Chinese University of Hong Kong Pre-print Media Attached | ||
13:30 90mTalk | Rotten Apples Spoil the Bunch: An Anatomy of Google Play Malware Technical Track Michael Cao University of British Columbia, Khaled Ahmed University of British Columbia (UBC), Julia Rubin University of British Columbia Pre-print Media Attached | ||
13:30 90mShort-paper | Efficiently and Precisely Searching for Code Changes with DiffSearch SRC - ACM Student Research Competition Luca Di Grazia University of Stuttgart Link to publication DOI File Attached |