Tue 2 - Wed 3 March 2021 Online Conference
Wed 3 Mar 2021 12:30 - 12:45 at CC Virtual Room - Natural & Source Language Analysis Chair(s): Zhijia Zhao

This paper presents Deepsy, a Natural Language-based synthesizer that assists source code analysis. It takes English descriptions of the code patterns to be found as input and automatically produces ASTMatcher expressions that LLVM/Clang can use directly to carry out the intended code analysis. The code analysis domain features a profusion of complex data types and operations, which puts it beyond the reach of prior rule-based synthesizers. Machine learning-based solutions are not applicable either, because well-labeled examples are scarce. This paper shows how Deepsy addresses these challenges by leveraging deep Natural Language Processing (NLP) and introducing a new technique named dependency tree-based co-evolvement. Deepsy's design integrates Natural Language dependency analysis into code analysis and combines it with type-based narrowing and domain-specific guidance. Deepsy achieves over 70.0% expression-level accuracy and 85.1% individual API-level accuracy, significantly outperforming previous solutions.

Deep NLP-Based Co-evolvement for Synthesizing Code Analysis from Natural Language
Zifan Nan (North Carolina State University), Hui Guan (University of Massachusetts at Amherst), Xipeng Shen (North Carolina State University), Chunhua Liao (Lawrence Livermore National Laboratory)
