Compiler Fuzzing through Deep Learning
Random program generation, or fuzzing, is an effective technique for discovering bugs in compilers, but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested.
We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real-world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers.
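The differential testing step can be illustrated with a minimal sketch. It is not DeepSmith's implementation: the majority-vote rule, the `differential_test` function, and the two toy "compilers" are hypothetical stand-ins chosen only to show the idea of running one generated program on several compiler configurations and flagging those whose outcome deviates from the consensus.

```python
from collections import Counter
from typing import Callable, Dict, List

# A testbed is a hypothetical stand-in for one compiler + device configuration:
# it takes program source and returns an observable outcome (output, crash, ...).
Testbed = Callable[[str], str]


def differential_test(program: str, testbeds: Dict[str, Testbed]) -> List[str]:
    """Return the names of testbeds whose outcome differs from the majority."""
    outcomes = {name: run(program) for name, run in testbeds.items()}
    majority_outcome, majority_count = Counter(outcomes.values()).most_common(1)[0]
    # Without a strict majority there is no consensus to compare against.
    if majority_count <= len(testbeds) / 2:
        return []
    return sorted(name for name, out in outcomes.items() if out != majority_outcome)


# Toy testbeds (assumptions for illustration, not real compilers).
def reference_compiler(src: str) -> str:
    return f"result:{len(src)}"


def buggy_compiler(src: str) -> str:
    # Simulates a compiler that crashes on programs containing a barrier.
    return "crash" if "barrier" in src else f"result:{len(src)}"


testbeds = {"a": reference_compiler, "b": reference_compiler, "c": buggy_compiler}
suspects = differential_test("kernel void A() { barrier(1); }", testbeds)
print(suspects)
```

In this sketch a program only counts as bug-exposing when a strict majority of testbeds agree on an outcome; any disagreeing testbed is reported as a suspect for manual inspection.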
We apply our approach to the OpenCL programming language, automatically exposing bugs in OpenCL compilers with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports.
Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03x less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL, whereas the state-of-the-art generator took 9 man-months to port from a generator for C and comprises 50,000 lines of code.
Mon 16 Jul. Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
16:00 - 17:30: Machine Learning | ISSTA Technical Papers at Zurich II | Chair(s): Alex Orso (Georgia Institute of Technology)
16:00 - 16:20 | Talk | Compiler Fuzzing through Deep Learning | ISSTA Technical Papers | Chris Cummins (University of Edinburgh), Pavlos Petoumenos (University of Edinburgh), Alastair Murray (Codeplay Software), Hugh Leather (University of Edinburgh)
16:20 - 16:40 | Talk | Deep Specification Mining | ISSTA Technical Papers | Tien-Duy B. Le (School of Information Systems, Singapore Management University), David Lo (Singapore Management University)
16:40 - 17:00 | Talk | Identifying Implementation Bugs in Machine Learning based Image Classifiers using Metamorphic Testing | ISSTA Technical Papers | Anurag Dwarakanath (Accenture Labs), Manish Ahuja (Accenture Labs), Samarth Sikand (Accenture Labs), Raghotham M Rao (Accenture Labs), R.P. Jagadeesh Chandra Bose (Accenture Labs), Neville Dubash (Accenture Labs), Sanjay Podder
17:00 - 17:20 | Talk | An Empirical Study on TensorFlow Program Bugs | ISSTA Technical Papers | Yuhao Zhang (Peking University), Yifan Chen (Peking University), Shing-Chi Cheung (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology), Yingfei Xiong (Peking University), Lu Zhang (Peking University)
17:20 - 17:30 | Q&A in groups | ISSTA Technical Papers