One Automaton to Rule Them All: Beyond Multiple Regular Expressions Execution
Regular Expression (RE) matching is crucial for identifying strings that exhibit certain morphological properties in a data stream, and it is paramount in contexts such as deep packet inspection in computer security and genome analysis in bioinformatics. Yet, because of their intrinsic data dependence, REs are a complex computational kernel, and numerous solutions pursue pattern-matching efficiency in different directions. However, most of them lack a comprehensive ruleset optimization approach that truly pushes pattern-matching performance when multiple REs are considered together. Exploiting the morphological similarities among REs within the same dataset reduces the memory needed to store the patterns and drastically improves dataset-matching throughput. Based on this observation, we propose the Multi-RE Finite State Automata (MFSA), which extends the Finite State Automata (FSA) model to improve RE parallelization by leveraging similarities within a specific application ruleset. We design a multi-level compilation framework that manages RE merging and optimization to produce MFSA(s). Furthermore, we extend the iNFAnt algorithm to execute MFSAs with the novel iMFAnt engine. Our evaluation investigates the impact of MFSA size reduction and compares execution throughput with that of multiple FSAs in both single- and multi-threaded configurations. This approach shows an average 71.95% compression in terms of states while introducing limited compilation-time overhead. Moreover, the best iMFAnt configuration achieves a geomean 5.99× throughput improvement and 4.05× speedup against single and multiple parallel FSAs.
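To make the state-sharing idea in the abstract concrete, here is a minimal Python sketch. It is not the MFSA construction or the iMFAnt engine from the paper. It assumes the rules are plain literal strings rather than full REs, and it only merges common prefixes, with each transition annotated by the set of rules that use it. The class `MergedAutomaton`, its methods, and the sample rules are hypothetical names chosen for illustration.

```python
class MergedAutomaton:
    """Illustrative prefix-sharing automaton for literal patterns (assumption:
    not the paper's MFSA; full REs and suffix/infix merging are out of scope)."""

    def __init__(self):
        # transitions[state][symbol] = (next_state, set of rule ids using the edge)
        self.transitions = {0: {}}
        # accepting[state] = set of rule ids that accept in that state
        self.accepting = {}
        self.num_states = 1  # state 0 is the shared start state

    def add_rule(self, rule_id, pattern):
        state = 0
        for symbol in pattern:
            edges = self.transitions.setdefault(state, {})
            if symbol in edges:
                # Reuse the existing edge and record that this rule shares it.
                next_state, rules = edges[symbol]
                rules.add(rule_id)
            else:
                next_state = self.num_states
                self.num_states += 1
                edges[symbol] = (next_state, {rule_id})
            state = next_state
        self.accepting.setdefault(state, set()).add(rule_id)

    def match(self, text, all_rule_ids):
        """Return the ids of the rules whose pattern equals `text`."""
        state, active = 0, set(all_rule_ids)
        for symbol in text:
            edge = self.transitions.get(state, {}).get(symbol)
            if edge is None:
                return set()
            state, rules = edge
            active &= rules  # keep only rules that own every edge taken so far
        return active & self.accepting.get(state, set())


rules = ["abc", "abd", "abcde"]  # hypothetical ruleset

merged = MergedAutomaton()
for rule_id, pattern in enumerate(rules):
    merged.add_rule(rule_id, pattern)

# One chain automaton per literal pattern needs len(pattern) + 1 states.
separate_states = sum(len(p) + 1 for p in rules)
reduction = 1 - merged.num_states / separate_states

print(separate_states, merged.num_states, f"{reduction:.0%}")
print(merged.match("abc", range(len(rules))))
```

For this toy ruleset the three separate automata need 14 states in total, while the merged one needs 7 because the prefixes `ab` and `abc` are stored once. The figures reported in the abstract come from the authors' own framework and rulesets, not from this sketch.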
Tue 5 Mar, 10:00 - 11:00 (displayed time zone: London)
10:00 | 20m | Talk | One Automaton to Rule Them All: Beyond Multiple Regular Expressions Execution | Main Conference | Luisa Cicolini (Politecnico di Milano), Filippo Carloni (Politecnico di Milano), Marco D. Santambrogio (Politecnico di Milano), Davide Conficconi (Politecnico di Milano) | Pre-print, Media Attached
10:20 | 20m | Talk | Whose Baseline Compiler Is It Anyway? | Main Conference | Ben L. Titzer (Carnegie Mellon University) | Pre-print
10:40 | 20m | Talk | Enabling Fine-Grained Incremental Builds by Making Compiler Stateful | Main Conference | Ruobing Han (Georgia Institute of Technology), Jisheng Zhao (Georgia Institute of Technology), Hyesoon Kim (Georgia Institute of Technology) | Pre-print