Re(gEx|DoS)Eval: Evaluating Generated Regular Expressions and their Proneness to DoS Attacks
With the recent development of large language model (LLM)-based text and code generation technologies, users apply them to a wide range of tasks, including regex generation. Despite prior efforts to generate regexes from natural language, there is no prompt benchmark for LLMs with real-world data and robust test sets. Moreover, a regex can be prone to Denial of Service (DoS) attacks due to catastrophic backtracking. Hence, a systematic evaluation process is needed to assess the correctness and security of regexes generated by language models. This artifact accompanies our ICSE-NIER paper, in which we describe Re(gEx|DoS)Eval: a framework that includes a dataset of 762 regex descriptions (prompts) from real users, refined prompts with examples, and a robust set of tests. We introduce the pass@k and vulnerable@k metrics to evaluate generated regexes based on functional correctness and proneness to ReDoS attacks. Moreover, we demonstrate Re(gEx|DoS)Eval with three language models, i.e., T5, Phi-1.5, and GPT-3, and describe our plan for future extensions of this framework.
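To illustrate the kind of vulnerability the vulnerable@k metric targets, the following is a minimal Python sketch (not taken from the paper's dataset) of catastrophic backtracking: a regex with nested quantifiers whose matching time grows exponentially on adversarial non-matching input.

```python
import re
import time

# Hypothetical vulnerable pattern: the nested quantifiers in (a+)+ let a
# backtracking engine try exponentially many ways to split a run of 'a's
# between the inner and outer repetition before concluding there is no match.
VULNERABLE = re.compile(r"^(a+)+$")

def match_time(pattern, text):
    """Return the seconds spent attempting a full match of text."""
    start = time.perf_counter()
    pattern.fullmatch(text)
    return time.perf_counter() - start

# A trailing 'b' guarantees failure, forcing the engine to exhaust all
# backtracking alternatives; a few extra 'a's multiply the work many times over.
t_short = match_time(VULNERABLE, "a" * 18 + "b")
t_long = match_time(VULNERABLE, "a" * 22 + "b")
```

On a typical backtracking engine such as Python's `re`, `t_long` is markedly larger than `t_short` despite the input growing by only four characters, which is the behavior that makes such regexes exploitable for DoS.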