On the acceptance by code reviewers of candidate security patches suggested by Automated Program Repair tools
Background: Testing and validating the semantic correctness of patches produced by Automated Program Repair (APR) tools has received considerable attention. Yet, whether human patch reviewers ultimately accept or reject the patches suggested for real-world projects has received limited attention.
Objective: To address this issue, we plan to investigate whether human reviewers recognize (possibly incorrect) security patches suggested by APR tools. We also want to investigate whether knowing that a patch was produced by a specialized tool changes the reviewers' decision.
Method: In the first phase, using a balanced design, we present human patch reviewers with a combination of patches, produced by different APR tools for different vulnerabilities, and ask the reviewers to adopt or reject each proposed patch. In the second phase, we inform the participants that some of the proposed patches were generated by a security tool and measure whether the reviewers change their decision to adopt or reject a patch.
Limitations: The experiments will be conducted in an academic setting and, to maintain statistical power, will focus on a limited sample of popular APR tools and popular vulnerability types.