Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models
Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: ac- curacy and explainability. Exploring the trade-off between these objec- tives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation synthesis have been devel- oped, they typically lack formal guarantees on the Pareto-optimality of the results. Methods that do provide such guarantees, on the other hand, often face severe scalability limitations when exploring the Pareto- optimal space. To address this, we develop a framework based on local optimality guarantees that enables more scalable synthesis of interpre- tations. Specifically, we consider the problem of synthesizing a set of Pareto-optimal interpretations with local optimality guarantees, within the immediate neighborhood of each solution. Our approach begins with a multi-objective learning or search technique, such as Multi-Objective Monte Carlo Tree Search, to generate a best-effort set of Pareto-optimal solutions with respect to accuracy and explainability. We then verify local optimality for each candidate as a Boolean satisfiability problem, which we solve using a SAT solver. We demonstrate the efficacy of our approach on a set of benchmarks, comparing it against previous methods for exploring the Pareto-optimal front of interpretations. In particular, we show that our approach yields interpretations that closely match those synthesized by methods offering global guarantees.
Thu 30 OctDisplayed time zone: Chennai, Kolkata, Mumbai, New Delhi change
11:00 - 12:30 | |||
11:00 30mPaper | Inductive Generalization in Reinforcement Learning from Specifications ATVA Papers Vignesh Subramanian Georgia Institute of Technology, Rohit Kushwah , Subhajit Roy IIT Kanpur, Suguman Bansal Georgia Institute of Technology, USA | ||
11:30 30mPaper | Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models ATVA Papers Aniruddha Joshi UC Berkeley, Supratik Chakraborty IIT Bombay, S. Akshay , Shetal Shah IIT Bombay, India, Hazem Torfah Chalmers University of Technology, Sanjit Seshia UC Berkeley | ||
12:00 30mPaper | Solution-aware vs global ReLU selection: partial MILP strikes back for DNN verification ATVA Papers Yuke Liao CNRS@CREATE, Singapore, Blaise Genest IPAL - CNRS - CNRS@CREATE, Kuldeep S. Meel National University of Singapore, Shaan Aryaman NYU | ||