Abstract
Simple, sufficient explanations furnished by short decision lists can be useful for guiding stakeholder actions. Unfortunately, this transparency can come at the expense of the higher accuracy enjoyed by black-box methods such as deep nets. To date, practitioners typically either (i) insist on the simpler model, forsaking accuracy, or (ii) maximize accuracy, settling for post-hoc explanations of dubious faithfulness. In this paper, we propose a hybrid, partially interpretable model that represents a compromise between the two extremes. In our setup, each input is first processed by a decision list that can either render a decision or abstain, handing off authority to the opaque model. The key challenge in optimizing the decision list is to trade off the accuracy of the composite system against its coverage (the fraction of the population that receives explanations). We contribute a new principled algorithm for constructing partially interpretable decision lists, with theoretical guarantees addressing both interpretability and accuracy. As an instance of our result, we prove that when the optimal decision list has length k, coverage c, and makes b mistakes, our algorithm generates a decision list of length at most 4k and coverage at least c/2 that makes at most 4b mistakes. Finally, we empirically validate the effectiveness of the new model.
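To make the decide-or-abstain setup concrete, here is a minimal Python sketch of a hybrid pipeline of the kind the abstract describes. This is not the paper's algorithm: the `Rule`, `PartialDecisionList`, `hybrid_predict`, and `coverage` names, the predicate-based rule format, and the toy black-box model are all illustrative assumptions; the sketch only shows how a decision list can decide or hand off to an opaque fallback, and how coverage is measured.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

@dataclass
class Rule:
    condition: Callable[[dict], bool]  # predicate over a feature dict
    label: int                         # label to output when the rule fires
    explanation: str                   # human-readable reason for the decision

class PartialDecisionList:
    """The first rule whose condition fires decides; otherwise abstain."""

    def __init__(self, rules: Sequence[Rule]):
        self.rules = list(rules)

    def predict(self, x: dict) -> Optional[Tuple[int, str]]:
        for rule in self.rules:
            if rule.condition(x):
                return rule.label, rule.explanation
        return None  # abstain: hand authority to the opaque model

def hybrid_predict(x: dict, dlist: PartialDecisionList,
                   black_box: Callable[[dict], int]) -> Tuple[int, str]:
    """Composite system: interpretable decision if covered, else black box."""
    decided = dlist.predict(x)
    if decided is not None:
        label, why = decided
        return label, f"interpretable: {why}"
    return black_box(x), "opaque: black-box fallback"

def coverage(dlist: PartialDecisionList, xs: Sequence[dict]) -> float:
    """Fraction of inputs that receive an explanation (no abstention)."""
    return sum(dlist.predict(x) is not None for x in xs) / len(xs)

# Toy usage: two hypothetical rules and a stand-in black box.
rules = [
    Rule(lambda x: x["income"] > 100_000, 1, "income > 100k"),
    Rule(lambda x: x["defaults"] >= 2, 0, "two or more prior defaults"),
]
dlist = PartialDecisionList(rules)
black_box = lambda x: int(x["income"] + 5_000 * x["credit_years"] > 60_000)

print(hybrid_predict({"income": 120_000, "defaults": 0, "credit_years": 3},
                     dlist, black_box))  # covered by the first rule
print(hybrid_predict({"income": 40_000, "defaults": 0, "credit_years": 8},
                     dlist, black_box))  # abstains -> black-box fallback
```

The theoretical result in the abstract concerns how such a list is constructed: among lists of comparable length, the algorithm trades the composite system's mistakes against the coverage the list achieves.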
Original language | English
---|---
Pages (from-to) | 590-613
Number of pages | 24
Journal | Proceedings of Machine Learning Research
Volume | 237
State | Published - 2024
Event | 35th International Conference on Algorithmic Learning Theory, ALT 2024 - La Jolla, United States (25 Feb 2024 - 28 Feb 2024)
Funding
Funders | Funder number
---|---
Yandex Initiative for Machine Learning |
CMU Software Engineering Institute |
Center for Machine Learning and Health, School of Computer Science, Carnegie Mellon University |
Israel Science Foundation |
Medical Center, University of Pittsburgh |
Stockholm Environment Institute |
Simons Institute for the Theory of Computing, University of California Berkeley |
Tel Aviv University |
European Commission |
ACMI |
National Science Foundation | FAI 2040929, IIS-2211955
U.S. Department of Defense | FA8702-15-D-0002
Horizon 2020 Framework Programme | 882396
Keywords
- Decision List
- Interpretability
- Partially Interpretable Models