Framework for Evaluating Faithfulness of Local Explanations

Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz

Research output: Contribution to journal › Conference article › Peer-review

Abstract

We study the faithfulness of an explanation system to the underlying prediction model. We show that this can be captured by two properties, consistency and sufficiency, and introduce quantitative measures of the extent to which these hold. Interestingly, these measures depend on the test-time data distribution. For a variety of existing explanation systems, such as anchors, we analytically study these quantities. We also provide estimators and sample complexity bounds for empirically determining the faithfulness of black-box explanation systems. Finally, we experimentally validate the new properties and estimators.
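To make the consistency property concrete, the sketch below gives a minimal Monte Carlo illustration of one plausible reading of it: pairs of points that receive the same explanation should receive the same prediction. This is a hypothetical toy, not the paper's formal definition or its sample-complexity-backed estimator; the `model`, `explainer` (a coarse anchor-style rule), and data distribution are all illustrative stand-ins.

```python
import random

# Hypothetical toy setup (not from the paper): a threshold classifier
# on the unit square, and an anchor-style explainer that returns a
# coarse rule identifier (the quadrant containing x).
def model(x):
    return 1 if x[0] + x[1] > 1.0 else 0

def explainer(x):
    return (x[0] > 0.5, x[1] > 0.5)

def estimate_consistency(model, explainer, sample, n_pairs=20000, seed=0):
    """Monte Carlo estimate of consistency: among pairs drawn from the
    data distribution that receive the same explanation, the fraction
    on which the model's predictions agree."""
    rng = random.Random(seed)
    agree, total = 0, 0
    for _ in range(n_pairs):
        x, y = rng.choice(sample), rng.choice(sample)
        if explainer(x) == explainer(y):
            total += 1
            agree += int(model(x) == model(y))
    return agree / total if total else float("nan")

rng = random.Random(1)
sample = [(rng.random(), rng.random()) for _ in range(500)]
print(estimate_consistency(model, explainer, sample))
```

Because the explanation partitions the square into quadrants while the decision boundary is a diagonal, the two mixed quadrants contain disagreeing pairs, so the estimate lands strictly between 0 and 1; a finer explainer would push it toward 1. Note also that the estimate depends on `sample`, echoing the abstract's point that faithfulness measures depend on the test-time data distribution.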

Original language: English
Pages (from-to): 4794-4815
Number of pages: 22
Journal: Proceedings of Machine Learning Research
Volume: 162
State: Published - 2022
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: 17 Jul 2022 - 23 Jul 2022
