Optimizing Counterfactual-based Analysis of Machine Learning Models Through Databases

Aviv Ben Arie, Daniel Deutch, Nave Frost, Yair Horesh, Idan Meyuhas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In the context of Machine Learning models, counterfactuals (CFs) are hypothetical perturbations to a given input of the model that would result in a different classification outcome. Multiple lines of recent work have proposed algorithms for finding CFs (hereafter referred to as CF Generators) and demonstrated their value in providing insights for model owners. However, obtaining these insights may be computationally expensive, often requiring many invocations of these algorithms with complex constraints. In this work, we complement these efforts by presenting CFDB: a relational, declarative framework for CF-based analysis. Users of CFDB specify analysis tasks as declarative queries over a relational schema tailored for CFs. CFDB then compiles the specification into a series of CF requests, to be fed as input to CF Generators. The main advantage of this approach is that it makes it possible to optimize the tradeoff between CF generation time and quality. Specifically, our optimizations are based on the observation that one may often satisfy multiple CF requests using the same CFs, thereby reducing the total number of costly CF Generator invocations. We design algorithms that identify when such reuse is possible and optimize the computation accordingly. We experimentally demonstrate the usefulness of our approach and our optimizations in the context of multiple datasets, multiple previously proposed CF Generators, and use cases such as assessing model fairness.

Original language: English
Title of host publication: Advances in Database Technology - EDBT
Publisher: OpenProceedings.org
Pages: 597-609
Number of pages: 13
Edition: 3
ISBN (Electronic): 9783893180912, 9783893180943, 9783893180950
State: Published - 18 Mar 2024
Event: 27th International Conference on Extending Database Technology, EDBT 2024 - Paestum, Italy
Duration: 25 Mar 2024 → 28 Mar 2024

Publication series

Name: Advances in Database Technology - EDBT
Number: 3
Volume: 27
ISSN (Electronic): 2367-2005

Conference

Conference: 27th International Conference on Extending Database Technology, EDBT 2024
Country/Territory: Italy
City: Paestum
Period: 25/03/24 → 28/03/24

Funding

Funder: Funder number
European Research Council
Intuit University Collaboration Program
Horizon 2020 Framework Programme: 804302
Horizon 2020 Framework Programme
