TY - GEN
T1 - LearnShapley
AU - Arad, Dana
AU - Deutch, Daniel
AU - Frost, Nave
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
AB - To explain query results, a recent line of work has proposed to leverage the game-theoretic notion of Shapley values to quantify the contribution of each input fact to each result. Despite significant recent breakthroughs improving the complexity of computing Shapley values in query answering, the computation remains quite costly. To this end, we propose an approach that aims at ranking input facts based on their (hidden) Shapley values. Our method utilizes a repository of queries over the same database for which we store exact Shapley values. Intuitively, some queries bear similarity in the ways they transform data, and consequently in the contribution of database facts to their outputs. In this manner, given a new query and a query result, we can learn and predict the ranking of contributing facts. Our contributions are threefold. First, we introduce DBShap, a curated dataset of queries and query results, along with the contributing facts and respective Shapley values. Second, we define the task of predicting the ranking of fact contributions with respect to a query and query result. Finally, we propose a solution for the prediction task based on BERT.
KW - language model
KW - machine learning
KW - shapley value
UR - http://www.scopus.com/inward/record.url?scp=85140822929&partnerID=8YFLogxK
U2 - 10.1145/3511808.3557204
DO - 10.1145/3511808.3557204
M3 - Conference contribution
AN - SCOPUS:85140822929
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4788
EP - 4792
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 17 October 2022 through 21 October 2022
ER -