OREO: Detection of Cherry-picked Generalizations

Yin Lin, H. V. Jagadish, Brit Youngmann, Tova Milo, Yuval Moskovitch

Research output: Contribution to journal › Conference article › peer-review

Abstract

Data analytics often make sense of large data sets by generalization: aggregating from the detailed data to a more general context. Given a dataset, misleading generalizations can sometimes be drawn from a cherry-picked level of aggregation to obscure substantial subgroups that oppose the generalization. Our goal is to detect and explain cherry-picked generalizations by refining the corresponding aggregate queries. We demonstrate OREO, a system to compute a support score of the given statement to quantify the quality of the generalization; that is, whether the aggregated result is an accurate reflection of the data. To better understand the resulting score, our system also identifies significant counterexamples and alternative statements that better represent the data at hand. We will demonstrate the utility of OREO for investigating generalizations, by interacting with the VLDB’22 participants who will use the OREO interface for statement validation and explanation.

Original language: English
Pages (from-to): 3570-3573
Number of pages: 4
Journal: Proceedings of the VLDB Endowment
Volume: 15
Issue number: 12
State: Published - 2022
Event: 48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia
Duration: 5 Sep 2022 to 9 Sep 2022
