TY - CONF
T1 - Approximated summarization of data provenance
AU - Ainy, Eleanor
AU - Bourhis, Pierre
AU - Davidson, Susan B.
AU - Deutch, Daniel
AU - Milo, Tova
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/10/17
Y1 - 2015/10/17
N2 - Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand how the resulting information was derived. Data provenance has proven helpful in this respect; however, maintaining and presenting the full and exact provenance information may be infeasible due to its size and complexity. We therefore introduce the notion of approximated summarized provenance, which provides a compact representation of the provenance at the possible cost of information loss. Based on this notion, we present a novel provenance summarization algorithm which, based on the semantics of the underlying data and the intended use of provenance, outputs a summary of the input provenance. Experiments measure the conciseness and accuracy of the resulting provenance summaries, and improvement in provenance usage time.
KW - Crowd-sourcing applications
KW - Provenance
KW - Provisioning
UR - http://www.scopus.com/inward/record.url?scp=84958232167&partnerID=8YFLogxK
U2 - 10.1145/2806416.2806429
DO - 10.1145/2806416.2806429
M3 - Conference contribution
AN - SCOPUS:84958232167
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 483
EP - 492
BT - CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Y2 - 19 October 2015 through 23 October 2015
ER -