As research products expand to include structured datasets, the challenge arises of how to automatically generate citations to the results of arbitrary queries against such datasets. Previous work explored this problem in the context of conjunctive queries and views using a Rewriting-Based Model (RBM). However, an increasing number of scientific queries are aggregate, e.g., statistical summaries of the underlying data, for which the RBM cannot be easily extended. In this paper, we show how a Provenance-Based Model (PBM) can be leveraged to 1) generate citations to conjunctive as well as aggregate queries and views; 2) associate citations with individual result tuples to enable arbitrary subsets of the result set to be cited (fine-grained citations); and 3) be optimized to return citations in acceptable time. Our implementation of PBM in ProvCite shows that it not only handles a larger class of queries and views than RBM, but can outperform it when restricted to conjunctive views in some cases.
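The idea of fine-grained, provenance-based citation for an aggregate query can be illustrated with a minimal sketch. This is not ProvCite's implementation: the `GENES` relation, the `CITATION_OF` mapping, and the functions `avg_length_by_family` and `cite` are all hypothetical names and data invented for illustration. The sketch assumes each base tuple has an identifier serving as its provenance token and that a citation is already known per base tuple.

```python
from collections import defaultdict

# Hypothetical base relation; each tuple's "id" doubles as its provenance token.
GENES = [
    {"id": "g1", "family": "kinase", "length": 120},
    {"id": "g2", "family": "kinase", "length": 80},
    {"id": "g3", "family": "protease", "length": 200},
]

# Hypothetical mapping from base tuples to citations (an assumption, not from the paper).
CITATION_OF = {
    "g1": "Curator A (2016)",
    "g2": "Curator B (2017)",
    "g3": "Curator A (2016)",
}


def avg_length_by_family(rows):
    """Group-by average that records which base tuples contributed to each result tuple."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["family"]].append(row)
    result = []
    for family, members in sorted(groups.items()):
        result.append({
            "family": family,
            "avg_length": sum(m["length"] for m in members) / len(members),
            # Provenance of the aggregate tuple: the set of contributing base tuples.
            "provenance": frozenset(m["id"] for m in members),
        })
    return result


def cite(result_rows):
    """Cite any subset of result tuples via the union of their provenance."""
    return sorted({CITATION_OF[t] for r in result_rows for t in r["provenance"]})


result = avg_length_by_family(GENES)
```

Because provenance is attached to each result tuple, `cite([result[0]])` covers only the tuples behind the first group, while `cite(result)` covers the whole result set.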
|Number of pages||14|
|Journal||Proceedings of the VLDB Endowment|
|State||Published - 2018|
|Event||45th International Conference on Very Large Data Bases, VLDB 2019 - Los Angeles, United States|
Duration: 26 Aug 2019 → 30 Aug 2019