A model for fine-grained data citation

Susan B. Davidson, Daniel Deutch, Tova Milo, Gianmaria Silvello

Research output: Contribution to conferencePaperpeer-review

Abstract

An increasing amount of information is being collected in structured, evolving, curated databases, driving the question of how information extracted from such datasets via queries should be cited. Unlike traditional research products, such books and journals, which have a fixed granularity, data citation is a challenge because the granularity varies. Different portions of the database, with varying granularity, may have different citations. Furthermore, there are an infinite number of queries over a database, each accessing and generating different subsets of the database, so we cannot hope to explicitly attach a citation to every possible result set and/or query. We present the novel problem of automatically generating citations for general queries over a relational database, and explore a solution based on a set of citation views, each of which attaches a citation to a view of the database. Citation views are then used to automatically construct citations for general queries. Our approach draws inspiration from results in two areas, query rewriting using views and database provenance and combines them in a robust model. We then discuss open issues in developing a practical solution to this challenging problem.

Original languageEnglish
StatePublished - 2017
Event8th Biennial Conference on Innovative Data Systems Research, CIDR 2017 - Santa Cruz, United States
Duration: 8 Jan 201711 Jan 2017

Conference

Conference8th Biennial Conference on Innovative Data Systems Research, CIDR 2017
Country/TerritoryUnited States
CitySanta Cruz
Period8/01/1711/01/17

Fingerprint

Dive into the research topics of 'A model for fine-grained data citation'. Together they form a unique fingerprint.

Cite this