Privacy by diversity in sequential releases of databases

Erez Shmueli*, Tamir Tassa

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

35 Scopus citations

Abstract

We study the problem of privacy preservation in sequential releases of databases. In this scenario, several releases of the same table are published over a period of time, and each release contains a different set of the table's attributes, as dictated by the purpose of the release. The goal is to protect the private information from adversaries who examine the entire sequence of releases. This scenario was studied in [32] and further investigated in [28]. We revisit their privacy definitions and suggest a significantly stronger adversarial assumption and privacy definition. We then present a sequential anonymization algorithm that achieves ℓ-diversity. The algorithm exploits the fact that different releases may include different attributes in order to reduce the information loss that the anonymization entails. Unlike the previous algorithms, ours is perfectly scalable, since the runtime to compute the anonymization of each release is independent of the number of previous releases. In addition, we consider the fully dynamic setting, in which the releases differ in the set of tuples as well as in the set of attributes. The advantages of our approach are demonstrated by extensive experimentation.
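
As a rough illustration of the ℓ-diversity notion mentioned in the abstract, the sketch below checks the simplest (distinct) form of ℓ-diversity on a single release: every group of rows that share the same generalized quasi-identifier values must contain at least ℓ distinct sensitive values. It does not reproduce the paper's sequential privacy definition or its algorithm. The function name `is_l_diverse`, the column names, and the sample rows are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """Check distinct l-diversity of one release (illustrative assumption,
    not the paper's sequential definition).

    rows: list of dicts, one per published tuple.
    quasi_identifiers: attribute names an adversary could link on.
    sensitive: name of the private attribute.
    l: required number of distinct sensitive values per group.
    """
    groups = defaultdict(set)
    for row in rows:
        # Rows with identical quasi-identifier values are indistinguishable.
        key = tuple(row[a] for a in quasi_identifiers)
        groups[key].add(row[sensitive])
    # Every group must expose at least l distinct sensitive values.
    return all(len(values) >= l for values in groups.values())


# Hypothetical generalized release with two groups of two rows each.
release = [
    {"age": "30-39", "zip": "123**", "disease": "flu"},
    {"age": "30-39", "zip": "123**", "disease": "asthma"},
    {"age": "40-49", "zip": "456**", "disease": "flu"},
    {"age": "40-49", "zip": "456**", "disease": "diabetes"},
]

two_diverse = is_l_diverse(release, ["age", "zip"], "disease", 2)
three_diverse = is_l_diverse(release, ["age", "zip"], "disease", 3)
```

In this toy release each group holds two distinct diseases, so the check passes for ℓ = 2 and fails for ℓ = 3. A per-release check of this kind does not cover the adversary considered in the abstract, who examines all releases jointly.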

Original language: English
Pages (from-to): 344-372
Number of pages: 29
Journal: Information Sciences
Volume: 298
State: Published - 20 Mar 2015

Keywords

  • Anonymization
  • Continuous data publishing
  • Diversity
  • Multipartite graphs
  • Privacy preserving data publishing
  • Sequential release
