Constrained obfuscation of relational databases

Erez Shmueli*, Tomer Zrihen, Ran Yahalom, Tamir Tassa

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The need to share data often conflicts with privacy preservation. Data obfuscation attempts to overcome this conflict by modifying the original data while optimizing both privacy and utility measures. In this paper we introduce the concept of Constrained Obfuscation Problems (COPs), which formulate the task of obfuscating data stored in relational databases. The main idea behind COPs is that many obfuscation scenarios can be modeled as a data generation process that is constrained by a predefined set of rules. We demonstrate the flexibility of the COP definition by modeling several different obfuscation scenarios: Production Data Obfuscation for Application Testing (PDOAT), anonymization of relational data, and anonymization of social networks. We then suggest a general approach for solving COPs by reducing them to a set of Constraint Satisfaction Problems (CSPs). This reduction makes it possible to use the well-studied CSP framework to handle a wide range of complex rules. Some of the resulting CSPs may contain a large number of variables, which may make them intractable. To overcome this intractability, we present two heuristics that decompose such large CSPs into smaller, tractable sub-CSPs. We also show how the well-known ℓ-diversity privacy measure can be incorporated into the COP framework to evaluate the privacy level of COP solutions. Finally, we evaluate the new method in terms of privacy, utility and execution time.
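
To make the idea of "data generation constrained by rules" concrete, the following is a minimal sketch of how a tiny obfuscation task could be posed as a CSP and solved by plain backtracking. It is not the paper's algorithm or its decomposition heuristics. The toy table, the value domain, the two rules (each value must differ from the original, and the ordering between rows must be preserved) and the helper name `solve_csp` are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Assignment = Dict[str, int]
# A constraint is the tuple of variables it mentions plus a predicate over their values.
Constraint = Tuple[Tuple[str, ...], Callable[..., bool]]


def solve_csp(
    variables: Sequence[str],
    domains: Dict[str, List[int]],
    constraints: List[Constraint],
) -> Optional[Assignment]:
    """Chronological backtracking: return the first consistent assignment, or None."""

    def consistent(assignment: Assignment) -> bool:
        # Only check constraints whose variables have all been assigned so far.
        for scope, predicate in constraints:
            if all(v in assignment for v in scope):
                if not predicate(*(assignment[v] for v in scope)):
                    return False
        return True

    def backtrack(assignment: Assignment) -> Optional[Assignment]:
        if len(assignment) == len(variables):
            return dict(assignment)
        var = variables[len(assignment)]  # fixed variable order, no heuristics
        for value in domains[var]:
            assignment[var] = value
            if consistent(assignment):
                result = backtrack(assignment)
                if result is not None:
                    return result
            del assignment[var]  # undo and try the next value
        return None

    return backtrack({})


# Hypothetical toy table: one sensitive "salary" cell per row (assumed data).
original = {"row1": 30, "row2": 45, "row3": 60}
variables = list(original)
domains = {v: list(range(20, 71, 5)) for v in variables}

constraints: List[Constraint] = []
# Privacy-style rule (assumed): each generated value must differ from the original.
for v in variables:
    constraints.append(((v,), lambda x, orig=original[v]: x != orig))
# Utility-style rule (assumed): the ordering between rows must be preserved.
constraints.append((("row1", "row2"), lambda a, b: a < b))
constraints.append((("row2", "row3"), lambda a, b: a < b))

obfuscated = solve_csp(variables, domains, constraints)
print(obfuscated)
```

With one variable per cell, the number of variables grows with the size of the table, which is the kind of blow-up the abstract refers to when it motivates decomposing large CSPs into smaller sub-CSPs.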

Original language: English
Pages (from-to): 35-62
Number of pages: 28
Journal: Information Sciences
Volume: 286
State: Published - 1 Dec 2014
Externally published: Yes

Keywords

  • Constraint Satisfaction Problem
  • ℓ-Diversity
  • Obfuscation
  • Privacy Preserving Data Publishing
