Characterizing the sample complexity of pure private learners

Amos Beimel, Kobbi Nissim, Uri Stemmer

Research output: Contribution to journalArticlepeer-review

16 Scopus citations


Kasiviswanathan et al. (FOCS 2008) defined private learning as a combination of PAC learning and differential privacy. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. left open the question of characterizing the sample complexity of private learners. We give a combinatorial characterization of the sample size sufficient and necessary to learn a class of concepts under pure differential privacy. This characterization is analogous to the well known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C) = O(m), and that there exists a private learning algorithm with sample complexity m = O(RepDim(C)). We further demonstrate that a similar characterization holds for the database size needed for computing a large class of optimization problems under pure differential privacy, and also for the well studied problem of private data release.

Original languageEnglish
JournalJournal of Machine Learning Research
StatePublished - 1 Sep 2019
Externally publishedYes


FundersFunder number
Ben Gurion University
Google Ltd
National Sleep Foundation1565387
Israel Science Foundation152/17


    • Differential privacy
    • PAC learning
    • Sample complexity


    Dive into the research topics of 'Characterizing the sample complexity of pure private learners'. Together they form a unique fingerprint.

    Cite this