The limits of post-selection generalization

Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman

Research output: Contribution to journal › Conference article › peer-review

Abstract

While statistics and machine learning offer numerous methods for ensuring generalization, these methods often fail in the presence of post selection: the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general-purpose algorithms that ensure a property called post hoc generalization (Cummings et al., COLT'16), which says that no person, when given the output of the algorithm, should be able to find any statistic for which the data differ significantly from the population they came from. In this work we show several limitations on the power of algorithms satisfying post hoc generalization. First, we show a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, establishing a strong barrier to progress in post-selection data analysis. Second, we show that post hoc generalization is not closed under composition, despite many examples of such algorithms exhibiting strong composition properties.
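The failure mode the abstract describes can be made concrete. The following sketch (not from the paper; all names and parameters are illustrative) answers statistical queries with naive empirical means on pure noise, then adaptively builds a second-round query from the first-round answers. The combined query looks highly significant on the sample even though its population value is exactly zero:

```python
import numpy as np

# Hypothetical demo of post-selection overfitting: the data carry no
# signal, yet a query chosen *after* seeing earlier answers appears
# significant when evaluated naively on the same sample.
rng = np.random.default_rng(0)
n, d = 100, 1000                              # n samples, d attributes
X = rng.choice([-1.0, 1.0], size=(n, d))      # attributes: pure noise
y = rng.choice([-1.0, 1.0], size=n)           # labels drawn independently of X

# Round 1: d statistical queries, one per attribute j: mean of x_j * y.
corr = (X * y[:, None]).mean(axis=0)

# Round 2 (adaptive): query the sign-weighted sum of exactly those
# attributes whose round-1 answer cleared a ~2/sqrt(n) noise threshold.
selected = np.abs(corr) > 2 / np.sqrt(n)
score = (X[:, selected] * np.sign(corr[selected])).sum(axis=1)
empirical = (score * y).mean()

# The population value of this round-2 query is 0 (y is independent of X),
# so the entire empirical answer is overfitting induced by adaptivity.
print(f"selected {selected.sum()} attributes, empirical answer {empirical:.2f}")
```

A mechanism satisfying post hoc generalization would have to prevent gaps like this; the paper's first result lower-bounds the error any such mechanism must pay to do so.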

Original language: English
Pages (from-to): 6400-6409
Number of pages: 10
Journal: Advances in Neural Information Processing Systems
Volume: 2018-December
State: Published - 2018
Externally published: Yes
Event: 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada
Duration: 2 Dec 2018 - 8 Dec 2018
