Abstract
Differentially private algorithms for common metric aggregation tasks, such as clustering or averaging, often have limited practicality due to their complexity or to the large number of data points required for accurate results. We propose a simple and practical tool, FriendlyCore, that takes a set of points D from an unrestricted (pseudo) metric space as input. When D has effective diameter r, FriendlyCore returns a “stable” subset C ⊆ D that includes all points, except possibly a few outliers, and is guaranteed to have diameter r. FriendlyCore can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy. Surprisingly, FriendlyCore is lightweight, with no dependence on the dimension. We empirically demonstrate its advantages in boosting the accuracy of mean estimation and of clustering tasks such as k-means and k-GMM, outperforming tailored methods.
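For intuition, the sketch below illustrates the “extract a core, then aggregate” paradigm the abstract describes, in plain (non-private) Python. The pairwise-distance filtering rule, the radius `r`, and the threshold `frac` are illustrative assumptions only; they are not the actual FriendlyCore procedure, which relies on randomized, differentially private steps.

```python
# Illustrative, NON-PRIVATE sketch of core-then-aggregate preprocessing.
# The filtering rule and parameters (r, frac) are assumptions for illustration;
# this is not the FriendlyCore algorithm itself.
import numpy as np

def simple_core(points: np.ndarray, r: float, frac: float = 0.5) -> np.ndarray:
    """Keep points that lie within distance r of at least a `frac` fraction of the data."""
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    friendly = (dists <= r).mean(axis=1) >= frac
    return points[friendly]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(0.0, 1.0, size=(500, 2))      # well-clustered points
    outliers = rng.uniform(-50, 50, size=(5, 2))    # a few far-away outliers
    D = np.vstack([data, outliers])

    C = simple_core(D, r=4.0)                       # drop the outliers, keep the "core"
    print("kept", len(C), "of", len(D), "points")
    print("mean of core:", C.mean(axis=0))          # aggregate only the stable subset
```

In the actual pipeline, the aggregation step (here a plain mean) would be replaced by a differentially private aggregator applied to the returned core.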
| Original language | English |
|---|---|
| Pages (from-to) | 21828–21863 |
| Number of pages | 36 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 162 |
| State | Published - 2022 |
| Event | 39th International Conference on Machine Learning, ICML 2022, Baltimore, United States, 17 Jul 2022 – 23 Jul 2022 |
Funding
| Funders | Funder number |
|---|---|
| Blavatnik Family Foundation | |
| European Research Council | |
| European Union’s Horizon 2020 research and innovation program | 882396 |
| Israel Science Foundation | 1871/19, 1595-19, 993/17 |
| Tel Aviv University | |
| Yandex Initiative for Machine Learning | |