Gaussian lower bound for the information bottleneck limit

Amichai Painsky, Naftali Tishby

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


The Information Bottleneck (IB) is a conceptual method for extracting the most compact, yet informative, representation of a set of variables with respect to a target. It generalizes the notion of minimal sufficient statistics from classical parametric statistics to a broader information-theoretic sense. The IB curve defines the optimal trade-off between the complexity of a representation and its predictive power. Specifically, it is achieved by minimizing the mutual information (MI) between the representation and the original variables, subject to a minimal level of MI between the representation and the target. This problem is shown to be NP-hard in general. One important exception is the multivariate Gaussian case, for which the Gaussian IB (GIB) is known to admit an analytical closed-form solution, similar to Canonical Correlation Analysis (CCA). In this work we introduce a Gaussian lower bound to the IB curve; we find an embedding of the data which maximizes its "Gaussian part", on which we apply the GIB. This embedding provides an efficient (and practical) representation of any arbitrary data-set (in the IB sense), which in addition holds the favorable properties of a Gaussian distribution. Importantly, we show that the optimal Gaussian embedding is bounded from above by non-linear CCA. This establishes a fundamental limit on our ability to Gaussianize arbitrary data-sets and solve complex problems by linear methods.
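To make the CCA connection mentioned in the abstract more concrete, the following is a minimal sketch (not the authors' code) of how sample canonical correlations can be computed and then turned into a mutual information estimate under the assumption that the two variable sets are jointly Gaussian, where I(X;Y) = -1/2 Σ log(1 - ρ_i²) in nats. The function names, the regularization constant `reg`, and the synthetic data are illustrative assumptions.

```python
import numpy as np


def canonical_correlations(x, y, reg=1e-9):
    """Sample canonical correlations of x and y (rows are samples).

    `reg` is a small ridge term (an assumption of this sketch) that keeps
    the covariance estimates positive definite.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    # Whiten each block with its Cholesky factor; the singular values of
    # the whitened cross-covariance are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    whitened = np.linalg.solve(lx, cxy) @ np.linalg.inv(ly).T
    rho = np.linalg.svd(whitened, compute_uv=False)
    # Clip so that log(1 - rho**2) below stays finite.
    return np.clip(rho, 0.0, 1.0 - 1e-12)


def gaussian_mutual_information(rho):
    """MI in nats between jointly Gaussian X and Y, given their canonical correlations."""
    return -0.5 * np.sum(np.log(1.0 - rho ** 2))


if __name__ == "__main__":
    # Synthetic example: a scalar pair with true correlation 0.8.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((100_000, 1))
    y = 0.8 * x + 0.6 * rng.standard_normal((100_000, 1))
    rho = canonical_correlations(x, y)
    print("canonical correlations:", rho)
    print("Gaussian MI estimate (nats):", gaussian_mutual_information(rho))
```

For data that are not Gaussian, this quantity describes only the linear, second-order dependence between the two sets; it should not be read as the mutual information of the data themselves.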

Original language: English
Pages (from-to): 1-29
Number of pages: 29
Journal: Journal of Machine Learning Research
State: Published - 1 Apr 2018
Externally published: Yes


Funders: Israeli Center of Research Excellence in Algorithms


    • ACE
    • Canonical Correlations
    • Gaussianization
    • Infomax
    • Information Bottleneck
    • Mutual Information Maximization

