TY - JOUR
T1 - Nonlinear canonical correlation analysis
T2 - A compressed representation approach
AU - Painsky, Amichai
AU - Feder, Meir
AU - Tishby, Naftali
N1 - Publisher Copyright:
© 2020 by the authors.
PY - 2020/2/1
Y1 - 2020/2/1
N2 - Canonical Correlation Analysis (CCA) is a linear representation learning method that seeks maximally correlated variables in multi-view data. Nonlinear CCA extends this notion to a broader family of transformations, which are more powerful in many real-world applications. Given the joint probability, the Alternating Conditional Expectation (ACE) algorithm provides an optimal solution to the nonlinear CCA problem. However, it suffers from limited performance and an increasing computational burden when only a finite number of samples is available. In this work, we introduce an information-theoretic compressed representation framework for the nonlinear CCA problem (CRCCA), which extends the classical ACE approach. Our suggested framework seeks compact representations of the data that allow a maximal level of correlation. This way, we control the trade-off between the flexibility and the complexity of the model. CRCCA provides theoretical bounds and optimality conditions, as we establish fundamental connections to rate-distortion theory, the information bottleneck and remote source coding. In addition, it allows a soft dimensionality reduction, as the compression level is determined by the mutual information between the original noisy data and the extracted signals. Finally, we introduce a simple implementation of the CRCCA framework, based on lattice quantization.
KW - Alternating conditional expectation
KW - Canonical correlation analysis
KW - Dimensionality reduction
KW - Information bottleneck
KW - Remote source coding
UR - http://www.scopus.com/inward/record.url?scp=85080953649&partnerID=8YFLogxK
U2 - 10.3390/e22020208
DO - 10.3390/e22020208
M3 - Article
C2 - 33285982
AN - SCOPUS:85080953649
SN - 1099-4300
VL - 22
JO - Entropy
JF - Entropy
IS - 2
M1 - 208
ER -