Multi-modal contrastive learning of subcellular organization using DICE

Rami Nasser, Leah V. Schaffer, Trey Ideker, Roded Sharan

Research output: Contribution to journal › Article › peer-review

Abstract

The data deluge in biology calls for computational approaches that can integrate multiple datasets of different types to build a holistic view of biological processes or structures of interest. An emerging paradigm in this domain is the unsupervised learning of data embeddings that can be used for downstream clustering and classification tasks. While such approaches are becoming common for integrating data of similar types, less work has addressed consolidating different data modalities, such as network and image information. Here, we introduce DICE (Data Integration through Contrastive Embedding), a contrastive learning model for multi-modal data integration. We apply this model to study the subcellular organization of proteins by integrating protein-protein interaction data and protein image data measured in HEK293 cells. We demonstrate the advantage of data integration over any single modality and show that our framework outperforms previous integration approaches.

Availability: https://github.com/raminass/protein-contrastive

Contact: [email protected].

Original language: English
Pages (from-to): ii105-ii110
Journal: Bioinformatics
Volume: 40
Issue number: 2
State: Published - 1 Sep 2024
