A Bayesian framework for estimating cell type composition from DNA methylation without the need for methylation reference

Elior Rahmani*, Regev Schweiger, Liat Shenhav, Eleazar Eskin, Eran Halperin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Genome-wide DNA methylation levels measured from a target tissue across a population have become ubiquitous over the last few years, as methylation status is suggested to hold great potential for better understanding the role of epigenetics. Different cell types are known to have different methylation profiles. Therefore, in the common scenario where methylation levels are collected from heterogeneous sources such as blood, convoluted signals are formed according to the cell type composition of the samples. Knowledge of the cell type proportions is important for statistical analysis, and it may provide novel biological insights and contribute to our understanding of disease biology. Since high-resolution cell counting is costly and often logistically impractical to obtain in large studies, targeted methods that are inexpensive and practical for estimating cell proportions are needed. Although a supervised approach has been shown to provide reasonable estimates of cell proportions, this approach leverages scarce reference methylation data from sorted cells, which are not available for most tissues and are not appropriate for every target population. Here, we introduce BayesCCE, a Bayesian semi-supervised method that leverages prior knowledge on the cell type composition distribution in the studied tissue. As we demonstrate, such prior information is substantially easier to obtain than appropriate reference methylation levels from sorted cells. Using real and simulated data, we show that our proposed method is able to construct a set of components, each corresponding to a single cell type, that together provide up to 50% improvement in correlation when compared with existing reference-free methods. We further make a design suggestion for future data collection efforts by showing that results can be further improved using cell count measurements for a small subset of individuals in the study sample or by incorporating external data of individuals with measured cell counts. Our approach provides a new opportunity to investigate cell compositions in genomic studies of tissues for which it was not possible before.
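
The mixing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the BayesCCE implementation: it only simulates bulk methylation as a proportion-weighted average of cell-type-specific profiles, with proportions drawn from a Dirichlet distribution standing in for prior knowledge of the composition distribution. The Dirichlet choice, the function names, and all parameter values are assumptions made for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one vector of cell type proportions from Dirichlet(alpha)."""
    # A Dirichlet draw is a set of independent Gamma draws, normalized to sum to 1.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_bulk_methylation(n_samples, n_sites, alpha, seed=0):
    """Simulate bulk methylation as a proportion-weighted mix of cell type profiles."""
    rng = random.Random(seed)
    n_types = len(alpha)
    # Hypothetical cell-type-specific methylation levels in [0, 1] (types x sites).
    profiles = [[rng.random() for _ in range(n_sites)] for _ in range(n_types)]
    # One proportion vector per individual (assumed prior: Dirichlet(alpha)).
    proportions = [sample_dirichlet(alpha, rng) for _ in range(n_samples)]
    # Each observed level is the convex combination of the per-type levels.
    bulk = [
        [sum(p[k] * profiles[k][j] for k in range(n_types)) for j in range(n_sites)]
        for p in proportions
    ]
    return proportions, profiles, bulk


# Illustrative values only: three cell types, the first one most abundant on average.
proportions, profiles, bulk = simulate_bulk_methylation(
    n_samples=5, n_sites=4, alpha=[8.0, 3.0, 1.0]
)
```

A reference-free or semi-supervised method would observe only `bulk` and try to recover `proportions`; in this sketch the prior enters solely through the assumed `alpha`.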

Original language: English
Title of host publication: Research in Computational Molecular Biology - 21st Annual International Conference, RECOMB 2017, Proceedings
Editors: S. Cenk Sahinalp
Publisher: Springer Verlag
Pages: 207-223
Number of pages: 17
ISBN (Print): 9783319569697
State: Published - 2017
Event: 21st Annual International Conference on Research in Computational Molecular Biology, RECOMB 2017 - Hong Kong, China
Duration: 3 May 2017 - 7 May 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10229 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Annual International Conference on Research in Computational Molecular Biology, RECOMB 2017
Country/Territory: China
City: Hong Kong
Period: 3/05/17 - 7/05/17

Funding

Funders: Funder number
Blavatnik Research Foundation
Edmond J. Safra Center for Bioinformatics
National Science Foundation: 1302448, 1065276, 1320589, 1331176
National Institutes of Health: R01-ES022282, R01-GM083198, R01-MH101782, R01-ES021801, U54EB020403
United States - Israel Binational Science Foundation: 2012304
Cook Family Foundation
Israel Science Foundation: 1425/13
Tel Aviv University

Keywords

• Bayesian model
• Cell type composition
• Cell type proportions
• DNA methylation
• Epigenetics
• Tissue heterogeneity
