Methylation risk scores are associated with a collection of phenotypes within electronic health record systems

Mike Thompson*, Brian L. Hill*, Nadav Rakocz, Jeffrey N. Chiang, Daniel Geschwind, Sriram Sankararaman, Ira Hofer, Maxime Cannesson, Noah Zaitlen, Eran Halperin*

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

15 Scopus citations


Inference of clinical phenotypes is a fundamental task in precision medicine, and has therefore been heavily investigated in recent years in the context of electronic health records (EHR) using a large arsenal of machine learning techniques, as well as in the context of genetics using polygenic risk scores (PRS). In this work, we considered the epigenetic analog of PRS, methylation risk scores (MRS), a linear combination of methylation states. We measured methylation across a large cohort (n = 831) of diverse samples in the UCLA Health biobank, for which both genetic and complete EHR data are available. We constructed MRS for 607 phenotypes spanning diagnoses, clinical lab tests, and medication prescriptions. When added to a baseline set of predictive features, MRS significantly improved the imputation of 139 outcomes, whereas the PRS improved only 22 (median improvement for methylation 10.74%, 141.52%, and 15.46% in medications, labs, and diagnosis codes, respectively, whereas genotypes only improved the labs at a median increase of 18.42%). We added significant MRS to state-of-the-art EHR imputation methods that leverage the entire set of medical records, and found that including MRS as a medical feature in the algorithm significantly improves EHR imputation in 37% of lab tests examined (median R² increase 47.6%). Finally, we replicated several MRS in multiple external studies of methylation (minimum p-value of 2.72 × 10⁻⁷) and replicated 22 of 30 tested MRS internally in two separate cohorts of different ethnicity. Our publicly available results and weights show promise for methylation risk scores as clinical and scientific tools.
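The abstract describes an MRS as a linear combination of methylation states and reports gains as percentage improvements over a baseline. The following is a minimal sketch of those two calculations, not the authors' implementation. The function names, the toy weights, and the use of a simple relative-change formula are assumptions for illustration; the abstract does not state how the weights were fit or exactly how improvement was computed.

```python
import numpy as np


def methylation_risk_score(methylation, weights, intercept=0.0):
    """Score each sample as intercept + sum_j weights[j] * methylation[i, j].

    methylation: array of shape (n_samples, n_sites), e.g. values in [0, 1].
    weights:     array of shape (n_sites,), one hypothetical weight per site.
    """
    methylation = np.asarray(methylation, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if methylation.ndim != 2 or weights.ndim != 1:
        raise ValueError("expected a 2-D methylation matrix and 1-D weights")
    if methylation.shape[1] != weights.shape[0]:
        raise ValueError("number of sites must match number of weights")
    return intercept + methylation @ weights


def relative_improvement_percent(baseline_metric, augmented_metric):
    """Percent change of a performance metric relative to its baseline value.

    Assumption: improvement is (augmented - baseline) / baseline * 100.
    """
    if baseline_metric == 0:
        raise ValueError("baseline metric must be non-zero")
    return 100.0 * (augmented_metric - baseline_metric) / baseline_metric


# Toy example: two samples, three methylation sites (made-up numbers).
toy_methylation = np.array([[0.10, 0.80, 0.50],
                            [0.90, 0.20, 0.40]])
toy_weights = np.array([1.0, -0.5, 2.0])
toy_scores = methylation_risk_score(toy_methylation, toy_weights)
```

In this sketch the score would then be appended as one extra column to the baseline feature matrix before fitting a predictive model, and the model's metric with and without that column compared using `relative_improvement_percent`.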

Original language: English
Article number: 50
Journal: npj Genomic Medicine
Issue number: 1
State: Published - Dec 2022
Externally published: Yes


Funders (funder number):
Merck Pharmaceuticals
National Science Foundation: 1705197
National Institutes of Health: T32HG002536
National Human Genome Research Institute: HG010505-02
Chan Zuckerberg Initiative: U01HG009080, 1I01CX002011, U01MH126798, R01ES029929, 1R01HG011345, CZF2019-002449, R01HL155024, R01MH125252, U01HG012079

