TY - JOUR
T1 - Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework
AU - Zeng, Xiangxiang
AU - Xiang, Hongxin
AU - Yu, Linhui
AU - Wang, Jianmin
AU - Li, Kenli
AU - Nussinov, Ruth
AU - Cheng, Feixiong
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2022/11
Y1 - 2022/11
AB - The clinical efficacy and safety of a drug is determined by its molecular properties and targets in humans. However, proteome-wide evaluation of all compounds in humans, or even animal models, is challenging. In this study, we present an unsupervised pretraining deep learning framework, named ImageMol, pretrained on 10 million unlabelled drug-like, bioactive molecules, to predict molecular targets of candidate compounds. The ImageMol framework is designed to pretrain chemical representations from unlabelled molecular images on the basis of local and global structural characteristics of molecules from pixels. We demonstrate high performance of ImageMol in evaluation of molecular properties (that is, the drug’s metabolism, brain penetration and toxicity) and molecular target profiles (that is, beta-secretase enzyme and kinases) across 51 benchmark datasets. ImageMol shows high accuracy in identifying anti-SARS-CoV-2 molecules across 13 high-throughput experimental datasets from the National Center for Advancing Translational Sciences. Via ImageMol, we identified candidate clinical 3C-like protease inhibitors for potential treatment of COVID-19.
UR - http://www.scopus.com/inward/record.url?scp=85142108735&partnerID=8YFLogxK
U2 - 10.1038/s42256-022-00557-6
DO - 10.1038/s42256-022-00557-6
M3 - Article
AN - SCOPUS:85142108735
SN - 2522-5839
VL - 4
SP - 1004
EP - 1016
JO - Nature Machine Intelligence
JF - Nature Machine Intelligence
IS - 11
ER -