TY - JOUR
T1 - Graph embedding and Gaussian mixture variational autoencoder network for end-to-end analysis of single-cell RNA sequencing data
AU - Xu, Junlin
AU - Xu, Jielin
AU - Meng, Yajie
AU - Lu, Changcheng
AU - Cai, Lijun
AU - Zeng, Xiangxiang
AU - Nussinov, Ruth
AU - Cheng, Feixiong
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2023/1/23
Y1 - 2023/1/23
N2 - Single-cell RNA sequencing (scRNA-seq) is a revolutionary technology to determine the precise gene expression of individual cells and identify cell heterogeneity and subpopulations. However, technical limitations of scRNA-seq lead to heterogeneous and sparse data. Here, we present autoCell, a deep-learning approach for scRNA-seq dropout imputation and feature extraction. autoCell is a variational autoencoding network that combines graph embedding and a probabilistic depth Gaussian mixture model to infer the distribution of high-dimensional, sparse scRNA-seq data. We validate autoCell on simulated datasets and biologically relevant scRNA-seq. We show that interpolation of autoCell improves the performance of existing tools in identifying cell developmental trajectories of human preimplantation embryos. We identify disease-associated astrocytes (DAAs) and reconstruct DAA-specific molecular networks and ligand-receptor interactions involved in cell-cell communications using Alzheimer's disease as a prototypical example. autoCell provides a toolbox for end-to-end analysis of scRNA-seq data, including visualization, clustering, imputation, and disease-specific gene network identification.
KW - Alzheimer's disease
KW - CP: systems biology
KW - deep learning
KW - disease-associated astrocyte
KW - scRNA-seq
KW - single cell/nuclei
KW - variational autoencoding network
UR - http://www.scopus.com/inward/record.url?scp=85148287949&partnerID=8YFLogxK
U2 - 10.1016/j.crmeth.2022.100382
DO - 10.1016/j.crmeth.2022.100382
M3 - Article
C2 - 36814845
AN - SCOPUS:85148287949
SN - 2667-2375
VL - 3
JO - Cell Reports Methods
JF - Cell Reports Methods
IS - 1
M1 - 100382
ER -