Batch correction of single-cell sequencing data via an autoencoder architecture

Reut Danino, Iftach Nachman, Roded Sharan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: Technical differences between gene expression sequencing experiments can cause variations in the data in the form of batch effect biases. These do not represent true biological variations between samples and can lead to false conclusions or hinder the ability to integrate multiple datasets. Since there is a growing need for the joint analysis of single-cell sequencing datasets from different sources, there is also a need to correct the resulting batch effects while maintaining the true biological variations in the data.

Results: We developed a semi-supervised deep learning architecture called Autoencoder-based Batch Correction (ABC) for integrating single-cell sequencing datasets. Our method removes batch effects through a guided process of data compression using supervised cell type classifier branches for biological signal retention. It aligns the different batches using an adversarial training approach. We comprehensively evaluate the performance of our method using four single-cell sequencing datasets and multiple measures for batch effect removal and biological variation conservation. ABC outperforms 10 state-of-the-art methods for this task including Seurat, scGen, ComBat, scanorama, scVI, scANVI, AutoClass, Harmony, scDREAMER, and CLEAR, correcting various types of batch effects while preserving intricate biological variations.
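The abstract names three ingredients (compression by an autoencoder, supervised cell type classifier branches, and adversarial batch alignment) but does not state the training objective. The sketch below shows one generic way such terms are often combined into a single per-cell loss for the encoder. It is an illustration under assumptions, not ABC's actual loss: the function names, the weights `w_type` and `w_adv`, the use of mean squared error and cross-entropy, and the handling of unlabeled cells are all hypothetical.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def combined_loss(x, x_hat, type_probs, cell_type, batch_probs, batch,
                  w_type=1.0, w_adv=1.0):
    """Hypothetical per-cell encoder objective (assumed form, not from the paper).

    x, x_hat     : input expression vector and its reconstruction
    type_probs   : classifier branch output over cell types
    cell_type    : true cell type index, or None for an unlabeled cell
    batch_probs  : batch discriminator output over batches
    batch        : true batch index
    """
    recon = mse(x, x_hat)
    # Semi-supervised assumption: unlabeled cells contribute no classifier term.
    type_term = 0.0 if cell_type is None else cross_entropy(type_probs, cell_type)
    adv_term = cross_entropy(batch_probs, batch)
    # The minus sign rewards the encoder when the discriminator cannot
    # identify the batch, which is the usual adversarial setup.
    return recon + w_type * type_term - w_adv * adv_term


# A cell whose batch the discriminator cannot guess (uniform output).
confused = combined_loss([1.0, 0.0], [0.5, 0.0], [0.5, 0.5], 0, [0.5, 0.5], 1)
# The same cell when the discriminator identifies its batch confidently.
detected = combined_loss([1.0, 0.0], [0.5, 0.0], [0.5, 0.5], 0, [0.1, 0.9], 1)
```

In this toy setting `detected` is larger than `confused`: under the assumed sign convention, the encoder is penalized when the batch remains identifiable from the discriminator's input.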

Original language: English
Article number: vbad186
Journal: Bioinformatics Advances
Volume: 4
Issue number: 1
State: Published - 2024

Funding

Funder: IPMP (funder number: 2417/20)
Funder: Israel Science Foundation
