Deep feature extraction of single-cell transcriptomes by generative adversarial network
Motivation: Single-cell RNA-sequencing (scRNA-seq) offers the opportunity to dissect heterogeneous
cellular compositions and interrogate the cell-type-specific gene expression patterns across diverse
conditions. However, batch effects, such as differences in laboratory conditions and inter-individual
variability, hinder the use of scRNA-seq in cross-condition designs.
Results: Here, we present a single-cell Generative Adversarial Network (scGAN) to simultaneously
acquire patterns from raw data while minimizing the confounding effect driven by technical artifacts or
other factors inherent to the data. Specifically, scGAN models the data likelihood of the raw scRNA-seq
counts by projecting each cell onto a latent embedding. Meanwhile, scGAN attempts to minimize the
correlation between the latent embeddings and the batch labels across all cells. We demonstrate scGAN
on three public scRNA-seq datasets and show that our method outperforms state-of-the-art methods in
forming clusters of known cell types and in identifying known psychiatric genes that are associated
with major depressive disorder.
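
The two competing goals described above (fitting the raw counts through a latent embedding while making batch labels hard to recover from that embedding) can be illustrated with a minimal sketch. This is not the scGAN implementation: the linear encoder and decoder, the Poisson count likelihood, the softmax batch discriminator, the weight `lam`, and all function and variable names are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def poisson_nll(counts, rate):
    # Poisson negative log-likelihood per cell, dropping the constant log(counts!) term.
    # Assumption: a Poisson likelihood stands in for whatever count model the method uses.
    return float(np.mean(np.sum(rate - counts * np.log(rate + 1e-8), axis=1)))

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def batch_cross_entropy(logits, batch_labels):
    # Cross-entropy of a discriminator that predicts the batch label of each cell.
    probs = softmax(logits)
    picked = probs[np.arange(len(batch_labels)), batch_labels]
    return float(-np.mean(np.log(picked + 1e-8)))

def encoder_objective(counts, batch_labels, W_enc, W_dec, W_disc, lam=1.0):
    # Hypothetical linear encoder: project log-transformed counts onto a latent embedding.
    z = np.log1p(counts) @ W_enc
    # Hypothetical linear decoder with an exponential link so the rates stay positive.
    rate = np.exp(z @ W_dec)
    recon = poisson_nll(counts, rate)
    # The discriminator tries to recover batch labels from the embedding.
    disc = batch_cross_entropy(z @ W_disc, batch_labels)
    # The encoder wants a low reconstruction loss AND a high discriminator loss,
    # i.e. an embedding that carries little information about the batch.
    return recon - lam * disc, recon, disc
```

In an adversarial training loop, the discriminator weights would be updated to lower `disc`, while the encoder and decoder would be updated to lower `recon - lam * disc`; the trade-off weight `lam` is a hypothetical hyperparameter in this sketch.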