scSMD: a deep learning method for accurate clustering of single cells based on auto-encoder

Bibliographic Details
Main Authors: Xiaoxu Cui, Renkai Wu, Yinghao Liu, Peizhan Chen, Qing Chang, Pengchen Liang, Changyu He
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BMC Bioinformatics
Online Access: https://doi.org/10.1186/s12859-025-06047-x
Description
Summary: Abstract
Background: Single-cell RNA sequencing (scRNA-seq) has transformed biological research by offering new insights into cellular heterogeneity, developmental processes, and disease mechanisms. As scRNA-seq technology advances, its role in modern biology has become increasingly vital. This study explores the application of deep learning to single-cell data clustering, with a particular focus on managing sparse, high-dimensional data.
Results: We propose the SMD deep learning model, which integrates nonlinear dimensionality reduction techniques with a porous dilated attention gate component. Built upon a convolutional autoencoder and informed by the negative binomial distribution, the SMD model efficiently captures essential cell clustering features and dynamically adjusts feature weights. Comprehensive evaluation on both public datasets and proprietary osteosarcoma data highlights the SMD model’s efficacy in achieving precise classifications for single-cell data clustering, showcasing its potential for advanced transcriptomic analysis.
Conclusion: This study underscores the potential of deep learning, specifically the SMD model, in advancing single-cell RNA sequencing data analysis. By integrating innovative computational techniques, the SMD model provides a powerful framework for unraveling cellular complexities, enhancing our understanding of biological processes, and elucidating disease mechanisms. The code is available from https://github.com/xiaoxuc/scSMD.
ISSN: 1471-2105
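
Note: The summary says the model is "informed by the negative binomial distribution" but gives no formula. As a rough illustration only, the sketch below shows one common way a negative binomial likelihood can serve as a reconstruction loss for count data. The mean/inverse-dispersion parameterization and the names `nb_log_likelihood`, `nb_reconstruction_loss`, `mu`, and `theta` are assumptions made for this example; they are not taken from the article or from the scSMD repository, which should be consulted for the authors' actual implementation.

```python
import math


def nb_log_likelihood(x, mu, theta, eps=1e-10):
    """Log-probability of a count x under a negative binomial distribution.

    Assumed parameterization (hypothetical, not from the article):
    mu is the mean, theta is the inverse dispersion, and the
    variance is mu + mu**2 / theta.
    """
    log_norm = math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1.0)
    log_denominator = math.log(theta + mu + eps)
    return (
        log_norm
        + theta * (math.log(theta + eps) - log_denominator)
        + x * (math.log(mu + eps) - log_denominator)
    )


def nb_reconstruction_loss(counts, means, theta):
    """Mean negative log-likelihood of observed counts given predicted means."""
    total = sum(nb_log_likelihood(x, m, theta) for x, m in zip(counts, means))
    return -total / len(counts)


# Toy, made-up expression counts for one cell, with two candidate reconstructions.
observed = [0, 1, 5, 0, 2]
close_means = [0.5, 1.0, 5.0, 0.5, 2.0]
far_means = [50.0] * 5
loss_close = nb_reconstruction_loss(observed, close_means, theta=2.0)
loss_far = nb_reconstruction_loss(observed, far_means, theta=2.0)
```

In an autoencoder setting, a decoder would typically output the means (and possibly the dispersion), and a loss of this kind would be minimized during training; here the reconstruction that tracks the observed counts receives the lower loss.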