Improving doublet cell removal efficiency through multiple algorithm runs
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-01-01 |
Series: | Computational and Structural Biotechnology Journal |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S200103702500011X |
Summary: | Doublets are a key confounding factor in the analysis of scRNA-seq data, as they can interfere with differential expression analysis and disrupt developmental trajectories. However, because doublet detection algorithms involve randomness, most doublet removal methods still leave a proportion of doublets after a single application. In this study, we propose a multi-round doublet removal (MRDR) strategy that runs the algorithm repeatedly, in cycles, to reduce randomness and improve the effectiveness of doublet removal. We evaluated the MRDR strategy on 14 real-world datasets, 29 barcoded scRNA-seq datasets, and 106 synthetic datasets using four popular doublet detection tools: DoubletFinder, cxds, bcds, and hybrid. In real-world datasets, DoubletFinder performed better under the MRDR strategy than with a single round of removal, with recall improving by 50% for two rounds compared to one; for the other three algorithms, the ROC improved by about 0.04. In barcoded scRNA-seq datasets, two rounds of doublet removal with cxds yielded the best results. In simulated datasets, multi-round removal was more effective than a single removal, with cxds again showing the best results when applied twice, and after two rounds the ROC of all four methods improved by at least 0.05 compared to single removal. Finally, compared to running the algorithm once, the MRDR strategy was more beneficial for differential gene expression analysis and cell trajectory inference when using default analysis parameters.
Overall, we show that the MRDR strategy removes doublets more effectively and benefits downstream analyses. The strategy could be incorporated into the standard analysis pipeline for scRNA-seq experiments, and we recommend using cxds to remove doublets through two rounds of algorithm iteration. |
---|---|
ISSN: | 2001-0370 |
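
The multi-round idea described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how many cells are removed per round or how thresholds are chosen, so the per-round removal fraction (`expected_rate`), the function name `multi_round_removal`, and the `score_fn` callback (a stand-in for a detector such as cxds or DoubletFinder) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def multi_round_removal(
    cells: Sequence[int],
    score_fn: Callable[[List[int]], List[float]],
    rounds: int = 2,
    expected_rate: float = 0.1,
) -> Tuple[List[int], List[int]]:
    """Re-score the surviving cells each round and drop the top-scoring ones.

    `score_fn` is a hypothetical placeholder for a doublet detector; it must
    return one score per cell, with higher meaning "more doublet-like".
    `expected_rate` (fraction removed per round) is an assumed parameter.
    """
    remaining = list(cells)
    removed: List[int] = []
    for _ in range(rounds):
        if not remaining:
            break
        # Scores are recomputed on the reduced set, which is the point of
        # running the detector more than once.
        scores = score_fn(remaining)
        n_remove = int(round(expected_rate * len(remaining)))
        if n_remove == 0:
            break
        # Indices of the highest-scoring cells in this round.
        order = sorted(range(len(remaining)), key=lambda i: scores[i], reverse=True)
        flagged = set(order[:n_remove])
        removed.extend(remaining[i] for i in order[:n_remove])
        remaining = [c for i, c in enumerate(remaining) if i not in flagged]
    return remaining, removed
```

With `rounds=1` the function reduces to a single removal pass, so the same helper could be used to compare one round against two on a labelled benchmark.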