A copy number variation detection method based on OCSVM algorithm using multi strategies integration
Abstract Copy number variation (CNV) is an important part of human genetic variation and is associated with various kinds of diseases. To tackle the limitations of traditional CNV detection methods, such as restricted detection types, high error rates, and challenges in precisely identifying the...
Main Authors: | Mengjiao Zhou, Jinxin Dong, Hua Jiang, Zuyao Zhao, Tianting Yuan |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
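Subjects: | Copy number variations; Next-generation sequencing technology; One class support vector machine algorithm; Read depth; Split read; Pair-end mapping |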
Online Access: | https://doi.org/10.1038/s41598-025-88143-9 |
_version_ | 1832571767658381312 |
---|---|
author | Mengjiao Zhou Jinxin Dong Hua Jiang Zuyao Zhao Tianting Yuan |
author_facet | Mengjiao Zhou Jinxin Dong Hua Jiang Zuyao Zhao Tianting Yuan |
author_sort | Mengjiao Zhou |
collection | DOAJ |
description | Abstract Copy number variation (CNV) is an important part of human genetic variation and is associated with various kinds of diseases. To tackle the limitations of traditional CNV detection methods, such as restricted detection types, high error rates, and challenges in precisely identifying the location of variant breakpoints, a new method called MSCNV (copy number variations detection method for multi-strategies integration based on a one-class support vector machine model) is proposed. MSCNV establishes a multi-signal channel that integrates three strategies: read depth, split read, and read pair. First, a one-class support vector machine algorithm is used to detect abnormal signals in read depth and mapping quality values to determine the rough CNV region. Then, the rough CNV region is filtered using paired-read signals to improve the precision of the MSCNV method. Finally, MSCNV identifies tandem duplication regions, interspersed duplication regions, and loss regions, and uses split-read signals to determine both the precise location of mutation points and the type of variation. Compared with Manta, FREEC, GROM-RD, Rsicnv, and CNVkit, MSCNV significantly improves the sensitivity, precision, F1-score, and overlap density score of CNV detection while reducing the boundary bias of the detection results. |
format | Article |
id | doaj-art-012716178e57419ca0c1a97c88ab753e |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-012716178e57419ca0c1a97c88ab753e 2025-02-02T12:17:42Z eng Nature Portfolio Scientific Reports 2045-2322 2025-01-01 151119 10.1038/s41598-025-88143-9 A copy number variation detection method based on OCSVM algorithm using multi strategies integration Mengjiao Zhou 0 Jinxin Dong 1 Hua Jiang 2 Zuyao Zhao 3 Tianting Yuan 4 School of Computer Science and Technology, Liaocheng University School of Computer Science and Technology, Liaocheng University School of Computer Science and Technology, Liaocheng University Orthopedics Department, Liaocheng People’s Hospital School of Computer Science and Technology, Liaocheng University Abstract Copy number variation (CNV) is an important part of human genetic variation and is associated with various kinds of diseases. To tackle the limitations of traditional CNV detection methods, such as restricted detection types, high error rates, and challenges in precisely identifying the location of variant breakpoints, a new method called MSCNV (copy number variations detection method for multi-strategies integration based on a one-class support vector machine model) is proposed. MSCNV establishes a multi-signal channel that integrates three strategies: read depth, split read, and read pair. First, a one-class support vector machine algorithm is used to detect abnormal signals in read depth and mapping quality values to determine the rough CNV region. Then, the rough CNV region is filtered using paired-read signals to improve the precision of the MSCNV method. Finally, MSCNV identifies tandem duplication regions, interspersed duplication regions, and loss regions, and uses split-read signals to determine both the precise location of mutation points and the type of variation. Compared with Manta, FREEC, GROM-RD, Rsicnv, and CNVkit, MSCNV significantly improves the sensitivity, precision, F1-score, and overlap density score of CNV detection while reducing the boundary bias of the detection results. https://doi.org/10.1038/s41598-025-88143-9 Copy number variations Next-generation sequencing technology One class support vector machine algorithm Read depth Split read Pair-end mapping |
spellingShingle | Mengjiao Zhou Jinxin Dong Hua Jiang Zuyao Zhao Tianting Yuan A copy number variation detection method based on OCSVM algorithm using multi strategies integration Scientific Reports Copy number variations Next-generation sequencing technology One class support vector machine algorithm Read depth Split read Pair-end mapping |
title | A copy number variation detection method based on OCSVM algorithm using multi strategies integration |
title_full | A copy number variation detection method based on OCSVM algorithm using multi strategies integration |
title_fullStr | A copy number variation detection method based on OCSVM algorithm using multi strategies integration |
title_full_unstemmed | A copy number variation detection method based on OCSVM algorithm using multi strategies integration |
title_short | A copy number variation detection method based on OCSVM algorithm using multi strategies integration |
title_sort | copy number variation detection method based on ocsvm algorithm using multi strategies integration |
topic | Copy number variations Next-generation sequencing technology One class support vector machine algorithm Read depth Split read Pair-end mapping |
url | https://doi.org/10.1038/s41598-025-88143-9 |
work_keys_str_mv | AT mengjiaozhou acopynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT jinxindong acopynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT huajiang acopynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT zuyaozhao acopynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT tiantingyuan acopynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT mengjiaozhou copynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT jinxindong copynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT huajiang copynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT zuyaozhao copynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration AT tiantingyuan copynumbervariationdetectionmethodbasedonocsvmalgorithmusingmultistrategiesintegration |
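
The abstract compares tools by sensitivity, precision, F1-score, and boundary bias, but this record does not say how the article computes them. The following Python snippet is a minimal sketch of one common way to score CNV calls against a known truth set. It is not the authors' evaluation code. The interval representation, the one-base overlap rule for matching, the boundary-bias definition (mean summed start and end offset of matched calls), and the names `overlaps` and `evaluate` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a CNV is a half-open (start, end) interval
# on a single chromosome.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def evaluate(truth: List[Interval], calls: List[Interval]) -> Dict[str, float]:
    """Score predicted CNV intervals against a ground-truth set.

    Assumption (not from the source): a truth region counts as detected,
    and a call counts as correct, when the two overlap by at least one base.
    """
    detected = 0
    biases: List[int] = []
    for t in truth:
        matches = [c for c in calls if overlaps(t, c)]
        if matches:
            detected += 1
            # Assumed boundary-bias definition: summed absolute offset of
            # start and end for the closest matching call.
            biases.append(min(abs(c[0] - t[0]) + abs(c[1] - t[1]) for c in matches))

    correct = sum(1 for c in calls if any(overlaps(c, t) for t in truth))

    sensitivity = detected / len(truth) if truth else 0.0
    precision = correct / len(calls) if calls else 0.0
    total = sensitivity + precision
    # F1 is the harmonic mean of sensitivity and precision.
    f1 = 2 * sensitivity * precision / total if total else 0.0
    boundary_bias = sum(biases) / len(biases) if biases else 0.0

    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "boundary_bias": boundary_bias,
    }


# Toy data, invented purely for illustration.
truth = [(100, 200), (500, 600), (900, 1000)]
calls = [(110, 190), (480, 610), (2000, 2100)]
metrics = evaluate(truth, calls)
```

A stricter matching rule, such as a minimum reciprocal overlap, would change which calls count as hits. The overlap threshold is the main design choice to check against the article before reusing this sketch.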