A copy number variation detection method based on OCSVM algorithm using multi strategies integration

Abstract Copy number variation (CNV) is an important class of human genetic variation and is associated with many diseases. To address the limitations of traditional CNV detection methods, such as restricted detection types, high error rates, and difficulty in precisely identifying the...

Bibliographic Details
Main Authors: Mengjiao Zhou, Jinxin Dong, Hua Jiang, Zuyao Zhao, Tianting Yuan
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-88143-9
author Mengjiao Zhou
Jinxin Dong
Hua Jiang
Zuyao Zhao
Tianting Yuan
collection DOAJ
description Abstract Copy number variation (CNV) is an important class of human genetic variation and is associated with many diseases. To address the limitations of traditional CNV detection methods, such as restricted detection types, high error rates, and difficulty in precisely identifying the location of variant breakpoints, a new method called MSCNV (copy number variation detection method for multi-strategy integration based on a one-class support vector machine model) is proposed. MSCNV establishes a multi-signal channel that integrates three strategies: read depth, split read, and read pair. First, a one-class support vector machine algorithm is used to detect abnormal signals in read depth and mapping quality values and so determine the rough CNV regions. Then, the rough CNV regions are filtered using read pair signals to improve the precision of the MSCNV method. Finally, MSCNV identifies tandem duplication regions, interspersed duplication regions, and loss regions, using split read signals to determine both the precise location of the mutation points and the type of variation. Compared with Manta, FREEC, GROM-RD, Rsicnv, and CNVkit, MSCNV significantly improves the sensitivity, precision, F1-score, and overlap density score of CNV detection while reducing the boundary bias of the detection results.
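The abstract compares tools by sensitivity, precision, and F1-score. As a rough illustration only, the sketch below computes these three scores from base-pair overlap between a truth set and a call set of CNV intervals. The record does not give the paper's exact definitions, so the overlap-based formulas, the half-open interval convention, and all function names here are assumptions; the overlap density score and boundary bias are not covered.

```python
from typing import List, Tuple

# Assumption: a CNV region is a half-open interval [start, end) in base pairs.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> int:
    """Total number of base pairs covered by already-merged intervals."""
    return sum(end - start for start, end in intervals)


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Base pairs covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def score(truth: List[Interval], called: List[Interval]) -> Tuple[float, float, float]:
    """Return (sensitivity, precision, F1) based on overlapping base pairs."""
    tp = overlap_length(truth, called)
    truth_bp = total_length(merge(truth))
    called_bp = total_length(merge(called))
    sensitivity = tp / truth_bp if truth_bp else 0.0
    precision = tp / called_bp if called_bp else 0.0
    denom = sensitivity + precision
    f1 = 2 * sensitivity * precision / denom if denom else 0.0
    return sensitivity, precision, f1


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    truth = [(100, 200), (300, 400)]
    called = [(150, 250), (300, 350)]
    print(score(truth, called))
```

Scoring by base pairs rather than by counting whole events rewards calls whose boundaries sit close to the true breakpoints, which is the property the abstract emphasizes; an event-level definition with a minimum reciprocal overlap would be an equally plausible reading.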
format Article
id doaj-art-012716178e57419ca0c1a97c88ab753e
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-012716178e57419ca0c1a97c88ab753e; 2025-02-02T12:17:42Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2025-01-01; vol. 15, no. 1, pp. 1-19; https://doi.org/10.1038/s41598-025-88143-9
Mengjiao Zhou (School of Computer Science and Technology, Liaocheng University)
Jinxin Dong (School of Computer Science and Technology, Liaocheng University)
Hua Jiang (School of Computer Science and Technology, Liaocheng University)
Zuyao Zhao (Orthopedics Department, Liaocheng People’s Hospital)
Tianting Yuan (School of Computer Science and Technology, Liaocheng University)
title A copy number variation detection method based on OCSVM algorithm using multi strategies integration
topic Copy number variations
Next-generation sequencing technology
One class support vector machine algorithm
Read depth
Split read
Pair-end mapping
url https://doi.org/10.1038/s41598-025-88143-9