A new dimensionality reduction technique based on the Wavelet Transform for cancer classification

Bibliographic Details
Main Authors: Lisardo Fernández, Mariano Pérez, Juan M. Orduña, José M. Alcaraz
Format: Article
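Language: English
Published: SpringerOpen 2025-01-01
Series: Journal of Big Data
Subjects: Dimensionality reduction; Cancer classification; DNA methylation analysis; Wavelet Transform; Machine learning classification
Online Access: https://doi.org/10.1186/s40537-024-01039-9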
author Lisardo Fernández
Mariano Pérez
Juan M. Orduña
José M. Alcaraz
collection DOAJ
description Abstract. Problem: DNA methylation and hydroxymethylation have become important epigenetic markers for the early detection of cancer. In recent years, both the number of studies on this topic and the number and size of labeled cancer databases have grown significantly. Although methylation microarrays such as the HumanMethylation450 platform have greatly reduced the dimensionality of the problem, from billions of positions to 450K, the data are still too large to be processed by machine learning algorithms for cancer prediction and classification. Aim: In the particular case of methylation, an efficient dimensionality reduction technique should also preserve the spatial information of the original data so that cancer can be properly predicted and classified. Method: This work proposes a new dimensionality reduction technique based on the Discrete Wavelet Transform (DWT), which preserves spatial information. The technique is evaluated on a dataset collected from the most important cancer databases according to their social impact, and it is compared with five well-known dimensionality reduction techniques: PCA, ReliefF, Isomap, LLE and UMAP. Results: The performance evaluation shows that the proposed technique significantly reduces both the computational resources and the execution time required for dimensionality reduction. It also significantly improves the accuracy of a support vector machine classifier that takes as input the dataset produced by each technique. Conclusions: The proposed DWT-based approach can be considered an efficient alternative in cases where dimensionality reduction must preserve spatial information.
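The idea of a wavelet-based reduction that keeps positional order can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state which wavelet, decomposition level, or padding scheme the paper uses, so the Haar wavelet, the `levels` parameter, the function name `haar_approximation`, and the sample values are all assumptions chosen for illustration. The sketch keeps only the approximation coefficients, so each output value summarizes a contiguous block of neighbouring input positions.

```python
import math


def haar_approximation(signal, levels):
    """Return Haar DWT approximation coefficients after `levels` decompositions.

    Each level replaces adjacent pairs (a, b) with (a + b) / sqrt(2), halving
    the length while keeping the coefficients in their original positional
    order. (Assumed scheme; the paper's actual wavelet is not given here.)
    """
    coeffs = list(signal)
    for _ in range(levels):
        if len(coeffs) < 2:
            break  # nothing left to pair up
        if len(coeffs) % 2 == 1:
            coeffs.append(coeffs[-1])  # pad odd lengths by repeating the last value
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs


# Hypothetical methylation levels at eight consecutive positions:
# a low-methylation region followed by a high-methylation region.
beta_values = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]

# Two levels reduce 8 features to 2; the first output still describes the
# left half of the region and the second the right half.
reduced = haar_approximation(beta_values, levels=2)
```

In a pipeline like the one the abstract describes, a reduced vector of this kind would be computed per sample and passed to the classifier in place of the full set of positions.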
format Article
id doaj-art-92ffa877cd6d4ef88f6d621818968819
institution Kabale University
issn 2196-1115
language English
publishDate 2025-01-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj-art-92ffa877cd6d4ef88f6d621818968819
2025-01-26T12:37:42Z
eng
SpringerOpen
Journal of Big Data
2196-1115
2025-01-01
121123
10.1186/s40537-024-01039-9
A new dimensionality reduction technique based on the Wavelet Transform for cancer classification
Lisardo Fernández (Departamento de Informática, Universidad de Valencia)
Mariano Pérez (Departamento de Informática, Universidad de Valencia)
Juan M. Orduña (Departamento de Informática, Universidad de Valencia)
José M. Alcaraz (School of Computing, Engineering and Physical Sciences, University of the West of Scotland)
https://doi.org/10.1186/s40537-024-01039-9
Dimensionality reduction
Cancer classification
DNA methylation analysis
Wavelet Transform
Machine learning classification
topic Dimensionality reduction
Cancer classification
DNA methylation analysis
Wavelet Transform
Machine learning classification
url https://doi.org/10.1186/s40537-024-01039-9