Gene selection based on adaptive neighborhood-preserving multi-objective particle swarm optimization

Bibliographic Details
Main Authors: Sumet Mehta, Fei Han, Muhammad Sohail, Bhekisipho Twala, Asad Ullah, Fasee Ullah, Arfat Ahmad Khan, Qinghua Ling
Format: Article
Language: English
Published: PeerJ Inc. 2025-05-01
Series: PeerJ Computer Science
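Subjects: Microarray gene selection; Multi-objective optimization; Particle swarm optimization; Neighborhood preservation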
Online Access: https://peerj.com/articles/cs-2872.pdf
collection DOAJ
description The analysis of high-dimensional microarray gene expression data presents critical challenges, including excessive dimensionality, increased computational burden, and sensitivity to random initialization. Traditional optimization algorithms often produce inconsistent and suboptimal results and fail to preserve local data structure, which limits both predictive accuracy and biological interpretability. To address these limitations, this study proposes an adaptive neighborhood-preserving multi-objective particle swarm optimization (ANPMOPSO) framework for gene selection. ANPMOPSO introduces four key innovations: (1) a weighted neighborhood-preserving ensemble embedding (WNPEE) technique for dimensionality reduction that retains local structure; (2) Sobol sequence (SS) initialization to enhance population diversity and convergence stability; (3) a differential evolution (DE)-based adaptive velocity update to dynamically balance exploration and exploitation; and (4) a novel ranking strategy that combines Pareto dominance with neighborhood preservation quality to prioritize biologically meaningful gene subsets. Experimental evaluations on six benchmark microarray datasets and eleven multi-modal test functions (MMFs) demonstrate that ANPMOPSO consistently outperforms state-of-the-art methods. For example, it achieves 100% classification accuracy on Leukemia and Small-Round-Blue-Cell Tumor (SRBCT) using only 3–5 genes, improving accuracy by 5–15% over competitors while reducing gene subsets by 40–60%. Additionally, on MMFs, ANPMOPSO attains superior hypervolume values (e.g., 1.0617 ± 0.2225 on MMF1, approximately 10–20% higher than competitors), confirming its robustness in balancing convergence and diversity. Although the method incurs higher training time due to its structural and adaptive components, it achieves a strong trade-off between computational cost and biological relevance, making it a promising tool for high-dimensional gene selection in bioinformatics.
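The description mentions Pareto dominance and hypervolume without defining them. The sketch below illustrates both for a two-objective gene selection setting in which classification error and subset size are minimized. It is not the authors' ANPMOPSO implementation: the function names, the candidate values, and the reference point are hypothetical, and the neighborhood-preservation term of the paper's ranking strategy is not modeled.

```python
from typing import List, Tuple

# Hypothetical objective vector: (classification error, number of selected genes).
# Both objectives are minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


def hypervolume_2d(front: List[Point], ref: Point) -> float:
    """Area dominated by a 2-D minimization front, bounded by the reference point."""
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in sorted(front):  # sweep in ascending order of the first objective
        if f1 < ref[0] and f2 < prev_f2:
            # Each point adds the rectangle it dominates that earlier points did not cover.
            area += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return area


# Illustrative candidate subsets (made-up values, not results from the paper).
candidates = [(0.10, 5), (0.00, 12), (0.05, 8), (0.10, 9), (0.20, 3)]
front = pareto_front(candidates)
print(front)
print(hypervolume_2d(front, ref=(1.0, 20.0)))
```

A larger hypervolume means the front lies closer to the ideal point and covers the trade-off more widely, which is why the description reads it as evidence of balance between convergence and diversity. The values reported in the description come from the MMF benchmarks and are not comparable to the toy numbers here.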
format Article
id doaj-art-d8cb8ed7ffca45d981d63fe2bf288b77
institution OA Journals
issn 2376-5992
language English
publishDate 2025-05-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
doi 10.7717/peerj-cs.2872
volume 11
article_number e2872
affiliations Sumet Mehta: School of Computer Science & Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China
Fei Han: School of Computer Science & Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China
Muhammad Sohail: Department of Computer Software Engineering, Military College of Signals, NUST, Islamabad, Pakistan
Bhekisipho Twala: Faculty of Information and Communication Technology, Tshwane University of Technology, Pretoria West, Pretoria, South Africa
Asad Ullah: Department of Computer Software Engineering, Military College of Signals, NUST, Islamabad, Pakistan
Fasee Ullah: The Department of Computing, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan, Malaysia
Arfat Ahmad Khan: Department of Computer Science, College of Computing, Khon Kaen University, Khon Kaen, Thailand
Qinghua Ling: School of Computer Science and Engineering, Jiangsu University of Science & Technology, Zhenjiang, Jiangsu, China
title Gene selection based on adaptive neighborhood-preserving multi-objective particle swarm optimization
topic Microarray gene selection
Multi-objective optimization
Particle swarm optimization
Neighborhood preservation
url https://peerj.com/articles/cs-2872.pdf