Adaptive Fuzzy C-Means with logit boost distributed clustering for cancer detection with protein sequences

Abstract The Adaptive Fuzzy C-Means with Logit Boost Distributed Clustering (AFC-LBDC) technique is introduced to speed up cancer detection. Conventional techniques often struggle to detect cancer effectively because of their high complexity. In contrast, the AFC-LBDC tech...

Bibliographic Details
Main Authors: K. Thenmozhi, M. Pyingkodi, V. S. Prakash, Kripa Josten, S. Manju Priya, J. Vennila
Format: Article
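Language: English
Published: Springer 2025-07-01
Series: Discover Applied Sciences
Subjects: Protein sequences; Cancer detection; Adaptive Fuzzy C-Means clustering; Jaccard similarity; Logit boost technique; Global cluster
Online Access: https://doi.org/10.1007/s42452-025-07485-1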
author K. Thenmozhi
M. Pyingkodi
V. S. Prakash
Kripa Josten
S. Manju Priya
J. Vennila
collection DOAJ
description Abstract The Adaptive Fuzzy C-Means with Logit Boost Distributed Clustering (AFC-LBDC) technique is introduced to speed up cancer detection. Conventional techniques often struggle to detect cancer effectively because of their high complexity. In contrast, the AFC-LBDC technique groups similar protein sequences to improve detection accuracy. First, a large protein dataset is divided into C local clusters using an adaptive Fuzzy C-Means distributed clustering approach. For any protein sequence that is not assigned to a cluster, a Bayesian probability is computed to identify the cluster it most likely belongs to. The Logit Boost technique is then applied to further improve clustering performance by combining the local clusters into a global cluster. The proposed AFC-LBDC method achieves accuracies of 96%, 88%, and 86% on the P53, BRCA2, and HRAS cancer datasets, respectively. Compared with the RaNC and IDMPhyChm-Ens methods, respectively, AFC-LBDC reduces cancer detection time by 20% and 31% for P53, 19% and 31% for BRCA2, and 22% and 32% for HRAS. Likewise, it lowers the false positive rate by 27% and 39% for P53, 28% and 36% for BRCA2, and 23% and 31% for HRAS. In addition, AFC-LBDC reduces space complexity by up to 44%, with reductions of 27% and 39% for P53, 24% and 42% for BRCA2, and 22% and 44% for HRAS. These results collectively indicate the superior performance and efficiency of AFC-LBDC in cancer gene detection. The global clustering result improves cancer detection accuracy and minimises the false positive rate.
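The abstract does not state how sequences are represented, which distance is used, or how the adaptive step works. The sketch below is therefore not the authors' method; it only illustrates the standard Fuzzy C-Means membership rule, assuming Jaccard distance over k-mer sets (Jaccard similarity appears among the record's keywords). The function names, the k-mer length `k`, the fuzzifier `m`, and the toy sequences are all assumptions made for illustration.

```python
from typing import List, Set


def kmers(seq: str, k: int = 2) -> Set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |intersection| / |union|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def fuzzy_memberships(seq: str, prototypes: List[str],
                      m: float = 2.0, k: int = 2) -> List[float]:
    """Standard FCM membership of one sequence in each cluster prototype.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance
    from the sequence to prototype i. The prototypes are assumed to be given;
    a full FCM run would alternate this step with a prototype update.
    """
    s = kmers(seq, k)
    dists = [jaccard_distance(s, kmers(p, k)) for p in prototypes]
    # A sequence at distance 0 from a prototype belongs fully to it
    # (shared equally if several prototypes are at distance 0).
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(dists))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Toy strings used only for illustration, not data from the paper.
toy_prototypes = ["MEEPQSDPSV", "MPIGSKERPT"]
memberships = fuzzy_memberships("MEEPQSKERPT", toy_prototypes)
```

The memberships for one sequence always sum to 1, so a sequence that sits between two prototypes receives a graded share of each cluster instead of a hard assignment.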
format Article
id doaj-art-e899bec86dee453eafeee5d84bacfd77
institution Kabale University
issn 3004-9261
language English
publishDate 2025-07-01
publisher Springer
record_format Article
series Discover Applied Sciences
spelling doaj-art-e899bec86dee453eafeee5d84bacfd77
2025-08-20T03:46:19Z
eng
Springer
Discover Applied Sciences
3004-9261
2025-07-01
7
8
1
25
10.1007/s42452-025-07485-1
Adaptive Fuzzy C-Means with logit boost distributed clustering for cancer detection with protein sequences
K. Thenmozhi (School of Computer Science and Engineering, RV University)
M. Pyingkodi (Department of MCA, Kongu Engineering College)
V. S. Prakash (Department of Computer Science, Kristu Jayanti College (Autonomous))
Kripa Josten (Manipal College of Health Professions, Manipal Academy of Higher Education)
S. Manju Priya (School of Computer Science and Applications, Reva University)
J. Vennila (Statistics, Manipal College of Health Professions, Manipal Academy of Higher Education)
https://doi.org/10.1007/s42452-025-07485-1
Protein sequences
Cancer detection
Adaptive Fuzzy C-Means clustering
Jaccard similarity
Logit boost technique
Global cluster
title Adaptive Fuzzy C-Means with logit boost distributed clustering for cancer detection with protein sequences
topic Protein sequences
Cancer detection
Adaptive Fuzzy C-Means clustering
Jaccard similarity
Logit boost technique
Global cluster
url https://doi.org/10.1007/s42452-025-07485-1