ViSwNeXtNet Deep Patch-Wise Ensemble of Vision Transformers and ConvNeXt for Robust Binary Histopathology Classification

<b>Background:</b> Intestinal metaplasia (IM) is a precancerous gastric condition that requires accurate histopathological diagnosis to enable early intervention and cancer prevention. Traditional evaluation of H&E-stained tissue slides can be labor-intensive and prone to interobserv...

Bibliographic Details
Main Authors: Özgen Arslan Solmaz, Burak Tasci
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Diagnostics
Subjects: intestinal metaplasia; histopathology; transformer networks; ViT; ConvNeXt; Swin transformer
Online Access: https://www.mdpi.com/2075-4418/15/12/1507
author Özgen Arslan Solmaz
Burak Tasci
collection DOAJ
description <b>Background:</b> Intestinal metaplasia (IM) is a precancerous gastric condition that requires accurate histopathological diagnosis to enable early intervention and cancer prevention. Traditional evaluation of H&E-stained tissue slides can be labor-intensive and prone to interobserver variability. Recent advances in deep learning, particularly transformer-based models, offer promising tools for improving diagnostic accuracy. <b>Methods:</b> We propose ViSwNeXtNet, a novel patch-wise ensemble framework that integrates three architectures for deep feature extraction: the convolutional ConvNeXt-Tiny and the transformer-based Swin-Tiny and ViT-Base. Features from each model (12,288 per model) were concatenated into a 36,864-dimensional vector and refined using iterative neighborhood component analysis (INCA) to select the 565 most discriminative features. A quadratic SVM classifier was trained on the selected features. The model was evaluated on two datasets: (1) a custom-collected dataset consisting of 516 intestinal metaplasia cases and 521 control cases, and (2) the public GasHisSDB dataset, which includes 20,160 normal and 13,124 abnormal H&E-stained image patches of size 160 × 160 pixels. <b>Results:</b> On the collected dataset, the proposed method achieved 94.41% accuracy, 94.63% sensitivity, and 94.40% F1 score. On the GasHisSDB dataset, it reached 99.20% accuracy, 99.39% sensitivity, and 99.16% F1 score, outperforming the individual backbone models and demonstrating strong generalizability across datasets. <b>Conclusions:</b> ViSwNeXtNet combines local, regional, and global representations of tissue structure through an ensemble of convolutional and transformer-based models. The addition of INCA-based feature selection enhances classification performance while reducing dimensionality. These findings suggest the method’s potential for integration into clinical pathology workflows. Future work will focus on multiclass classification, multicenter validation, and integration of explainable AI techniques.
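The feature-fusion and iterative selection steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record gives no code, so the class-mean ranking score below is a stand-in for NCA feature weights, the leave-one-out 1-nearest-neighbour error is a stand-in for the loss of the quadratic SVM, and the data, function names, and sizes (three blocks of 4 features instead of 12,288) are all hypothetical. Only the overall shape follows the description: concatenate per-model features, rank them, then try top-k subsets over a range and keep the one with the lowest error.

```python
import random


def concat_features(*blocks):
    """Concatenate per-model feature vectors of one sample into a single vector."""
    out = []
    for block in blocks:
        out.extend(block)
    return out


def rank_features(X, y):
    """Rank feature indices by |mean(class 1) - mean(class 0)|.

    Assumption: a simple stand-in for NCA-derived feature weights.
    """
    n_feat = len(X[0])
    scores = []
    for j in range(n_feat):
        pos = [x[j] for x, t in zip(X, y) if t == 1]
        neg = [x[j] for x, t in zip(X, y) if t == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)


def loo_error(X, y, cols):
    """Leave-one-out error of a 1-nearest-neighbour classifier on the given columns.

    Assumption: stand-in for the misclassification loss of the final classifier.
    """
    wrong = 0
    for i, xi in enumerate(X):
        best_d, best_t = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[c] - xk[c]) ** 2 for c in cols)
            if best_d is None or d < best_d:
                best_d, best_t = d, y[k]
        wrong += best_t != y[i]
    return wrong / len(X)


def iterative_select(X, y, k_min, k_max):
    """Evaluate the top-k ranked features for each k and keep the lowest-error subset."""
    ranked = rank_features(X, y)
    errors = {k: loo_error(X, y, ranked[:k]) for k in range(k_min, k_max + 1)}
    best_k = min(errors, key=lambda k: (errors[k], k))  # ties go to the smaller subset
    return ranked[:best_k], errors


def fake_block(label, n=4):
    """Hypothetical per-model features: the first carries class signal, the rest are noise."""
    return [random.gauss(2.0 * label, 1.0)] + [random.gauss(0.0, 1.0) for _ in range(n - 1)]


random.seed(0)
y = [i % 2 for i in range(40)]
# Three hypothetical "backbones" each contribute one block per sample.
X = [concat_features(fake_block(t), fake_block(t), fake_block(t)) for t in y]

selected, errors = iterative_select(X, y, k_min=1, k_max=12)
print("fused dimension:", len(X[0]))
print("selected features:", len(selected), "error:", errors[len(selected)])
```

In a setting like the one described, the fused vector would have 36,864 entries and the search range would be far wider. The sketch only shows why the selected subset can be much smaller than the fused vector.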
format Article
id doaj-art-1fc6561baef343b59d01b876ecbaf5d6
institution Kabale University
issn 2075-4418
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Diagnostics
doi 10.3390/diagnostics15121507
citation Diagnostics, vol. 15, no. 12, article 1507, 2025-06-01
author_affiliation Özgen Arslan Solmaz: Clinic of Medical Pathology, Elazig Fethi Sekin City Hospital, Elazig 23280, Turkey
Burak Tasci: Vocational School of Technical Sciences, Firat University, Elazig 23119, Turkey
title ViSwNeXtNet Deep Patch-Wise Ensemble of Vision Transformers and ConvNeXt for Robust Binary Histopathology Classification
topic intestinal metaplasia
histopathology
transformer networks
ViT
ConvNeXt
Swin transformer
url https://www.mdpi.com/2075-4418/15/12/1507