Multi-CNN Deep Feature Fusion and Stacking Ensemble Classifier for Breast Ultrasound Lesion Classification

Bibliographic Details
Main Authors: Kemal PANÇ, Sümeyye SEKMEN
Format: Article
Language: English
Published: Galenos Yayinevi 2025-08-01
Series: Forbes Tıp Dergisi
Online Access: https://forbestip.org/articles/multi-cnn-deep-feature-fusion-and-stacking-ensemble-classifier-for-breast-ultrasound-lesion-classification/doi/forbes.galenos.2025.02360
Description
Summary: Objective: To develop and validate a robust machine learning model that classifies breast ultrasound images as benign, malignant, or normal, with the aim of improving diagnostic accuracy through deep feature extraction and ensemble learning. Methods: A dataset of 2233 images drawn from five public datasets was used. After the regions of interest were masked, deep features were extracted with pre-trained VGG16, ResNet50V2, and EfficientNetB3 models and concatenated. A multi-step feature selection process was then applied, consisting of principal component analysis, recursive feature elimination with LightGBM, and partial least squares discriminant analysis. A stacking ensemble classifier that combines LightGBM, XGBoost, CatBoost, and random forest base learners with a logistic regression meta-learner was trained using 5-fold cross-validation on a 75% training set (balanced with the synthetic minority oversampling technique, SMOTE) and evaluated on the remaining 25% test set. Results: On the test set, the model achieved a macro-averaged area under the receiver operating characteristic curve (AUC-ROC) of 0.956 and an F1 score of 0.88. Per-class results were AUC 0.984 and F1 0.93 for the benign class, AUC 0.969 and F1 0.92 for the normal class, and AUC 0.916 and F1 0.79 for the malignant class. Feature importance analysis showed that features from ResNet50V2 contributed most to the model’s performance. Conclusion: The proposed approach, which combines deep feature fusion across multiple convolutional neural networks, optimized feature selection, and ensemble stacking, shows significant potential for automated breast ultrasound classification, especially for benign and normal cases. Although the model is promising for clinical decision support, its lower sensitivity for malignant lesions calls for further refinement.
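As a quick arithmetic check on the reported results, the short sketch below averages the three per-class values quoted in the summary. It assumes the overall figures are unweighted (macro) means across classes, which the summary states for AUC-ROC but not explicitly for F1. The dictionary layout and the function name `macro_average` are illustrative only, and the inputs are the rounded values from the summary, not the authors' raw outputs.

```python
# Per-class test-set metrics exactly as quoted in the summary (rounded values).
per_class = {
    "benign": {"auc": 0.984, "f1": 0.93},
    "normal": {"auc": 0.969, "f1": 0.92},
    "malignant": {"auc": 0.916, "f1": 0.79},
}


def macro_average(metrics, key):
    """Return the unweighted mean of one metric across all classes."""
    values = [class_metrics[key] for class_metrics in metrics.values()]
    return sum(values) / len(values)


# Assumption: the overall scores are simple means over the three classes.
macro_auc = macro_average(per_class, "auc")
macro_f1 = macro_average(per_class, "f1")

print(f"macro AUC-ROC: {macro_auc:.3f}")
print(f"macro F1:      {macro_f1:.2f}")
```

Under that assumption, the averages round to 0.956 and 0.88, which agrees with the overall figures in the summary. The malignant class pulls both averages down, in line with the stated need for further refinement.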
ISSN: 2757-5241