Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning
The increasing incidence and mortality of breast cancer pose significant global challenges for women. Deep learning (DL) has shown superior diagnostic performance in breast cancer classification compared to human experts. However, most DL methods have relied on unimodal features, which may limit the performance of diagnostic models...
Saved in:
Main Authors: | Sadam Hussain, Mansoor Ali Teevno, Usman Naseem, Daly Betzabeth Avendano Avalos, Servando Cardona-Huerta, Jose Gerardo Tamez-Pena |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | Breast cancer; computer-aided diagnosis; feature fusion; multimodal classification; mammogram; deep learning |
Online Access: | https://ieeexplore.ieee.org/document/10818413/ |
_version_ | 1832592857318293504 |
---|---|
author | Sadam Hussain, Mansoor Ali Teevno, Usman Naseem, Daly Betzabeth Avendano Avalos, Servando Cardona-Huerta, Jose Gerardo Tamez-Pena |
author_facet | Sadam Hussain, Mansoor Ali Teevno, Usman Naseem, Daly Betzabeth Avendano Avalos, Servando Cardona-Huerta, Jose Gerardo Tamez-Pena |
author_sort | Sadam Hussain |
collection | DOAJ |
description | The increasing incidence and mortality of breast cancer pose significant global challenges for women. Deep learning (DL) has shown superior diagnostic performance in breast cancer classification compared to human experts. However, most DL methods have relied on unimodal features, which may limit the performance of diagnostic models. Recent studies focus on multimodal data along with multiple views of mammograms, typically two: Cranio-Caudal (CC) and Medio-Lateral-Oblique (MLO). Combining multimodal data has shown improvements in classification effectiveness over single-modal systems. In this study, we compiled a multimodal dataset comprising imaging and textual data (a combination of clinical and radiological features). We propose a DL-based multiview multimodal feature fusion (MMFF) strategy for breast cancer classification that utilizes images (four views of mammograms) and tabular data (extracted from radiological reports) from our newly developed in-house dataset. Various augmentation techniques are applied to both imaging and textual data to expand the training dataset size. Imaging features were extracted using a Squeeze-and-Excitation (SE) network-based ResNet50 model, while textual features were extracted using an artificial neural network (ANN). Afterwards, the extracted features from both modalities were fused using a late feature fusion strategy. Finally, the fused features were fed into an ANN for the final classification of breast cancer. In our study, we compared the performance of our proposed MMFF model with single-modal models (image only) and models built on textual data. Performance was evaluated using accuracy, precision, sensitivity, F1 score, and area under the receiver operating characteristic curve (AUC). Our MMFF model achieved an AUC of 0.965 for benign vs. malignant classification, compared to image-only (ResNet50 = 0.545), text-only (ANN = 0.688, SVM = 0.842), and other multimodal approaches (ResNet50+ANN = 0.748, EfficientNetB7+ANN = 0.874). |
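The pipeline the abstract describes (an SE-augmented ResNet50 image branch over four mammogram views, a small ANN over tabular report features, late concatenation of the two feature vectors, and an ANN classification head) can be illustrated with a minimal PyTorch sketch. This is a reading of the abstract only, not the authors' published implementation: layer widths, the shared backbone across views, and the `n_tabular` input size are all illustrative assumptions.

```python
# Minimal sketch of the late-fusion idea described in the abstract.
# All hyperparameters below are assumptions, not the paper's configuration.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel reweighting (Hu et al., 2018)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial average
        self.fc = nn.Sequential(             # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates  # rescale each channel of the feature map


class ImageBranch(nn.Module):
    """ResNet50 trunk with an SE block on its final feature map."""

    def __init__(self):
        super().__init__()
        m = resnet50(weights=None)  # pretrained weights are another option
        self.trunk = nn.Sequential(m.conv1, m.bn1, m.relu, m.maxpool,
                                   m.layer1, m.layer2, m.layer3, m.layer4)
        self.se = SEBlock(2048)     # layer4 of ResNet50 outputs 2048 channels
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pool(self.se(self.trunk(x))).flatten(1)  # (B, 2048)


class MMFFSketch(nn.Module):
    """Late fusion: concatenate image and text features, classify with an ANN."""

    def __init__(self, n_tabular: int, n_views: int = 4, n_classes: int = 2):
        super().__init__()
        self.image_branch = ImageBranch()  # one backbone shared by all views
        self.text_branch = nn.Sequential(  # small ANN over tabular features
            nn.Linear(n_tabular, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.head = nn.Sequential(         # ANN over the fused feature vector
            nn.Linear(2048 * n_views + 64, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, views: list[torch.Tensor], tabular: torch.Tensor):
        img = torch.cat([self.image_branch(v) for v in views], dim=1)
        txt = self.text_branch(tabular)
        return self.head(torch.cat([img, txt], dim=1))  # late feature fusion


# Smoke test with random stand-ins for four mammogram views + report features.
views = [torch.randn(2, 3, 224, 224) for _ in range(4)]
tabular = torch.randn(2, 20)
logits = MMFFSketch(n_tabular=20)(views, tabular)  # shape: (2, 2)
```

Whether the four views share one backbone or use separate ones, and how the per-view features are combined before fusion, are design choices the abstract leaves open; a shared backbone with simple concatenation is shown here as the most direct reading.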
format | Article |
id | doaj-art-4e0f0e7d2df74e32af64a429b9b9a6fd |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-4e0f0e7d2df74e32af64a429b9b9a6fd 2025-01-21T00:02:21Z eng IEEE IEEE Access 2169-3536 2025-01-01 vol. 13, pp. 9265-9275 doi:10.1109/ACCESS.2024.3524203 10818413 Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning. Sadam Hussain (https://orcid.org/0000-0002-3453-0785), School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, Mexico; Mansoor Ali Teevno, School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, Mexico; Usman Naseem (https://orcid.org/0000-0003-0191-7171), School of Computing, Macquarie University, Sydney, NSW, Australia; Daly Betzabeth Avendano Avalos, School of Medicine and Health Sciences, Tecnológico de Monterrey, Monterrey, Mexico; Servando Cardona-Huerta, School of Medicine and Health Sciences, Tecnológico de Monterrey, Monterrey, Mexico; Jose Gerardo Tamez-Pena (https://orcid.org/0000-0003-1361-5162), School of Medicine and Health Sciences, Tecnológico de Monterrey, Monterrey, Mexico. https://ieeexplore.ieee.org/document/10818413/ Breast cancer; computer-aided diagnosis; feature fusion; multimodal classification; mammogram; deep learning |
spellingShingle | Sadam Hussain; Mansoor Ali Teevno; Usman Naseem; Daly Betzabeth Avendano Avalos; Servando Cardona-Huerta; Jose Gerardo Tamez-Pena. Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning. IEEE Access. Breast cancer; computer-aided diagnosis; feature fusion; multimodal classification; mammogram; deep learning |
title | Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning |
title_full | Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning |
title_fullStr | Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning |
title_full_unstemmed | Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning |
title_short | Multiview Multimodal Feature Fusion for Breast Cancer Classification Using Deep Learning |
title_sort | multiview multimodal feature fusion for breast cancer classification using deep learning |
topic | Breast cancer; computer-aided diagnosis; feature fusion; multimodal classification; mammogram; deep learning |
url | https://ieeexplore.ieee.org/document/10818413/ |
work_keys_str_mv | AT sadamhussain multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning AT mansooraliteevno multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning AT usmannaseem multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning AT dalybetzabethavendanoavalos multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning AT servandocardonahuerta multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning AT josegerardotamezpena multiviewmultimodalfeaturefusionforbreastcancerclassificationusingdeeplearning |