The Data Heterogeneity Issue Regarding COVID-19 Lung Imaging in Federated Learning: An Experimental Study
Federated learning (FL) has emerged as a transformative framework for collaborative learning, offering robust model training across institutions while ensuring data privacy. In the context of making a COVID-19 diagnosis using lung imaging, FL enables institutions to collaboratively train a global mo...
Main Authors: | Fatimah Alhafiz, Abdullah Basuhail |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Big Data and Cognitive Computing |
Subjects: | federated learning; data heterogeneity; non-IID; global model; skew types; generalization metric |
Online Access: | https://www.mdpi.com/2504-2289/9/1/11 |
_version_ | 1832589157849890816 |
---|---|
author | Fatimah Alhafiz; Abdullah Basuhail |
collection | DOAJ |
description | Federated learning (FL) has emerged as a transformative framework for collaborative learning, offering robust model training across institutions while ensuring data privacy. In the context of making a COVID-19 diagnosis using lung imaging, FL enables institutions to collaboratively train a global model without sharing sensitive patient data. A central manager aggregates local model updates to compute global updates, ensuring secure and effective integration. The global model’s generalization capability is evaluated using centralized testing data before dissemination to participating nodes, where local assessments facilitate personalized adaptations tailored to diverse datasets. Addressing data heterogeneity, a critical challenge in medical imaging, is essential for improving both global performance and local personalization in FL systems. This study emphasizes the importance of recognizing real-world data variability before proposing solutions to tackle non-independent and non-identically distributed (non-IID) data. We investigate the impact of data heterogeneity on FL performance in COVID-19 lung imaging across seven distinct heterogeneity settings. By comprehensively evaluating models using generalization and personalization metrics, we highlight challenges and opportunities for optimizing FL frameworks. The findings provide valuable insights that can guide future research toward achieving a balance between global generalization and local adaptation, ultimately enhancing diagnostic accuracy and patient outcomes in COVID-19 lung imaging. |
id | doaj-art-fb5cca5e39414d6c853ec58d2a01ff8d |
institution | Kabale University |
issn | 2504-2289 |
spelling | doaj-art-fb5cca5e39414d6c853ec58d2a01ff8d; 2025-01-24T13:22:32Z; eng; MDPI AG; Big Data and Cognitive Computing; ISSN 2504-2289; 2025-01-01; volume 9, issue 1, article 11; DOI 10.3390/bdcc9010011; Fatimah Alhafiz (Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia); Abdullah Basuhail (Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia) |
title | The Data Heterogeneity Issue Regarding COVID-19 Lung Imaging in Federated Learning: An Experimental Study |
topic | federated learning; data heterogeneity; non-IID; global model; skew types; generalization metric |
url | https://www.mdpi.com/2504-2289/9/1/11 |