Exploring the potential and limitations of deep learning and explainable AI for longitudinal life course analysis

Bibliographic Details
Main Authors: Helen Coupland, Neil Scheidwasser, Alexandros Katsiferis, Megan Davies, Seth Flaxman, Naja Hulvej Rod, Swapnil Mishra, Samir Bhatt, H. Juliette T. Unwin
Format: Article
Language: English
Published: BMC, 2025-04-01
Series: BMC Public Health
Subjects: Deep learning; Life course epidemiology; Explainable artificial intelligence
Online Access: https://doi.org/10.1186/s12889-025-22705-4
author Helen Coupland
Neil Scheidwasser
Alexandros Katsiferis
Megan Davies
Seth Flaxman
Naja Hulvej Rod
Swapnil Mishra
Samir Bhatt
H. Juliette T. Unwin
collection DOAJ
description Abstract
Background: Understanding the complex interplay between life course exposures, such as adverse childhood experiences and environmental factors, and disease risk is essential for developing effective public health interventions. Traditional epidemiological methods, such as regression models and risk scoring, are limited in their ability to capture the non-linear and temporally dynamic nature of these relationships. Deep learning (DL) and explainable artificial intelligence (XAI) are increasingly applied in healthcare settings to identify influential risk factors and enable personalised interventions. However, significant gaps remain in understanding their utility and limitations, especially for sparse longitudinal life course data, and in how the influential patterns identified through explainability methods relate to underlying causal mechanisms.
Methods: We conducted a controlled simulation study to assess the performance of several state-of-the-art DL architectures, including convolutional neural networks (CNNs) and (attention-based) recurrent neural networks (RNNs), against XGBoost and logistic regression. Input data were simulated to reflect a generic and generalisable scenario, with different rules used to generate multiple realistic outcomes based on epidemiological concepts. Multiple metrics were used to assess model performance in the presence of class imbalance, and SHapley Additive exPlanations (SHAP) values were calculated.
Results: We find that DL methods can accurately detect dynamic relationships that baseline linear models and tree-based methods cannot. However, no single model consistently outperforms the others across all scenarios. We further identify the superior performance of DL models in handling sparse feature availability over time compared with traditional machine learning approaches. Additionally, we examine the interpretability provided by SHAP values, demonstrating that these explanations often misalign with causal relationships despite excellent predictive performance and calibration.
Conclusions: These insights provide a foundation for future research applying DL and XAI to life course data, highlighting the challenges associated with sparse healthcare data and the critical need to advance interpretability frameworks for personalised public health.
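The abstract describes the workflow only at a high level. As a minimal, purely illustrative sketch (not the authors' implementation), the kind of pipeline it outlines might look like the following: simulate sparse longitudinal exposures, generate an outcome from a non-additive epidemiologically motivated rule, fit an additive baseline and a flexible model, score under class imbalance, and compute SHAP values. The sample size, the sparsity mechanism, the hypothetical "critical period" outcome rule, and all parameter values below are invented for illustration.

```python
# Illustrative sketch only (not the paper's code): simulate sparse
# longitudinal binary exposures, generate an outcome from a non-additive
# "critical period" rule, compare logistic regression with XGBoost,
# and inspect SHAP attributions for the tree model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import average_precision_score
import xgboost as xgb
import shap

rng = np.random.default_rng(0)
n, t = 5000, 10  # individuals, time points (arbitrary illustrative sizes)

# Binary exposures over time; mask some entries to mimic sparse follow-up.
X = rng.binomial(1, 0.3, size=(n, t)).astype(float)
missing = rng.random((n, t)) < 0.2
X[missing] = 0.0  # crude zero-imputation of missed visits

# Hypothetical "critical period with interaction" rule: risk is elevated
# only when exposure occurs at both time 2 and time 5.
logit = -3.0 + 2.5 * (X[:, 2] * X[:, 5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Additive baseline: cannot represent the interaction above exactly.
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Gradient-boosted trees: can capture the interaction.
gbt = xgb.XGBClassifier(
    n_estimators=200, max_depth=3, learning_rate=0.1, eval_metric="logloss"
).fit(X_tr, y_tr)

# Average precision is informative under class imbalance.
for name, model in [("logistic", lr), ("xgboost", gbt)]:
    ap = average_precision_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: average precision = {ap:.3f}")

# SHAP attributions for the tree model; these describe the fitted
# predictor's behaviour, not the causal data-generating rule.
explainer = shap.TreeExplainer(gbt)
shap_values = explainer.shap_values(X_te)
print("mean |SHAP| per time point:", np.abs(shap_values).mean(axis=0).round(3))
```

Because the outcome here depends only on the product of two time points, an additive logistic model cannot represent the rule exactly, whereas a tree ensemble can approximate it; and the SHAP values summarise the fitted predictor rather than the generating mechanism, which is exactly the predictive-versus-causal distinction the abstract highlights.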
format Article
id doaj-art-9e9b6108d41247adb20ad5d9e2f7732e
institution DOAJ
issn 1471-2458
language English
publishDate 2025-04-01
publisher BMC
record_format Article
series BMC Public Health
author affiliations Helen Coupland: Section of Epidemiology, Department of Public Health, University of Copenhagen
Neil Scheidwasser: Section of Epidemiology, Department of Public Health, University of Copenhagen
Alexandros Katsiferis: Section of Epidemiology, Department of Public Health, University of Copenhagen
Megan Davies: Copenhagen Health Complexity Center, Department of Public Health, University of Copenhagen
Seth Flaxman: Department of Computer Science, University of Oxford
Naja Hulvej Rod: Copenhagen Health Complexity Center, Department of Public Health, University of Copenhagen
Swapnil Mishra: Saw Swee Hock School of Public Health & Institute of Data Science, National University of Singapore
Samir Bhatt: Section of Epidemiology, Department of Public Health, University of Copenhagen
H. Juliette T. Unwin: MRC Centre for Global Infectious Disease Analysis, Imperial College
title Exploring the potential and limitations of deep learning and explainable AI for longitudinal life course analysis
topic Deep learning
Life course epidemiology
Explainable artificial intelligence
url https://doi.org/10.1186/s12889-025-22705-4