Exploring the potential and limitations of deep learning and explainable AI for longitudinal life course analysis
Abstract Background Understanding the complex interplay between life course exposures, such as adverse childhood experiences and environmental factors, and disease risk is essential for developing effective public health interventions. Traditional epidemiological methods, such as regression models a...
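| Main Authors: | Helen Coupland, Neil Scheidwasser, Alexandros Katsiferis, Megan Davies, Seth Flaxman, Naja Hulvej Rod, Swapnil Mishra, Samir Bhatt, H. Juliette T. Unwin |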
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-04-01 |
| Series: | BMC Public Health |
| Subjects: | Deep learning; Life course epidemiology; Explainable artificial intelligence |
| Online Access: | https://doi.org/10.1186/s12889-025-22705-4 |
| _version_ | 1849709592392499200 |
|---|---|
| author | Helen Coupland; Neil Scheidwasser; Alexandros Katsiferis; Megan Davies; Seth Flaxman; Naja Hulvej Rod; Swapnil Mishra; Samir Bhatt; H. Juliette T. Unwin |
| author_sort | Helen Coupland |
| collection | DOAJ |
| description | Abstract. Background: Understanding the complex interplay between life course exposures, such as adverse childhood experiences and environmental factors, and disease risk is essential for developing effective public health interventions. Traditional epidemiological methods, such as regression models and risk scoring, are limited in their ability to capture the non-linear and temporally dynamic nature of these relationships. Deep learning (DL) and explainable artificial intelligence (XAI) are increasingly applied within healthcare settings to identify influential risk factors and enable personalised interventions. However, significant gaps remain in understanding their utility and limitations, especially for sparse longitudinal life course data and how the influential patterns identified using explainability are linked to underlying causal mechanisms. Methods: We conducted a controlled simulation study to assess the performance of various state-of-the-art DL architectures including CNNs and (attention-based) RNNs against XGBoost and logistic regression. Input data was simulated to reflect a generic and generalisable scenario with different rules used to generate multiple realistic outcomes based upon epidemiological concepts. Multiple metrics were used to assess model performance in the presence of class imbalance and SHAP values were calculated. Results: We find that DL methods can accurately detect dynamic relationships that baseline linear models and tree-based methods cannot. However, there is no one model that consistently outperforms the others across all scenarios. We further identify the superior performance of DL models in handling sparse feature availability over time compared to traditional machine learning approaches. Additionally, we examine the interpretability provided by SHAP values, demonstrating that these explanations often misalign with causal relationships, despite excellent predictive and calibrative performance. Conclusions: These insights provide a foundation for future research applying DL and XAI to life course data, highlighting the challenges associated with sparse healthcare data, and the critical need for advancing interpretability frameworks in personalised public health. |
| format | Article |
| id | doaj-art-9e9b6108d41247adb20ad5d9e2f7732e |
| institution | DOAJ |
| issn | 1471-2458 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Public Health |
| spelling | Record: doaj-art-9e9b6108d41247adb20ad5d9e2f7732e (2025-08-20T03:15:14Z); Language: eng; Publisher: BMC; Journal: BMC Public Health; ISSN: 1471-2458; Date: 2025-04-01; Volume 25, issue 1, pages 1-15; DOI: 10.1186/s12889-025-22705-4; Title: Exploring the potential and limitations of deep learning and explainable AI for longitudinal life course analysis; Authors and affiliations: Helen Coupland (Section of Epidemiology, Department of Public Health, University of Copenhagen); Neil Scheidwasser (Section of Epidemiology, Department of Public Health, University of Copenhagen); Alexandros Katsiferis (Section of Epidemiology, Department of Public Health, University of Copenhagen); Megan Davies (Copenhagen Health Complexity Center, Department of Public Health, University of Copenhagen); Seth Flaxman (Department of Computer Science, University of Oxford); Naja Hulvej Rod (Copenhagen Health Complexity Center, Department of Public Health, University of Copenhagen); Swapnil Mishra (Saw Swee Hock School of Public Health & Institute of Data Science, National University of Singapore); Samir Bhatt (Section of Epidemiology, Department of Public Health, University of Copenhagen); H. Juliette T. Unwin (MRC Centre for Global Infectious Disease Analysis, Imperial College); URL: https://doi.org/10.1186/s12889-025-22705-4; Keywords: Deep learning; Life course epidemiology; Explainable artificial intelligence |
| title | Exploring the potential and limitations of deep learning and explainable AI for longitudinal life course analysis |
| topic | Deep learning; Life course epidemiology; Explainable artificial intelligence |
| url | https://doi.org/10.1186/s12889-025-22705-4 |