Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods

Purpose: This study aims to classify Kellgren–Lawrence (KL) osteoarthritis stages using knee anteroposterior X-ray images by comparing two deep learning (DL) methodologies: a traditional single-model approach and a proposed multi-model approach. We addressed three core research questions in this study...

Bibliographic Details
Main Authors: Sahika Betul Yayli, Kutay Kılıç, Salih Beyaz
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-02-01
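Series: Frontiers in Artificial Intelligence
Subjects: artificial intelligence; deep learning; transfer learning; Kellgren–Lawrence; gonarthrosis; medical imaging
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2025.1413820/full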
_version_ 1832539948769607680
author Sahika Betul Yayli
Kutay Kılıç
Salih Beyaz
author_facet Sahika Betul Yayli
Kutay Kılıç
Salih Beyaz
author_sort Sahika Betul Yayli
collection DOAJ
description Purpose: This study aims to classify Kellgren–Lawrence (KL) osteoarthritis stages using knee anteroposterior X-ray images by comparing two deep learning (DL) methodologies: a traditional single-model approach and a proposed multi-model approach. We addressed three core research questions in this study: (1) How effective are single-model and multi-model deep learning approaches in classifying KL stages? (2) How do seven convolutional neural network (CNN) architectures perform across four distinct deep learning tasks? (3) What is the impact of CLAHE (Contrast Limited Adaptive Histogram Equalization) on classification performance?
Approach: We created a dataset of 14,607 annotated knee AP X-rays from three hospitals. The knee joint region was isolated using a YOLOv5 object detection model. The multi-model approach utilized three DL models: one for osteophyte detection, another for joint space narrowing analysis, and a third to combine these outputs with demographic and image data for KL classification. The single-model approach directly classified KL stages as a benchmark. Seven CNN architectures (NfNet-F0/F1, EfficientNet-B0/B3, Inception-ResNet-v2, VGG16) were trained with and without CLAHE augmentation.
Results: The single-model approach achieved an F1-score of 0.763 and accuracy of 0.767, outperforming the multi-model strategy, which scored 0.736 and 0.740. Different models performed best across tasks, underscoring the need for task-specific architecture selection. CLAHE negatively impacted most models, with only one showing a marginal improvement of 0.3%.
Conclusion: The single-model approach was more effective for KL grading, surpassing metrics in existing literature. These findings emphasize the importance of task-specific architectures and preprocessing. Future studies should explore ensemble modeling, advanced augmentations, and clinical validation to enhance applicability.
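CLAHE, named in the description above as the preprocessing step under test, limits how much any single gray level can dominate a local histogram before that histogram is equalized. The following is a minimal sketch of that clipping idea in plain Python, assuming one tile of 8-bit gray levels. The function name `clip_limited_equalize`, its `clip_limit` parameter (taken here as an absolute pixel count per bin) and the sample tile are hypothetical and are not taken from the study; full CLAHE additionally divides the image into tiles and interpolates between the per-tile mappings, which this sketch omits.

```python
def clip_limited_equalize(tile, clip_limit=4, levels=256):
    """Equalize one tile of integer gray levels using a clipped histogram.

    Hypothetical helper for illustration only: it covers the
    contrast-limiting step of CLAHE on a single tile, not the tiling
    and interpolation of the full algorithm.
    """
    # Count how often each gray level occurs in the tile.
    hist = [0] * levels
    for row in tile:
        for value in row:
            hist[value] += 1

    # Clip every bin at clip_limit and collect the excess counts.
    excess = 0
    for i in range(levels):
        if hist[i] > clip_limit:
            excess += hist[i] - clip_limit
            hist[i] = clip_limit

    # Spread the excess evenly over all bins; the remainder goes to the
    # lowest bins so that the total pixel count is preserved.
    share, rest = divmod(excess, levels)
    for i in range(levels):
        hist[i] += share + (1 if i < rest else 0)

    # Turn the cumulative histogram into a gray-level lookup table.
    total = sum(hist)
    lut = []
    running = 0
    for count in hist:
        running += count
        lut.append(round((levels - 1) * running / total))

    # Map every pixel through the lookup table.
    return [[lut[value] for value in row] for row in tile]


# Hypothetical 2x4 tile dominated by gray level 10.
tile = [[10, 10, 10, 12],
        [10, 11, 12, 200]]
plain = clip_limited_equalize(tile, clip_limit=100)   # limit never reached
clipped = clip_limited_equalize(tile, clip_limit=1)   # strong limiting
```

With a limit high enough that no bin is clipped, the function reduces to ordinary histogram equalization; lowering the limit flattens the mapping and reduces the contrast gain between the dominant gray levels, which is the behaviour the "contrast limited" part of the name refers to.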
format Article
id doaj-art-1007e528f0604af6998a3b34c8e957bc
institution Kabale University
issn 2624-8212
language English
publishDate 2025-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj-art-1007e528f0604af6998a3b34c8e957bc
2025-02-05T07:31:47Z
eng
Frontiers Media S.A.
Frontiers in Artificial Intelligence
2624-8212
2025-02-01
8
10.3389/frai.2025.1413820
1413820
Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
Sahika Betul Yayli (0)
Kutay Kılıç (1)
Salih Beyaz (2)
Artificial Intelligence and Digital Analytics Solutions, Turkcell Technology, Istanbul, Türkiye
Artificial Intelligence and Digital Analytics Solutions, Turkcell Technology, Istanbul, Türkiye
Orthopedics and Traumatology Department, Adana Turgut Noyan Research and Training Centre, Baskent University, Adana, Türkiye
spellingShingle Sahika Betul Yayli
Kutay Kılıç
Salih Beyaz
Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
Frontiers in Artificial Intelligence
artificial intelligence
deep learning
transfer learning
Kellgren–Lawrence
gonarthrosis
medical imaging
title Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
title_full Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
title_fullStr Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
title_full_unstemmed Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
title_short Deep learning in gonarthrosis classification: a comparative study of model architectures and single vs. multi-model methods
title_sort deep learning in gonarthrosis classification a comparative study of model architectures and single vs multi model methods
topic artificial intelligence
deep learning
transfer learning
Kellgren–Lawrence
gonarthrosis
medical imaging
url https://www.frontiersin.org/articles/10.3389/frai.2025.1413820/full