Deep Learning Architectures for Single-Label and Multi-Label Surgical Tool Classification in Minimally Invasive Surgeries
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/11/6121 |
| Summary: | The integration of Context-Aware Systems (CASs) in Future Operating Rooms (FORs) aims to enhance surgical workflows and outcomes through real-time data analysis. CASs require accurate classification of surgical tools, which enables the understanding of surgical actions. This study proposes a novel deep learning approach for surgical tool classification that combines convolutional neural networks (CNNs), Feature Fusion Modules (FFMs), Squeeze-and-Excitation (SE) networks, and bidirectional long short-term memory (BiLSTM) networks to capture both spatial and temporal features in laparoscopic surgical videos. We explored different modeling scenarios with respect to the location and number of SE blocks for multi-label surgical tool classification on the Cholec80 dataset. Furthermore, we analyzed a single-label surgical tool classification model that uses a simplified and computationally less expensive architecture than the multi-label setting. The single-label model showed better overall performance than the proposed multi-label model, a gap attributed to the increased complexity of identifying multiple tools simultaneously. Nonetheless, our results demonstrated that the proposed CNN-SE-FFM-BiLSTM multi-label model achieved performance competitive with state-of-the-art methods, with excellent performance in detecting tools with complex usage patterns and tools in minority classes. Future work should focus on optimizing models for real-time applications and broadening dataset evaluations to improve performance in diverse surgical environments. These improvements are crucial for the practical implementation of such models in CASs, ultimately aiming to enhance surgical workflows and patient outcomes in FORs. |
|---|---|
| ISSN: | 2076-3417 |
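
The summary contrasts single-label and multi-label tool classification. As a rough illustration only, and not the authors' implementation, the sketch below shows how the two settings typically differ at the output stage: a single-label head picks one tool via softmax, while a multi-label head thresholds an independent sigmoid per tool. The tool names follow the public Cholec80 description rather than this record, and the logits, the 0.5 threshold, and all function names are hypothetical.

```python
import math

# Tool names as given in the public Cholec80 description (an assumption here;
# this record does not list them).
TOOLS = ["Grasper", "Bipolar", "Hook", "Scissors",
         "Clipper", "Irrigator", "SpecimenBag"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_single_label(logits):
    """Single-label setting: exactly one tool per frame (argmax of softmax)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TOOLS[best]


def decode_multi_label(logits, threshold=0.5):
    """Multi-label setting: every tool whose sigmoid score passes the threshold."""
    return [tool for tool, x in zip(TOOLS, logits) if sigmoid(x) >= threshold]


# Hypothetical per-frame logits, e.g. the output of a temporal model for one frame.
frame_logits = [2.1, -3.0, 1.4, -2.5, -4.0, -1.2, -3.3]

single = decode_single_label(frame_logits)
multi = decode_multi_label(frame_logits)
print(single)
print(multi)
```

The multi-label decoder can return zero, one, or several tools for the same frame, which is one reason that setting is harder to evaluate and to learn than the single-label case.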