Use of Multimodal Artificial Intelligence in Surgical Instrument Recognition
Accurate identification of surgical instruments is crucial for efficient workflows and patient safety within the operating room, particularly in preventing complications such as retained surgical instruments. Artificial Intelligence (AI) models have shown the potential to automate this process. This study evaluates the accuracy of publicly available Large Language Models (LLMs)—ChatGPT-4, ChatGPT-4o, and Gemini—and a specialized commercial mobile application, Surgical-Instrument Directory (SID 2.0), in identifying surgical instruments from images.
| Main Authors: | Syed Ali Haider, Olivia A. Ho, Sahar Borna, Cesar A. Gomez-Cabello, Sophia M. Pressman, Dave Cole, Ajai Sehgal, Bradley C. Leibovich, Antonio Jorge Forte |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Bioengineering |
| Subjects: | artificial intelligence; AI; surgical instrument; multimodal AI; computer vision |
| Online Access: | https://www.mdpi.com/2306-5354/12/1/72 |
author | Syed Ali Haider; Olivia A. Ho; Sahar Borna; Cesar A. Gomez-Cabello; Sophia M. Pressman; Dave Cole; Ajai Sehgal; Bradley C. Leibovich; Antonio Jorge Forte |
author_sort | Syed Ali Haider |
collection | DOAJ |
description | Accurate identification of surgical instruments is crucial for efficient workflows and patient safety within the operating room, particularly in preventing complications such as retained surgical instruments. Artificial Intelligence (AI) models have shown the potential to automate this process. This study evaluates the accuracy of publicly available Large Language Models (LLMs)—ChatGPT-4, ChatGPT-4o, and Gemini—and a specialized commercial mobile application, Surgical-Instrument Directory (SID 2.0), in identifying surgical instruments from images. The study utilized a dataset of 92 high-resolution images of 25 surgical instruments (retractors, forceps, scissors, and trocars) photographed from multiple angles. Model performance was evaluated using accuracy, weighted precision, recall, and F1 score. ChatGPT-4o exhibited the highest accuracy (89.1%) in categorizing instruments (e.g., scissors, forceps). SID 2.0 (77.2%) and ChatGPT-4 (76.1%) achieved comparable accuracy, while Gemini (44.6%) demonstrated lower accuracy in this task. For precise subtype identification of instrument names (such as “Mayo scissors” or “Kelly forceps”), all models had low accuracy, with SID 2.0 having an accuracy of 39.1%, followed by ChatGPT-4o (33.7%). Subgroup analysis revealed that ChatGPT-4 and ChatGPT-4o recognized trocars in all instances. Similarly, Gemini identified surgical scissors in all instances. In conclusion, publicly available LLMs can reliably identify surgical instruments at the category level, with ChatGPT-4o demonstrating an overall edge. However, precise subtype identification remains a challenge for all models. These findings highlight the potential of AI-driven solutions to enhance surgical-instrument management and underscore the need for further refinements to improve accuracy and support patient safety. |
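The description reports model performance as accuracy plus weighted precision, recall, and F1 score over the image set. Below is a minimal Python sketch of how such category-level scores can be computed; the label lists and the use of scikit-learn are illustrative assumptions for this record, not the authors' data or code.

```python
# Minimal sketch: scoring category-level predictions with accuracy and
# weighted precision/recall/F1, as named in the abstract.
# The label lists below are hypothetical placeholders, not study data.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Ground-truth category for each photographed instrument (hypothetical sample).
y_true = ["scissors", "forceps", "retractor", "trocar", "forceps", "scissors"]
# Category predicted by one model (e.g., an LLM's answer mapped to a category).
y_pred = ["scissors", "forceps", "retractor", "trocar", "scissors", "scissors"]

accuracy = accuracy_score(y_true, y_pred)
# average="weighted" weights each class by its support (number of images),
# matching the "weighted precision, recall, and F1 score" reported above.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With average="weighted", each category contributes in proportion to how many images it has, so an imbalanced image set (for example, more forceps photos than trocar photos) does not distort the summary the way an unweighted macro average could.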
format | Article |
id | doaj-art-62e6b4dc1340488f86a1ec2d7221c2d8 |
institution | Kabale University |
issn | 2306-5354 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Bioengineering |
spelling | Bioengineering, Vol. 12, Iss. 1, Article 72 (2025-01-01); DOI: 10.3390/bioengineering12010072. Affiliations: Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA (Syed Ali Haider, Olivia A. Ho, Sahar Borna, Cesar A. Gomez-Cabello, Sophia M. Pressman, Antonio Jorge Forte); Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA (Dave Cole, Ajai Sehgal, Bradley C. Leibovich). |
title | Use of Multimodal Artificial Intelligence in Surgical Instrument Recognition |
topic | artificial intelligence; AI; surgical instrument; multimodal AI; computer vision |
url | https://www.mdpi.com/2306-5354/12/1/72 |