In-context learning enables multimodal large language models to classify cancer pathology images
Abstract Medical image classification requires labeled, task-specific datasets which are used to train deep learning networks de novo, or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an altern...
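| Main Authors: | Dyke Ferber, Georg Wölflein, Isabella C. Wiest, Marta Ligero, Srividhya Sainath, Narmin Ghaffari Laleh, Omar S. M. El Nahhas, Gustav Müller-Franzes, Dirk Jäger, Daniel Truhn, Jakob Nikolas Kather |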
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2024-11-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-024-51465-9 |
| _version_ | 1849221064130822144 |
|---|---|
| author | Dyke Ferber Georg Wölflein Isabella C. Wiest Marta Ligero Srividhya Sainath Narmin Ghaffari Laleh Omar S. M. El Nahhas Gustav Müller-Franzes Dirk Jäger Daniel Truhn Jakob Nikolas Kather |
| author_sort | Dyke Ferber |
| collection | DOAJ |
| description | Abstract: Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. Yet in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate the model Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) on cancer image processing with in-context learning on three cancer histopathology tasks of high importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data is scarce. |
| format | Article |
| id | doaj-art-3fa7eb85942f4c83b6c9d066cad3cbe7 |
| institution | Kabale University |
| issn | 2041-1723 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-3fa7eb85942f4c83b6c9d066cad3cbe7; 2024-11-24T12:34:47Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2024-11-01; 151112; 10.1038/s41467-024-51465-9; In-context learning enables multimodal large language models to classify cancer pathology images. Authors and affiliations: Dyke Ferber (National Center for Tumor Diseases (NCT), Heidelberg University Hospital); Georg Wölflein (School of Computer Science, University of St Andrews); Isabella C. Wiest (Else Kroener Fresenius Center for Digital Health, Technical University Dresden); Marta Ligero (Else Kroener Fresenius Center for Digital Health, Technical University Dresden); Srividhya Sainath (Else Kroener Fresenius Center for Digital Health, Technical University Dresden); Narmin Ghaffari Laleh (Else Kroener Fresenius Center for Digital Health, Technical University Dresden); Omar S. M. El Nahhas (Else Kroener Fresenius Center for Digital Health, Technical University Dresden); Gustav Müller-Franzes (Department of Diagnostic and Interventional Radiology, University Hospital Aachen); Dirk Jäger (National Center for Tumor Diseases (NCT), Heidelberg University Hospital); Daniel Truhn (Department of Diagnostic and Interventional Radiology, University Hospital Aachen); Jakob Nikolas Kather (National Center for Tumor Diseases (NCT), Heidelberg University Hospital). https://doi.org/10.1038/s41467-024-51465-9 |
| title | In-context learning enables multimodal large language models to classify cancer pathology images |
| url | https://doi.org/10.1038/s41467-024-51465-9 |