GastroHUN an Endoscopy Dataset of Complete Systematic Screening Protocol for the Stomach

Bibliographic Details
Main Authors: Diego Bravo, Juan Frias, Felipe Vera, Juan Trejos, Carlos Martínez, Martín Gómez, Fabio González, Eduardo Romero
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04401-5
Description
Summary: Endoscopy is vital for detecting and diagnosing gastrointestinal diseases. Systematic examination protocols are key to enhancing detection, particularly for the early identification of premalignant conditions. Publicly available endoscopy image databases are crucial for machine learning research, yet challenges persist, particularly in identifying upper gastrointestinal anatomical landmarks to ensure effective and precise endoscopic procedures. Moreover, many existing datasets have inconsistent labeling and limited accessibility, leading to biased models and reduced generalizability. This paper introduces GastroHUN, an open dataset documenting stomach screening procedures based on a systematic protocol. GastroHUN includes 8,834 images from 387 patients and 4,729 labeled video sequences, all annotated by four experts. The dataset covers 22 anatomical landmarks in the stomach and includes an additional category for unqualified images, making it a valuable resource for AI model development. By providing a robust public dataset and baseline deep learning models for image and sequence classification, GastroHUN serves as a benchmark for future research and aids in the development of more effective algorithms.
ISSN: 2052-4463