Expert of Experts Verification and Alignment (EVAL) Framework for Large Language Models Safety in Gastroenterology
Abstract Large language models generate plausible text responses to medical questions, but inaccurate responses pose significant risks in medical decision-making. Grading LLM outputs to determine the best model or answer is time-consuming and impractical in clinical settings; therefore, we introduce...
| Main Authors: | Mauro Giuffrè, Kisung You, Ziteng Pang, Simone Kresevic, Sunny Chung, Ryan Chen, Youngmin Ko, Colleen Chan, Theo Saarinen, Milos Ajcevic, Lory S. Crocè, Guadalupe Garcia-Tsao, Ian Gralnek, Joseph J. Y. Sung, Alan Barkun, Loren Laine, Jasjeet Sekhon, Bradly Stadie, Dennis L. Shung |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-025-01589-z |
Similar Items
- Usability and adoption in a randomized trial of GutGPT a GenAI tool for gastrointestinal bleeding
  by: Sunny Chung, et al.
  Published: (2025-08-01)
- PolEval 2022/23 Challenge Tasks and Results
  by: Łukasz Kobyliński, et al.
  Published: (2023-09-01)
- REVISITING THE CONTENT OF THE TERMS «EXPERT» AND «FORENSIC EXPERT»
  by: Lada F. Paramonova
  Published: (2018-03-01)
- EvalRound+ Bootstrapping and Its Rigorous Analysis for CKKS Scheme
  by: Hyewon Sung, et al.
  Published: (2025-01-01)
- Safety Design and Evaluation on Traction System of Mass Transit Vehicles
  by: JIANG Yue-li, et al.
  Published: (2013-01-01)