Safety analysis in the era of large language models: A case study of STPA using ChatGPT
Can safety analysis leverage Large Language Models (LLMs)? This study examines the application of Systems Theoretic Process Analysis (STPA) to Automatic Emergency Brake (AEB) and Electricity Demand Side Management (DSM) systems, utilising Chat Generative Pre-Trained Transformer (ChatGPT). We investigate the impact of collaboration schemes, input semantic complexity, and prompt engineering on STPA results. Comparative results indicate that using ChatGPT without human intervention may be inadequate due to reliability issues. However, with careful design, it has the potential to outperform human experts. No statistically significant differences were observed when varying the input semantic complexity or using domain-agnostic prompt guidelines. While STPA-specific prompt engineering produced statistically significant and more pertinent results, ChatGPT generally yielded more conservative and less comprehensive outcomes. We also identify future challenges, such as concerns regarding the trustworthiness of LLMs and the need for standardisation and regulation in this field. All experimental data are publicly accessible.
Main Authors: | Yi Qi, Xingyu Zhao, Siddartha Khastgir, Xiaowei Huang |
---|---|
Author Affiliations: | Yi Qi and Xiaowei Huang: Department of Computer Science, University of Liverpool, Liverpool, L69 3BX, UK; Xingyu Zhao (corresponding author) and Siddartha Khastgir: WMG, University of Warwick, Coventry, CV4 7AL, UK |
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-03-01 |
Series: | Machine Learning with Applications, Vol. 19 (2025), Article 100622 |
ISSN: | 2666-8270 |
Collection: | DOAJ (Directory of Open Access Journals) |
Subjects: | STPA; Safety–critical systems; Large language models; Safe AI; Human machine interaction; Hazards identification |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827025000052 |