Safety analysis in the era of large language models: A case study of STPA using ChatGPT

Bibliographic Details
Main Authors: Yi Qi, Xingyu Zhao, Siddartha Khastgir, Xiaowei Huang
Format: Article
Language: English
Published: Elsevier, 2025-03-01
Series: Machine Learning with Applications
Subjects: STPA; Safety-critical systems; Large language models; Safe AI; Human machine interaction; Hazards identification
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827025000052
Record ID: doaj-art-22b71a80b0a94bd895a1974f94f4d099
Collection: DOAJ
Institution: Kabale University
ISSN: 2666-8270
Citation: Machine Learning with Applications, Vol. 19 (March 2025), Article 100622, Elsevier

Author affiliations:
Yi Qi: Department of Computer Science, University of Liverpool, Liverpool, L69 3BX, UK
Xingyu Zhao: WMG, University of Warwick, Coventry, CV4 7AL, UK (corresponding author)
Siddartha Khastgir: WMG, University of Warwick, Coventry, CV4 7AL, UK
Xiaowei Huang: Department of Computer Science, University of Liverpool, Liverpool, L69 3BX, UK

Abstract: Can safety analysis leverage Large Language Models (LLMs)? This study examines the application of Systems Theoretic Process Analysis (STPA) to Automatic Emergency Brake (AEB) and Electricity Demand Side Management (DSM) systems, utilising Chat Generative Pre-Trained Transformer (ChatGPT). We investigate the impact of collaboration schemes, input semantic complexity, and prompt engineering on STPA results. Comparative results indicate that using ChatGPT without human intervention may be inadequate due to reliability issues. However, with careful design, it has the potential to outperform human experts. No statistically significant differences were observed when varying the input semantic complexity or using domain-agnostic prompt guidelines. While STPA-specific prompt engineering produced statistically significant and more pertinent results, ChatGPT generally yielded more conservative and less comprehensive outcomes. We also identify future challenges, such as concerns regarding the trustworthiness of LLMs and the need for standardisation and regulation in this field. All experimental data are publicly accessible.

Subjects: STPA; Safety-critical systems; Large language models; Safe AI; Human machine interaction; Hazards identification

Online Access: http://www.sciencedirect.com/science/article/pii/S2666827025000052