Safety analysis in the era of large language models: A case study of STPA using ChatGPT
Can safety analysis leverage Large Language Models (LLMs)? This study examines the application of Systems Theoretic Process Analysis (STPA) to Automatic Emergency Brake (AEB) and Electricity Demand Side Management (DSM) systems, utilising Chat Generative Pre-Trained Transformer (ChatGPT). We investigate the impact of collaboration schemes, input semantic complexity, and prompt engineering on STPA results. Comparative results indicate that using ChatGPT without human intervention may be inadequate due to reliability issues. However, with careful design, it has the potential to outperform human experts. No statistically significant differences were observed when varying the input semantic complexity or using domain-agnostic prompt guidelines. While STPA-specific prompt engineering produced statistically significant and more pertinent results, ChatGPT generally yielded more conservative and less comprehensive outcomes. We also identify future challenges, such as concerns regarding the trustworthiness of LLMs and the need for standardisation and regulation in this field. All experimental data are publicly accessible.
Main Authors: | Yi Qi, Xingyu Zhao, Siddartha Khastgir, Xiaowei Huang |
---|---|
Author Affiliations: | Yi Qi and Xiaowei Huang: Department of Computer Science, University of Liverpool, Liverpool, L69 3BX, UK; Xingyu Zhao (corresponding author) and Siddartha Khastgir: WMG, University of Warwick, Coventry, CV4 7AL, UK |
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-03-01 |
Series: | Machine Learning with Applications, Vol. 19 (2025), Article 100622 |
ISSN: | 2666-8270 |
Collection: | DOAJ (Directory of Open Access Journals) |
Subjects: | STPA; Safety–critical systems; Large language models; Safe AI; Human machine interaction; Hazards identification |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827025000052 |