Optimization of State Clustering and Safety Verification in Deep Reinforcement Learning Using KMeans++ and Probabilistic Model Checking

Bibliographic Details
Main Authors:	Ryeonggu Kwon, Gihwon Kwon
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Safety verification; deep reinforcement learning; state clustering; discrete-time Markov chain
Online Access:	https://ieeexplore.ieee.org/document/10879317/

Description
Summary:	Ensuring the safety of Deep Reinforcement Learning (DRL) systems remains a significant challenge, particularly in real-time applications such as autonomous driving and robotics, where incorrect decisions can lead to catastrophic failures. This study proposes a novel safety verification framework that combines state abstraction with probabilistic model checking to quantitatively analyze failure risks. The continuous state space is clustered using the KMeans++ algorithm, enabling efficient state space reduction. A Discrete-Time Markov Chain (DTMC) model is then constructed for each cluster, capturing probabilistic transitions between abstracted states. The PRISM model checker is employed to verify failure probabilities and invariance properties, providing a rigorous quantitative evaluation of system safety. Counterexample analysis identifies critical failure paths, offering actionable insights for policy improvement. Experimental results demonstrate that the optimal number of clusters balances state space reduction with accurate failure analysis, enabling scalable verification. By leveraging model checking within an abstracted state space, this approach enhances the reliability and safety of DRL systems and establishes a pathway for their deployment in safety-critical domains.
ISSN:	2169-3536
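
The pipeline outlined in the summary (cluster continuous states, estimate a DTMC over the clusters, then compute failure probabilities) can be illustrated with a minimal sketch. This is not the authors' implementation: the episode data, the function names, and the extra absorbing `fail`/`done` states are all assumptions made for illustration, and a plain fixed-point iteration stands in for the PRISM model checker (the analogous PRISM query would be a reachability property such as `P=? [ F "fail" ]`). The seeding assumes there are at least k distinct state values.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(point, centres):
    """Index of the nearest centre, i.e. the abstract state of `point`."""
    return min(range(len(centres)), key=lambda i: sq_dist(point, centres[i]))

def kmeanspp_init(points, k, rng):
    """k-means++ seeding: each new centre is sampled with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres

def kmeans(points, k, iters=20, seed=0):
    """Lloyd iterations started from the k-means++ seeds."""
    rng = random.Random(seed)
    centres = kmeanspp_init(points, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centres)].append(p)
        # An empty cluster keeps its previous centre.
        centres = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
                   for i, g in enumerate(groups)]
    return centres

def build_dtmc(episodes, centres):
    """Estimate transition probabilities between clusters from observed
    episodes. Indices k and k+1 are hypothetical absorbing states for
    'episode failed' and 'episode ended safely'."""
    k = len(centres)
    fail, done = k, k + 1
    counts = [[0] * (k + 2) for _ in range(k + 2)]
    for states, failed in episodes:
        labels = [assign(s, centres) for s in states]
        for a, b in zip(labels, labels[1:]):
            counts[a][b] += 1
        counts[labels[-1]][fail if failed else done] += 1
    counts[fail][fail] = counts[done][done] = 1  # absorbing self-loops
    P = []
    for i, row in enumerate(counts):
        total = sum(row)
        # Clusters never visited get a self-loop so every row sums to 1.
        P.append([c / total for c in row] if total
                 else [float(j == i) for j in range(k + 2)])
    return P

def failure_probability(P, fail, iters=1000):
    """Probability of eventually reaching `fail` from each state, computed
    by iterating x = P x from below with x[fail] fixed at 1."""
    n = len(P)
    x = [0.0] * n
    x[fail] = 1.0
    for _ in range(iters):
        x = [1.0 if i == fail else sum(P[i][j] * x[j] for j in range(n))
             for i in range(n)]
    return x

# Made-up 1-D episodes: (visited states, whether the episode ended in failure).
episodes = [
    ([(0.0,), (0.1,), (10.0,)], True),
    ([(0.2,), (0.1,), (0.0,)], False),
    ([(0.1,), (10.1,)], True),
    ([(9.9,), (0.0,)], False),
]
points = [s for states, _ in episodes for s in states]
centres = kmeans(points, k=2)
P = build_dtmc(episodes, centres)
risk = failure_probability(P, fail=len(centres))
for i, c in enumerate(centres):
    print(f"cluster {i} centre={c} P(eventually fail)={risk[i]:.3f}")
```

In this sketch the number of clusters `k` plays the role the summary assigns to the "optimal number of clusters": a larger `k` gives a finer abstraction but leaves fewer observed transitions per cluster from which to estimate each row of the DTMC.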

Similar Items