Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances

Bibliographic Details
Main Authors: Vina Putri Virgiani, Natsuki Ishigaki, Shiro Masuda
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: SICE Journal of Control, Measurement, and System Integration
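Subjects: adaptive control; linear quadratic optimal control; reinforcement learning; policy iteration; stochastic disturbances; instrumental variable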
Online Access: http://dx.doi.org/10.1080/18824889.2024.2357713
_version_ 1832096620795133952
author Vina Putri Virgiani
Natsuki Ishigaki
Shiro Masuda
collection DOAJ
description This study proposes an adaptive linear quadratic (LQ) optimal regulator for discrete-time linear systems subject to stochastic disturbances, based on policy iteration with an Actor/Critic structure. The existing deterministic policy iteration method realizes an adaptive LQ optimal regulator; under stochastic disturbances, however, its Critic parameter estimates retain a bias error that degrades performance. To achieve bias-free policy evaluation, the study introduces a disturbance-influenced term into the Critic parameter estimation and employs a Recursive Instrumental Variable (RIV) method with a properly selected instrumental variable. The study also provides a further modification of the Critic parameter estimation that simultaneously estimates the disturbance-influenced term by introducing an extended parameter representation. Finally, numerical examples show the effectiveness of the proposed method compared with existing methods.
format Article
id doaj-art-cd249dfa95574946bdcd1fdf19e6f89e
institution Kabale University
issn 1884-9970
language English
publishDate 2024-12-01
publisher Taylor & Francis Group
record_format Article
series SICE Journal of Control, Measurement, and System Integration
spelling doaj-art-cd249dfa95574946bdcd1fdf19e6f89e | 2025-02-05T12:46:15Z | eng | Taylor & Francis Group | SICE Journal of Control, Measurement, and System Integration | 1884-9970 | 2024-12-01 | 17 | 1 | 10.1080/18824889.2024.2357713 | 2357713 | Vina Putri Virgiani (Tokyo Metropolitan University) | Natsuki Ishigaki (Tokyo Metropolitan University) | Shiro Masuda (Tokyo Metropolitan University)
title Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances
topic adaptive control
linear quadratic optimal control
reinforcement learning
policy iteration
stochastic disturbances
instrumental variable
url http://dx.doi.org/10.1080/18824889.2024.2357713