Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances
The study proposes an adaptive Linear Quadratic (LQ) optimal regulator for discrete-time linear systems in the presence of stochastic disturbances through policy iteration with Actor/Critic structure. The existing deterministic policy iteration method realizes an adaptive LQ optimal regulator. Howev...
Main Authors: | Vina Putri Virgiani, Natsuki Ishigaki, Shiro Masuda |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2024-12-01 |
Series: | SICE Journal of Control, Measurement, and System Integration |
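Subjects: | adaptive control; linear quadratic optimal control; reinforcement learning; policy iteration; stochastic disturbances; instrumental variable |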
Online Access: | http://dx.doi.org/10.1080/18824889.2024.2357713 |
_version_ | 1832096620795133952 |
---|---|
author | Vina Putri Virgiani; Natsuki Ishigaki; Shiro Masuda |
author_facet | Vina Putri Virgiani; Natsuki Ishigaki; Shiro Masuda |
author_sort | Vina Putri Virgiani |
collection | DOAJ |
description | The study proposes an adaptive Linear Quadratic (LQ) optimal regulator for discrete-time linear systems subject to stochastic disturbances, based on policy iteration with an Actor/Critic structure. The existing deterministic policy iteration method realizes an adaptive LQ optimal regulator. However, when stochastic disturbances are present, its Critic parameter estimates retain a bias error that degrades control performance. To achieve bias-free policy evaluation, the study introduces a disturbance-influenced term into the Critic parameter estimation and employs a Recursive Instrumental Variable (RIV) method with a properly selected instrumental variable. In addition, the study further modifies the Critic parameter estimation so that the disturbance-influenced term is estimated simultaneously, by introducing an extended parameter representation. Finally, numerical examples show the effectiveness of the proposed method compared with existing methods. |
format | Article |
id | doaj-art-cd249dfa95574946bdcd1fdf19e6f89e |
institution | Kabale University |
issn | 1884-9970 |
language | English |
publishDate | 2024-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | SICE Journal of Control, Measurement, and System Integration |
spelling | doaj-art-cd249dfa95574946bdcd1fdf19e6f89e; 2025-02-05T12:46:15Z; eng; Taylor & Francis Group; SICE Journal of Control, Measurement, and System Integration; ISSN 1884-9970; 2024-12-01; volume 17, issue 1; DOI 10.1080/18824889.2024.2357713; article 2357713; Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances; Vina Putri Virgiani, Natsuki Ishigaki, Shiro Masuda (all Tokyo Metropolitan University); http://dx.doi.org/10.1080/18824889.2024.2357713; adaptive control; linear quadratic optimal control; reinforcement learning; policy iteration; stochastic disturbances; instrumental variable |
title | Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances |
topic | adaptive control; linear quadratic optimal control; reinforcement learning; policy iteration; stochastic disturbances; instrumental variable |
url | http://dx.doi.org/10.1080/18824889.2024.2357713 |
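
The description above attributes the Critic's bias to stochastic disturbances and names a Recursive Instrumental Variable (RIV) estimator as the remedy. The sketch below illustrates that general idea on a scalar example; it is not the authors' algorithm. The plant values (a, b), the fixed gain K, the weights (q, r), the noise level sigma, the constant regressor standing in for a disturbance-influenced term, and the instrument [x_k^2, 1] are all assumptions chosen for illustration. For a fixed policy u = -K x, the stage cost satisfies cost_k = p (x_k^2 - x_{k+1}^2) + lambda + noise, where the noise is correlated with x_{k+1}^2, so recursive least squares is biased, while an instrument built only from x_k is not.

```python
import numpy as np


def recursive_iv(phis, zs, ys, p0=100.0):
    """Recursive instrumental-variable estimate of theta in y = phi^T theta + noise.

    zs holds the instruments; passing zs = phis reduces this to
    ordinary recursive least squares.
    """
    n = phis.shape[1]
    theta = np.zeros(n)
    P = p0 * np.eye(n)  # approximates the inverse of sum(z phi^T)
    for phi, z, y in zip(phis, zs, ys):
        gain = P @ z / (1.0 + phi @ P @ z)
        theta = theta + gain * (y - phi @ theta)
        P = P - np.outer(gain, phi @ P)  # rank-one update (matrix inversion lemma)
    return theta


# Hypothetical scalar plant x_{k+1} = a x_k + b u_k + w_k under the fixed policy u_k = -K x_k.
rng = np.random.default_rng(0)
a, b, K = 0.9, 1.0, 0.4
q, r, sigma = 1.0, 1.0, 1.0
N = 50_000

a_cl = a - b * K                      # closed-loop pole
w = sigma * rng.standard_normal(N)    # stochastic disturbance
x = np.zeros(N + 1)
for k in range(N):
    x[k + 1] = a_cl * x[k] + w[k]

cost = (q + r * K**2) * x[:-1] ** 2   # stage cost q x^2 + r u^2 with u = -K x

# Regressor: temporal difference of the quadratic feature, plus a constant
# that absorbs the disturbance-dependent offset lambda = p * sigma^2.
phis = np.column_stack([x[:-1] ** 2 - x[1:] ** 2, np.ones(N)])
# Instrument: depends on x_k only, hence uncorrelated with the disturbance w_k.
zs = np.column_stack([x[:-1] ** 2, np.ones(N)])

p_true = (q + r * K**2) / (1.0 - a_cl**2)  # solves p = q + r K^2 + a_cl^2 p

theta_ls = recursive_iv(phis, phis, cost)  # recursive least squares
theta_iv = recursive_iv(phis, zs, cost)    # recursive instrumental variable

print("true p:", p_true)
print("RLS estimate of p:", theta_ls[0])
print("RIV estimate of p:", theta_iv[0])
```

In this toy setting the RIV estimate of p should land close to p_true, while the least-squares estimate stays well away from it however long the run is. The second component of each estimate plays the role of the disturbance-dependent offset, loosely analogous to estimating the disturbance-influenced term alongside the Critic parameter.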