Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances

Bibliographic Details
Main Authors: Vina Putri Virgiani, Natsuki Ishigaki, Shiro Masuda
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: SICE Journal of Control, Measurement, and System Integration
Subjects:
Online Access: http://dx.doi.org/10.1080/18824889.2024.2357713
Description
Summary: The study proposes an adaptive Linear Quadratic (LQ) optimal regulator for discrete-time linear systems subject to stochastic disturbances, realized through policy iteration with an Actor/Critic structure. The existing deterministic policy iteration method realizes an adaptive LQ optimal regulator; however, in the presence of stochastic disturbances it suffers from a remaining bias error in the Critic parameter estimation, which degrades control performance. To achieve bias-free policy evaluation, the study introduces a disturbance-influenced term into the Critic parameter estimation and employs a Recursive Instrumental Variable (RIV) method with a properly selected instrumental variable. In addition, the study provides a further modification of the Critic parameter estimation that simultaneously estimates the disturbance-influenced term by introducing an extended parameter representation. Finally, the effectiveness of the proposed method is demonstrated against the existing methods through numerical examples.
ISSN: 1884-9970
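
Illustrative code sketch (not part of the article): the summary above describes removing the disturbance-induced bias in the Critic parameter estimation by means of a Recursive Instrumental Variable (RIV) method. The Python sketch below shows a generic RIV recursion under the assumption of a linear-in-parameters regression y_t = phi_t' theta + e_t, where phi_t would collect the quadratic state/input terms of the Critic, theta stacks the Critic parameters, and the instrument zeta_t is chosen to be correlated with phi_t but uncorrelated with the disturbance-induced error e_t. The class name RecursiveIV, the initial gain p0, and all variable names are assumptions for illustration, not the article's notation.

    import numpy as np

    class RecursiveIV:
        """Generic recursive instrumental-variable (RIV) estimator (sketch).

        Assumed model: y_t = phi_t^T theta + e_t, with instrument zeta_t
        uncorrelated with e_t but correlated with the regressor phi_t.
        """

        def __init__(self, dim, p0=1e3):
            self.theta = np.zeros(dim)     # parameter estimate (Critic weights)
            self.P = p0 * np.eye(dim)      # gain matrix (not a true covariance in RIV)

        def update(self, phi, zeta, y):
            # One RIV step with regressor phi, instrument zeta, measured output y.
            denom = 1.0 + phi @ self.P @ zeta
            gain = (self.P @ zeta) / denom
            self.theta = self.theta + gain * (y - phi @ self.theta)
            self.P = self.P - np.outer(gain, phi @ self.P)
            return self.theta

    # Hypothetical usage inside a policy-evaluation loop:
    # est = RecursiveIV(dim=6)
    # for phi_t, zeta_t, y_t in data_stream:
    #     theta_hat = est.update(phi_t, zeta_t, y_t)

In a policy-iteration scheme the instrument is what removes the bias: because zeta_t is constructed to be uncorrelated with the disturbance-induced error, the normal equations solved by the recursion are not contaminated by that error, unlike ordinary recursive least squares.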