Reward shaping-based deep reinforcement learning for look-ahead dispatch with rolling-horizon

Bibliographic Details
Main Authors: Hongsheng Xu, Yungui Xu, Ke Wang, Yaping Li, Abdullah Al Ahad
Format: Article
Language: English
Published: Elsevier, 2025-07-01
Series: International Journal of Electrical Power & Energy Systems
Online Access: http://www.sciencedirect.com/science/article/pii/S0142061525002248
Description
Summary: The increasing penetration of renewable energy exacerbates the challenge of designing an effective and adaptable model-driven Look-ahead Dispatch (LAD) method. Recently, deep reinforcement learning (DRL) methods have shown enormous potential for developing dispatching agents with self-learning ability, owing to their superior generalization, adaptability, and computational efficiency. However, existing DRL-based LAD methods overlook the discounting effect when calculating the immediate total reward for LAD, and they pay little attention to trial-and-error reward design or to expected discounted returns that could reflect the true performance metrics of LAD. This paper therefore proposes novel reward shaping (RS)-based DRL algorithms for the rolling-horizon LAD problem. A method is proposed for accurately estimating the look-ahead discount factor that best matches different look-ahead horizons (LAHs). Shaped reward functions are designed, and an RS-based regularization is also proposed by employing a potential function. Case studies on the SG 126-bus and IEEE 118-bus systems demonstrate the effectiveness of the proposed improvements, as well as the superiority and adaptability of the proposed DRL algorithms in both training and testing performance.
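
The abstract names two concrete ingredients without giving formulas: a look-ahead discount factor matched to the look-ahead horizon, and reward shaping built on a potential function. The Python sketch below illustrates what such components commonly look like; the horizon-matching heuristic and all names in it are illustrative assumptions, not the authors' actual estimator or reward design.

def horizon_matched_gamma(lah_steps: int) -> float:
    # Heuristic sketch: choose gamma so the effective horizon
    # 1 / (1 - gamma) equals the look-ahead horizon. This is an
    # assumption for illustration; the paper proposes its own
    # estimator, which the abstract does not detail.
    return 1.0 - 1.0 / lah_steps

def shaped_reward(r, s, s_next, potential, gamma):
    # Classic potential-based shaping (Ng et al., 1999):
    # F(s, s') = gamma * Phi(s') - Phi(s). Adding F to the raw
    # reward densifies feedback while provably preserving the
    # optimal policy.
    return r + gamma * potential(s_next) - potential(s)

# Example: a 4-step rolling horizon yields gamma = 0.75 under this heuristic.
gamma = horizon_matched_gamma(4)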
ISSN: 0142-0615