A Closer Look at Invalid Action Masking in Policy Gradient Algorithms

In recent years, Deep Reinforcement Learning (DRL) algorithms have achieved state-of-the-art performance in many challenging strategy games. Because these games have complicated rules, an action sampled from the full discrete action distribution predicted by the learned policy is likely to be invali...

Bibliographic Details
Main Authors: Shengyi Huang, Santiago Ontañón
Format: Article
Language: English
Published: LibraryPress@UF 2022-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/130584

Similar Items