Swarm Confrontation Algorithm for UGV Swarm with Quantity Advantage by a Novel MSRM-MAPOCA Training Method


Bibliographic Details
Main Authors: Huanli Gao, Chongming Zhao, Xinghe Yu, Shuangfei Ren, He Cai
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Actuators
Online Access: https://www.mdpi.com/2076-0825/14/1/15
Description
Summary: This paper considers the swarm confrontation problem for two teams of unmanned ground vehicles (UGVs). Unlike most existing works, where the two teams are identical, we consider two heterogeneous teams: one has the quantity advantage while the other has the resilience advantage. Nevertheless, standard tests verify that the overall capabilities of the two heterogeneous teams are almost the same. The objective of this article is to design a swarm confrontation algorithm for the team with the quantity advantage, based on a multi-agent reinforcement learning training method. To address the issue of sparse reward, which would result in inefficient learning and poor training performance, a novel macro states reward mechanism based on multi-agent posthumous credit assignment (MSRM-MAPOCA) is proposed. Together with a fine-tuned smooth reward design, it can fully exploit the advantage in quantity and thus leads to outstanding training performance. Comprehensive direct and indirect comparative tests conducted on the Unity 3D platform show that the proposed swarm confrontation algorithm outperforms other classic or up-to-date swarm confrontation algorithms in terms of both win rate and efficiency.
ISSN: 2076-0825