Swarm Confrontation Algorithm for UGV Swarm with Quantity Advantage by a Novel MSRM-MAPOCA Training Method

This paper considers the swarm confrontation problem for two teams of unmanned ground vehicles (UGVs). Unlike most existing works, in which the two teams are identical, we consider the scenario of two heterogeneous teams. In particular, one team has the quantity advantage while the other h...

Bibliographic Details
Main Authors:	Huanli Gao, Chongming Zhao, Xinghe Yu, Shuangfei Ren, He Cai
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Actuators
Subjects:	swarm confrontation; multi-agent reinforcement learning; macro states reward mechanism; smooth reward; unmanned ground vehicle
Online Access:	https://www.mdpi.com/2076-0825/14/1/15

_version_	1832589522258362368
author	Huanli Gao, Chongming Zhao, Xinghe Yu, Shuangfei Ren, He Cai
author_sort	Huanli Gao
collection	DOAJ
description	This paper considers the swarm confrontation problem for two teams of unmanned ground vehicles (UGVs). Unlike most existing works, in which the two teams are identical, we consider the scenario of two heterogeneous teams. In particular, one team has the quantity advantage while the other has the resilience advantage. Nevertheless, standard tests verify that the overall capabilities of these two heterogeneous teams are almost the same. The objective of this article is to design a swarm confrontation algorithm for the team with the quantity advantage, based on a multi-agent reinforcement learning training method. To address the issue of sparse rewards, which results in inefficient learning and poor training performance, a novel macro states reward mechanism based on multi-agent posthumous credit assignment (MSRM-MAPOCA) is proposed. Together with a fine-tuned smooth reward design, it can fully exploit the advantage in quantity and thus leads to outstanding training performance. Comprehensive direct and indirect comparative tests conducted on the Unity 3D platform show that the proposed swarm confrontation algorithm outperforms other classic or up-to-date swarm confrontation algorithms in terms of both win rate and efficiency.
format	Article
id	doaj-art-15ee579b611743c7ba396fde325d1f80
institution	Kabale University
issn	2076-0825
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Actuators
spelling	doaj-art-15ee579b611743c7ba396fde325d1f80; 2025-01-24T13:15:10Z; eng; MDPI AG; Actuators; 2076-0825; 2025-01-01; vol. 14, no. 1, art. 15; 10.3390/act14010015; Swarm Confrontation Algorithm for UGV Swarm with Quantity Advantage by a Novel MSRM-MAPOCA Training Method; Huanli Gao, Chongming Zhao, Xinghe Yu, Shuangfei Ren, He Cai (all: School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China); https://www.mdpi.com/2076-0825/14/1/15; swarm confrontation; multi-agent reinforcement learning; macro states reward mechanism; smooth reward; unmanned ground vehicle
title	Swarm Confrontation Algorithm for UGV Swarm with Quantity Advantage by a Novel MSRM-MAPOCA Training Method
topic	swarm confrontation; multi-agent reinforcement learning; macro states reward mechanism; smooth reward; unmanned ground vehicle
url	https://www.mdpi.com/2076-0825/14/1/15

Similar Items