Heterogeneous Multi-Agent Deep Reinforcement Learning for Cluster-Based Spectrum Sharing in UAV Swarms
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Drones |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2504-446X/9/5/377 |
| Summary: | Unmanned aerial vehicle (UAV) swarms are widely used in both military and civilian applications. Because spectrum resources are scarce, UAV swarm clustering has emerged as an effective method for sharing spectrum among UAV swarms. This paper introduces a distributed heterogeneous multi-agent deep reinforcement learning algorithm, HMDRL-UC, designed to address the cluster-based spectrum sharing problem in heterogeneous UAV swarms. A heterogeneous UAV swarm consists of two types of UAVs: cluster heads (CHs) and cluster members (CMs). Each UAV is equipped with an intelligent agent that executes the deep reinforcement learning (DRL) algorithm. Correspondingly, HMDRL-UC consists of two parts: multi-agent proximal policy optimization for cluster heads (MAPPO-H) and independent proximal policy optimization for cluster members (IPPO-M). MAPPO-H enables the CHs to decide on cluster selection and movement position, while the CMs use IPPO-M to cluster autonomously given only partial channel distribution information (CDI). Experiments show that HMDRL-UC can handle dynamic UAV swarm scenarios under partial CDI and outperforms three existing algorithms in average throughput, intra-cluster communication delay, and minimum signal-to-noise ratio (SNR). |
| ISSN: | 2504-446X |
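
Both components named in the summary, MAPPO-H and IPPO-M, build on proximal policy optimization (PPO). This record does not give the paper's loss function, network design, or hyperparameters, so the following is only a minimal sketch of the standard PPO clipped surrogate loss, not the authors' implementation. The function name `ppo_clip_loss` and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss, averaged over samples.

    Generic PPO only: the inputs, the function name, and clip_eps=0.2 are
    assumptions for illustration, not values taken from the HMDRL-UC paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (minimum) choice between the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)
```

In an independent setup such as IPPO, each agent would typically evaluate a loss like this on its own trajectories. Multi-agent PPO variants usually differ in how the advantages are estimated, commonly with a critic that sees shared information. Whether HMDRL-UC follows that pattern is not stated in this record.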