Multiple Unmanned Aerial Vehicle Collaborative Target Search by DRL: A DQN-Based Multi-Agent Partially Observable Method

As Unmanned Aerial Vehicle (UAV) technology advances, UAVs have attracted widespread attention in both military and civilian fields owing to their low cost and flexibility. In unknown environments, UAVs can significantly reduce the risk of casualties and improve the safety and covertness of missions. Reinforcement Learning allows agents to learn optimal policies through trial and error in the environment, enabling UAVs to respond autonomously to real-time conditions. Because the observation range of UAV sensors is limited, UAV target search missions face the challenge of partial observability. To address this, the Partially Observable Deep Q-Network (PODQN), a DQN-based algorithm, is proposed. PODQN uses a Gated Recurrent Unit (GRU) to remember past observations, integrates a target network, and decomposes the action value for better evaluation. In addition, an artificial potential field is introduced to avoid collisions between UAVs. The simulation environment for UAV target search is constructed as a custom Markov Decision Process. Comparisons with a random strategy, DQN, Double DQN, Dueling DQN, VDN, and QMIX show that the proposed PODQN performs best under different agent configurations.
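The record does not include the paper's code, but the architecture the abstract names — a DQN variant whose encoder is a GRU over the history of partial observations, with the action value decomposed into value and advantage streams and a periodically synced target network — can be sketched. Everything below (class name, layer sizes, the dueling recombination) is an illustrative assumption, not the authors' implementation:

```python
# Minimal sketch of a PODQN-style network, assuming: GRU memory over partial
# observations, dueling-style action-value decomposition, and a separate
# target network. Sizes and names are illustrative, not from the paper.
import copy
import torch
import torch.nn as nn


class PODQNNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)  # memory of past observations
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, a)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim) -- the history of partial observations
        out, h = self.gru(obs_seq, h0)
        last = out[:, -1]  # GRU summary of everything observed so far
        v = self.value(last)
        a = self.advantage(last)
        # Dueling recombination: Q = V + (A - mean A) keeps V identifiable
        q = v + a - a.mean(dim=1, keepdim=True)
        return q, h


online = PODQNNet(obs_dim=16, n_actions=5)
target = copy.deepcopy(online)  # target network, synced periodically for stable TD targets

q, h = online(torch.randn(2, 8, 16))  # e.g. 2 UAVs, 8-step observation histories
```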

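Likewise, the collision handling the abstract attributes to an artificial potential field is commonly realized as a repulsive force that activates when two UAVs come within an influence radius. A minimal sketch under that assumption — the gain K, radius R, and the classic potential shape are standard APF choices, not taken from the paper:

```python
# Hedged sketch of artificial-potential-field collision avoidance: each
# neighbour inside radius R repels the UAV with force growing as separation
# shrinks. K and R are made-up parameters; the paper's exact potential is
# not part of this record.
import numpy as np


def repulsive_force(own_pos, other_positions, R=2.0, K=1.0):
    """Sum of repulsive forces pushing this UAV away from close neighbours."""
    force = np.zeros_like(own_pos, dtype=float)
    for p in other_positions:
        diff = own_pos - p
        d = np.linalg.norm(diff)
        if 0.0 < d < R:
            # Classic APF repulsion: magnitude K * (1/d - 1/R) / d^2,
            # directed away from the neighbour (diff / d).
            force += K * (1.0 / d - 1.0 / R) / d**2 * (diff / d)
    return force


# Example: a UAV at the origin with two neighbours inside the influence radius.
f = repulsive_force(np.array([0.0, 0.0]),
                    [np.array([1.0, 0.0]), np.array([0.0, 1.5])])
print(f)  # resultant force points away from both neighbours
```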
Bibliographic Details
Main Authors: Heng Xu, Dayong Zhu
Affiliation: School of Information and Software Engineering, University of Electronic Science and Technology of China, No. 4, Section 2, North Jianshe Road, Chengdu 610054, China
Format: Article
Language: English
Published: MDPI AG, 2025-01-01
Series: Drones, Vol. 9, No. 1, Article 74
ISSN: 2504-446X
DOI: 10.3390/drones9010074
Subjects: deep q-network; partially observable; unmanned aerial vehicle; multi-agent; target search
Online Access: https://www.mdpi.com/2504-446X/9/1/74