Tactical intent-driven autonomous air combat behavior generation method
Abstract: With the rapid development and deep application of artificial intelligence, modern air combat is incrementally evolving towards intelligent combat. Although deep reinforcement learning algorithms have driven dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. To address this, this paper proposes a tactical intent-driven method for autonomous air combat behavior generation. First, the paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effect that combining sparse and dense rewards has on the learned policy. Building on this, the decision-making process behind pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the poor stability and slow convergence of deep reinforcement learning algorithms in large-scale state-action spaces, a dueling-noisy-multi-step DQN algorithm is devised, which improves the accuracy of value-function approximation while enhancing the efficiency of exploration and network generalization. Experiments demonstrate the conflict between sparse and dense rewards, and the empirical results show the superior performance and stability of the proposed algorithm compared with baseline algorithms. More intuitively, the strategies learned under different intents exhibit strong interpretability and flexibility, and can provide tactical support for intelligent decision-making in air combat.
Main Authors: | Xingyu Wang, Zhen Yang, Shiyuan Chai, Jichuan Huang, Yupeng He, Deyun Zhou |
---|---|
Affiliation: | School of Electronics and Information, Northwestern Polytechnical University |
Format: | Article |
Language: | English |
Published: | Springer, 2024-12-01 |
Series: | Complex & Intelligent Systems, Vol. 11, Iss. 1, pp. 1–22 |
ISSN: | 2199-4536; 2198-6053 |
Collection: | DOAJ |
Subjects: | Tactical intent; Behavioural strategies; Reward design; Deep reinforcement learning |
Online Access: | https://doi.org/10.1007/s40747-024-01685-9 |
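The abstract describes a reward mapping model from tactical intent to behavior, in which a sparse outcome reward is kept separate from dense shaping terms whose emphasis depends on the pilot's intent. The record does not include the authors' actual formulation, so the following is only a minimal illustrative sketch under that reading; the `TacticalIntent` fields, the reward components, and all names are hypothetical, not the paper's.

```python
from dataclasses import dataclass

# Hypothetical intent encoding: a tactical intent is a set of weights over
# dense reward components. Components and names are illustrative only.
@dataclass
class TacticalIntent:
    w_offense: float   # weight on gaining angular/positional advantage
    w_defense: float   # weight on reducing the opponent's threat
    w_energy: float    # weight on conserving speed/altitude (energy)

def intent_reward(event_reward: float,
                  angle_gain: float,
                  threat_drop: float,
                  energy_gain: float,
                  intent: TacticalIntent) -> float:
    """Sparse event reward (kill/loss/timeout) plus intent-weighted dense shaping."""
    dense = (intent.w_offense * angle_gain
             + intent.w_defense * threat_drop
             + intent.w_energy * energy_gain)
    return event_reward + dense

# Example: an aggressive intent emphasizes offensive geometry over energy.
aggressive = TacticalIntent(w_offense=1.0, w_defense=0.3, w_energy=0.1)
r = intent_reward(event_reward=0.0, angle_gain=0.05, threat_drop=0.01,
                  energy_gain=-0.02, intent=aggressive)
```

Keeping the event term and the shaping terms structurally separate is one plausible way to expose the sparse-versus-dense conflict the abstract reports, since the intent weights can then be varied without altering the terminal objective.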
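The algorithmic contribution named in the abstract is the dueling-noisy-multi-step DQN. The record does not give the authors' architecture or hyperparameters, but the three named ingredients are standard in the DQN literature; below is a minimal PyTorch sketch of how they typically combine: a dueling value/advantage head, factorized-Gaussian noisy layers for exploration, and an n-step bootstrapped target. All class and function names (`NoisyLinear`, `DuelingNoisyQNet`, `n_step_target`) and constants are our own assumptions, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer with factorized Gaussian parameter noise (NoisyNet-style)."""
    def __init__(self, in_features: int, out_features: int, sigma0: float = 0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        self.register_buffer("weight_eps", torch.zeros(out_features, in_features))
        self.register_buffer("bias_eps", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 * bound)
        nn.init.constant_(self.bias_sigma, sigma0 * bound)
        self.reset_noise()

    @staticmethod
    def _scale(x: torch.Tensor) -> torch.Tensor:
        return x.sign() * x.abs().sqrt()

    def reset_noise(self) -> None:
        # Factorized noise: one vector per input and output dimension.
        eps_in = self._scale(torch.randn(self.in_features))
        eps_out = self._scale(torch.randn(self.out_features))
        self.weight_eps.copy_(torch.outer(eps_out, eps_in))
        self.bias_eps.copy_(eps_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:  # learned noise drives exploration during training only
            return F.linear(x,
                            self.weight_mu + self.weight_sigma * self.weight_eps,
                            self.bias_mu + self.bias_sigma * self.bias_eps)
        return F.linear(x, self.weight_mu, self.bias_mu)

class DuelingNoisyQNet(nn.Module):
    """Dueling head over noisy layers: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a')."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = NoisyLinear(hidden, 1)
        self.advantage = NoisyLinear(hidden, n_actions)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.body(s)
        a = self.advantage(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

def n_step_target(rewards: list[float], bootstrap_q: float, done: bool,
                  gamma: float = 0.99) -> float:
    """Multi-step TD target: sum_k gamma^k * r_k + gamma^n * max_a Q'(s_n, a)."""
    ret, g = 0.0, 1.0
    for r in rewards:  # rewards collected over the n-step window
        ret += g * r
        g *= gamma
    return ret + (0.0 if done else g * bootstrap_q)
```

The combination is consistent with the gains the abstract claims: noisy layers replace epsilon-greedy exploration with learned, state-dependent noise; the dueling decomposition sharpens value estimation when many actions have similar value; and n-step targets propagate sparse terminal rewards backwards faster, which helps convergence in large state-action spaces.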