Continuous Action Air Combat Maneuver Decision-Making Based on T-MGMM

Bibliographic Details
Main Authors: Junzhe Jiang, Hongming Wang, Zhixing Huang, Zhuangfeng Zhou, Xiang Wu, Wenqin Deng, Xueyun Chen
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10771757/
Collection: DOAJ
Description: In autonomous air combat, tactics are inherently complex, and control inputs are continuous. Traditional reinforcement learning (RL) algorithms often rely on discretization or independent Gaussian assumptions, which fail to capture correlations between control variables, limiting the expressiveness of strategies. Moreover, the highly dynamic and complex nature of battlefield scenarios poses significant challenges for conventional neural networks in modeling the long-term evolution of sequential data. To address these challenges, this paper proposes a novel algorithm, T-MGMM, which integrates Transformer networks with a Multivariate Gaussian Mixture Model (MGMM). The self-attention mechanism of Transformers effectively captures dependencies between variables and key situational information. Meanwhile, MGMM utilizes non-diagonal covariance matrices to account for correlations between actions, enhancing action modeling. This synergy ensures precise sequence modeling and flexible decision-making, making T-MGMM particularly well-suited for the complexities of air combat scenarios. To further improve optimization stability, we introduce internal Kullback-Leibler divergence regularization. Experimental results demonstrate that T-MGMM outperforms state-of-the-art algorithms, achieving higher Elo scores within the same training steps, and showcasing superior effectiveness and robustness in air combat decision-making.
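The description's point about non-diagonal covariance matrices can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the Cholesky parameterization of each covariance, and the two-dimensional example values are all assumptions chosen for illustration. In T-MGMM the mixture parameters would come from the Transformer network, which is omitted here.

```python
import numpy as np


def sample_gmm_action(weights, means, chol_factors, rng):
    """Draw one action from a mixture of full-covariance Gaussians.

    weights:      (K,) mixture weights summing to 1
    means:        (K, D) component means
    chol_factors: (K, D, D) lower-triangular L_k with Sigma_k = L_k @ L_k.T
    """
    k = rng.choice(len(weights), p=weights)      # pick a mixture component
    eps = rng.standard_normal(means.shape[1])    # independent unit noise
    return means[k] + chol_factors[k] @ eps      # L_k correlates the noise


def gmm_log_prob(action, weights, means, chol_factors):
    """Log-density of an action under the mixture (log-sum-exp over components)."""
    d = means.shape[1]
    log_terms = []
    for w, mu, chol in zip(weights, means, chol_factors):
        z = np.linalg.solve(chol, action - mu)            # whitened residual
        log_det = 2.0 * np.sum(np.log(np.diag(chol)))     # log|Sigma_k|
        log_normal = -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)
        log_terms.append(np.log(w) + log_normal)
    log_terms = np.array(log_terms)
    m = log_terms.max()
    return m + np.log(np.sum(np.exp(log_terms - m)))


# Hypothetical two-component mixture over a 2-D action (illustrative values only).
weights = np.array([0.7, 0.3])
means = np.array([[0.2, -0.1], [-0.5, 0.4]])
chol_factors = np.array([
    [[1.0, 0.0], [0.8, 0.6]],   # Sigma = [[1, 0.8], [0.8, 1]]: strongly correlated
    [[0.5, 0.0], [0.0, 0.5]],   # diagonal covariance for comparison
])
rng = np.random.default_rng(0)
action = sample_gmm_action(weights, means, chol_factors, rng)
log_p = gmm_log_prob(action, weights, means, chol_factors)
```

The nonzero off-diagonal entry in the first Cholesky factor is what lets the two action dimensions move together. An independent-Gaussian policy corresponds to forcing every factor to be diagonal, as in the second component.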
ISSN: 2169-3536
Volume: 12
Pages: 178507-178522
DOI: 10.1109/ACCESS.2024.3509215
Author ORCIDs:
Junzhe Jiang: https://orcid.org/0009-0004-4686-7520
Hongming Wang: https://orcid.org/0009-0003-3507-4261
Zhuangfeng Zhou: https://orcid.org/0009-0002-8950-4274
Xueyun Chen: https://orcid.org/0000-0002-7452-0223
Affiliation (all authors): School of Electrical Engineering, Guangxi University, Nanning, China
Subjects: Air combat, deep reinforcement learning, maneuver decision-making, continuous action space, Transformer, Gaussian mixture model