Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning

In this study, an adaptive impact-time-control cooperative guidance law based on deep reinforcement learning considering field-of-view (FOV) constraints is proposed for high-speed UAVs with time-varying velocity. Firstly, a reinforcement learning framework for the high-speed UAVs’ guidance problem i...

Bibliographic Details
Main Authors: Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang
Format: Article
Language: English
Published: MDPI AG 2025-03-01
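Series: Drones
Subjects: multiple high-speed UAVs, cooperative guidance, impact-time-control guidance, reinforcement learning, field-of-view constraints
Online Access: https://www.mdpi.com/2504-446X/9/4/262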
_version_ 1850183471279898624
author Zhenyu Liu
Gang Lei
Yong Xian
Leliang Ren
Shaopeng Li
Daqiao Zhang
author_sort Zhenyu Liu
collection DOAJ
description In this study, an adaptive impact-time-control cooperative guidance law based on deep reinforcement learning is proposed for high-speed UAVs with time-varying velocity, taking field-of-view (FOV) constraints into account. First, a reinforcement learning framework for the high-speed UAV guidance problem is established. The optimization objective is to maximize the impact velocity, while the constraints on impact time, dive attack, and FOV are considered simultaneously. The time-to-go estimation method is improved so that it can be applied to high-speed UAVs with time-varying velocity. Then, to improve the applicability and robustness of the agent, environmental uncertainties, including aerodynamic parameter errors, observation noise, and random target maneuvers, are incorporated into the training process. Furthermore, inspired by the RL² algorithm, a recurrent layer is introduced into both the policy and value networks. In this way, the agent can automatically adapt to different mission scenarios by updating the hidden states of the recurrent layer. In addition, a compound reward function is designed to train the agent to satisfy the requirements of impact-time control and dive attack simultaneously. Finally, the effectiveness and robustness of the proposed guidance law are validated through numerical simulations conducted across a wide range of scenarios.
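The description names a compound reward covering impact time, dive attack, FOV, and impact velocity, but this record does not give its form. The following is only a minimal sketch of how such terms might be combined. The function name, the Gaussian shaping, the flat FOV penalty, and every weight are assumptions made for illustration, not the authors' reward.

```python
import math


def compound_reward(t_impact, t_desired, dive_angle, dive_desired,
                    max_look_angle, fov_limit, impact_speed,
                    w_time=1.0, w_dive=1.0, w_speed=0.001, fov_penalty=10.0):
    """Hypothetical terminal reward; all terms and weights are illustrative.

    Times in seconds, angles in radians, speed in m/s.
    """
    # Gaussian-shaped terms equal 1 when the error is zero and decay with it.
    r_time = math.exp(-(t_impact - t_desired) ** 2)
    r_dive = math.exp(-(dive_angle - dive_desired) ** 2)
    # A higher impact speed earns more reward (the stated objective).
    r_speed = w_speed * impact_speed
    # Flat penalty if the largest look angle exceeded the FOV limit.
    r_fov = -fov_penalty if abs(max_look_angle) > fov_limit else 0.0
    return w_time * r_time + w_dive * r_dive + r_speed + r_fov


# Example: on-time impact at the desired dive angle, FOV respected.
reward = compound_reward(50.0, 50.0, -1.0, -1.0, 0.3, 0.5, 1000.0)
```

In this sketch the shaping terms are bounded, so the weights alone decide how strongly timing, dive angle, and speed trade off against one another.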
format Article
id doaj-art-ce34a22a5a2c4b69b488f47bf145b177
institution OA Journals
issn 2504-446X
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Drones
spelling doaj-art-ce34a22a5a2c4b69b488f47bf145b177
2025-08-20T02:17:20Z
eng
MDPI AG
Drones
2504-446X
2025-03-01
9(4), 262
10.3390/drones9040262
Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning
Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang
Xi’an Research Institute of High Technology, Xi’an 710025, China (all authors)
https://www.mdpi.com/2504-446X/9/4/262
multiple high-speed UAVs
cooperative guidance
impact-time-control guidance
reinforcement learning
field-of-view constraints
title Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning
topic multiple high-speed UAVs
cooperative guidance
impact-time-control guidance
reinforcement learning
field-of-view constraints
url https://www.mdpi.com/2504-446X/9/4/262