Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning
In this study, an adaptive impact-time-control cooperative guidance law based on deep reinforcement learning considering field-of-view (FOV) constraints is proposed for high-speed UAVs with time-varying velocity. Firstly, a reinforcement learning framework for the high-speed UAVs’ guidance problem i...
| Main Authors: | Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Drones |
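| Subjects: | multiple high-speed UAVs; cooperative guidance; impact-time-control guidance; reinforcement learning; field-of-view constraints |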
| Online Access: | https://www.mdpi.com/2504-446X/9/4/262 |
| _version_ | 1850183471279898624 |
|---|---|
| author | Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang |
| author_facet | Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang |
| author_sort | Zhenyu Liu |
| collection | DOAJ |
| description | In this study, an adaptive impact-time-control cooperative guidance law based on deep reinforcement learning and accounting for field-of-view (FOV) constraints is proposed for high-speed UAVs with time-varying velocity. First, a reinforcement learning framework for the guidance problem of high-speed UAVs is established. The optimization objective is to maximize the impact velocity, and the constraints on impact time, dive attack, and FOV are considered simultaneously. The time-to-go estimation method is improved so that it can be applied to high-speed UAVs with time-varying velocity. Then, to improve the applicability and robustness of the agent, environmental uncertainties, including aerodynamic parameter errors, observation noise, and random target maneuvers, are incorporated into the training process. Furthermore, inspired by the $\mathrm{RL}^2$ algorithm, a recurrent layer is introduced into both the policy and value networks. In this way, the agent can automatically adapt to different mission scenarios by updating the hidden states of the recurrent layer. In addition, a compound reward function is designed to train the agent to satisfy the requirements of impact-time control and dive attack simultaneously. Finally, the effectiveness and robustness of the proposed guidance law are validated through numerical simulations conducted across a wide range of scenarios. |
| format | Article |
| id | doaj-art-ce34a22a5a2c4b69b488f47bf145b177 |
| institution | OA Journals |
| issn | 2504-446X |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Drones |
| spelling | doaj-art-ce34a22a5a2c4b69b488f47bf145b177; 2025-08-20T02:17:20Z; eng; MDPI AG; Drones; 2504-446X; 2025-03-01; volume 9, issue 4, article 262; 10.3390/drones9040262; Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning; Zhenyu Liu, Gang Lei, Yong Xian, Leliang Ren, Shaopeng Li, Daqiao Zhang (all: Xi’an Research Institute of High Technology, Xi’an 710025, China); https://www.mdpi.com/2504-446X/9/4/262; multiple high-speed UAVs; cooperative guidance; impact-time-control guidance; reinforcement learning; field-of-view constraints |
| title | Adaptive Impact-Time-Control Cooperative Guidance Law for UAVs Under Time-Varying Velocity Based on Reinforcement Learning |
| topic | multiple high-speed UAVs; cooperative guidance; impact-time-control guidance; reinforcement learning; field-of-view constraints |
| url | https://www.mdpi.com/2504-446X/9/4/262 |
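
The abstract in this record refers to a time-to-go estimate, an impact-time constraint, and a compound reward, but gives no formulas. As a rough illustration only, the sketch below uses the classical small-angle time-to-go estimate for proportional navigation at constant speed, which is not the improved time-varying-velocity method the article proposes. The function names, the navigation gain default, and the reward weights are all assumptions made here for illustration; the reward is a hypothetical stand-in, not the authors' reward function.

```python
import math


def time_to_go_pn(range_m, speed_mps, lead_angle_rad, nav_gain=3.0):
    """Classical constant-speed time-to-go estimate under proportional navigation.

    t_go ~= (R / V) * (1 + sigma^2 / (2 * (2N - 1))), valid for small lead
    angles sigma. This is NOT the article's improved estimator.
    """
    if speed_mps <= 0.0:
        raise ValueError("speed must be positive")
    if nav_gain <= 0.5:
        raise ValueError("navigation gain must exceed 0.5")
    correction = 1.0 + lead_angle_rad ** 2 / (2.0 * (2.0 * nav_gain - 1.0))
    return range_m / speed_mps * correction


def impact_time_error(elapsed_s, t_go_s, desired_impact_time_s):
    """Difference between the desired and the currently predicted impact time."""
    return desired_impact_time_s - (elapsed_s + t_go_s)


def compound_reward(time_error_s, impact_angle_error_rad, w_time=1.0, w_angle=1.0):
    """Hypothetical shaped reward: penalize both errors with assumed weights."""
    return -(w_time * abs(time_error_s) + w_angle * abs(impact_angle_error_rad))


if __name__ == "__main__":
    # Assumed example values: 10 km range, 500 m/s speed, 0.2 rad lead angle.
    t_go = time_to_go_pn(10_000.0, 500.0, 0.2)
    err = impact_time_error(elapsed_s=5.0, t_go_s=t_go, desired_impact_time_s=30.0)
    print(t_go, err, compound_reward(err, 0.1))
```

In a cooperative setting, each vehicle would typically evaluate the same desired impact time against its own predicted value; how the article coordinates this is not stated in the record.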