Imitation-Reinforcement Learning Penetration Strategy for Hypersonic Vehicle in Gliding Phase

To enhance the penetration capability of hypersonic vehicles in the gliding phase, an intelligent maneuvering penetration strategy combining imitation learning and reinforcement learning is proposed. Firstly, a reinforcement learning penetration model for hypersonic vehicles is established based on...

Bibliographic Details
Main Authors: Lei Xu, Yingzi Guan, Jialun Pu, Changzhu Wei
Format: Article
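Language: English
Published: MDPI AG 2025-05-01
Series: Aerospace
Subjects: hypersonic vehicle; penetration in gliding phase; imitation learning; deep reinforcement learning; truncated horizon
Online Access: https://www.mdpi.com/2226-4310/12/5/438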
_version_ 1849711015507263488
author Lei Xu
Yingzi Guan
Jialun Pu
Changzhu Wei
author_sort Lei Xu
collection DOAJ
description To enhance the penetration capability of hypersonic vehicles in the gliding phase, an intelligent maneuvering penetration strategy combining imitation learning and reinforcement learning is proposed. First, a reinforcement learning penetration model for hypersonic vehicles is formulated as a Markov Decision Process (MDP), with state and action spaces and a composite reward function based on the Zero-Effort Miss (ZEM). Second, to overcome the difficulty of training reinforcement learning models, a truncated horizon method is used to integrate reinforcement learning with imitation learning at the level of the optimization target. The result is a Truncated Horizon Imitation Learning Soft Actor–Critic (THIL-SAC) model for learning intelligent penetration strategies, which enables a smooth transition from imitation to exploration. Finally, reward shaping and expert policies are introduced to improve training. Simulation results show that the THIL-SAC strategy converges faster than the standard SAC method and outperforms the expert strategies. The THIL-SAC strategy also meets the real-time requirements of high-speed penetration scenarios while offering improved adaptability and penetration performance.
format Article
id doaj-art-c5b01d36c16a429dbfbe8fdcd2ae0a01
institution DOAJ
issn 2226-4310
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Aerospace
spelling doaj-art-c5b01d36c16a429dbfbe8fdcd2ae0a01 | 2025-08-20T03:14:43Z | eng | MDPI AG | Aerospace | 2226-4310 | 2025-05-01 | 12 | 5 | 438 | 10.3390/aerospace12050438 | Imitation-Reinforcement Learning Penetration Strategy for Hypersonic Vehicle in Gliding Phase | Lei Xu; Yingzi Guan; Jialun Pu; Changzhu Wei (each: School of Astronautics, Harbin Institute of Technology, No. 92 West Dazhi Street, Harbin 150001, China) | https://www.mdpi.com/2226-4310/12/5/438 | hypersonic vehicle; penetration in gliding phase; imitation learning; deep reinforcement learning; truncated horizon
title Imitation-Reinforcement Learning Penetration Strategy for Hypersonic Vehicle in Gliding Phase
title_sort imitation reinforcement learning penetration strategy for hypersonic vehicle in gliding phase
topic hypersonic vehicle
penetration in gliding phase
imitation learning
deep reinforcement learning
truncated horizon
url https://www.mdpi.com/2226-4310/12/5/438
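
The description above bases the composite reward on the Zero-Effort Miss (ZEM) but this record does not give the formula used in the article. As a rough illustration only, the sketch below computes the textbook ZEM under the assumption that neither vehicle maneuvers further and both fly at constant velocity. The function name `zero_effort_miss` and the sample numbers are hypothetical and are not taken from the article.

```python
import math


def zero_effort_miss(rel_pos, rel_vel):
    """Return (zem_vector, time_to_go) for a constant-velocity engagement.

    rel_pos: interceptor-to-vehicle relative position, metres (assumed frame).
    rel_vel: corresponding relative velocity, metres per second.
    Assumption: no further maneuvering by either side.
    """
    speed_sq = sum(v * v for v in rel_vel)
    if speed_sq == 0.0:
        # No relative motion: the current separation is the miss distance.
        return list(rel_pos), 0.0
    # Time of closest approach; clamped to zero if the range is already opening.
    closing = -sum(r * v for r, v in zip(rel_pos, rel_vel))
    t_go = max(0.0, closing / speed_sq)
    zem = [r + v * t_go for r, v in zip(rel_pos, rel_vel)]
    return zem, t_go


# Hypothetical head-on geometry with a 500 m lateral offset.
zem, t_go = zero_effort_miss([10_000.0, 500.0, 0.0], [-2_000.0, 0.0, 0.0])
miss = math.sqrt(sum(c * c for c in zem))
print(t_go, miss)  # 5.0 500.0
```

From the penetrating vehicle's side, a larger predicted miss is better, so a reward term would plausibly grow with `miss`; how the article weights or shapes that term is not stated in this record.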