Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle
A newly designed tilt-wing unmanned aerial vehicle (tilt-wing UAV) requires a unified control strategy across rotary-wing, fixed-wing, and transition modes, introducing significant challenges. Existing control strategies typically rely on accurate modeling or extensive parameter tuning, which limits...
| Main Authors: | Shiji Jin, Wenjie Zhao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Aerospace |
| Subjects: | tilt-wing UAV, VTOL UAV, mode transition control, offline reinforcement learning |
| Online Access: | https://www.mdpi.com/2226-4310/12/5/435 |
| _version_ | 1850254888811888640 |
|---|---|
| author | Shiji Jin; Wenjie Zhao |
| author_facet | Shiji Jin; Wenjie Zhao |
| author_sort | Shiji Jin |
| collection | DOAJ |
| description | A newly designed tilt-wing unmanned aerial vehicle (tilt-wing UAV) requires a unified control strategy across rotary-wing, fixed-wing, and transition modes, introducing significant challenges. Existing control strategies typically rely on accurate modeling or extensive parameter tuning, which limits their adaptability to dynamically changing flight configurations. Although online reinforcement learning algorithms offer adaptability, they depend on real-world exploration, posing considerable safety and cost risks for safety-critical UAV applications. To address this challenge, we propose Temporal Sequence Constrained Q-learning (TSCQ), an offline RL framework that integrates an encoder–decoder with recurrent networks to capture temporal dependencies. A variational autoencoder further constrains the policy to the offline dataset, which is collected via hardware-in-the-loop simulation, and a sequence-level prediction mechanism is introduced to ensure temporal consistency across action trajectories, thereby mitigating extrapolation error while preserving data fidelity. Experimental results demonstrate that TSCQ significantly outperforms gain scheduling, Model Predictive Control (MPC), and Batch-Constrained Q-learning (BCQ), reducing the RMSE of pitch angle by up to 53.3% and vertical velocity RMSE by approximately 33%. These findings underscore the potential of data-driven, safety-aware offline RL paradigms to enable robust and generalizable control strategies for tilt-wing UAVs. |
| format | Article |
| id | doaj-art-8ecf7924aae540a0b824a3a29be0fa33 |
| institution | OA Journals |
| issn | 2226-4310 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Aerospace |
| spelling | doaj-art-8ecf7924aae540a0b824a3a29be0fa33; 2025-08-20T01:57:00Z; eng; MDPI AG; Aerospace; 2226-4310; 2025-05-01; 12; 5; 435; 10.3390/aerospace12050435; Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle; Shiji Jin (0); Wenjie Zhao (1); (0) School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China; (1) School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China; A newly designed tilt-wing unmanned aerial vehicle (tilt-wing UAV) requires a unified control strategy across rotary-wing, fixed-wing, and transition modes, introducing significant challenges. Existing control strategies typically rely on accurate modeling or extensive parameter tuning, which limits their adaptability to dynamically changing flight configurations. Although online reinforcement learning algorithms offer adaptability, they depend on real-world exploration, posing considerable safety and cost risks for safety-critical UAV applications. To address this challenge, we propose Temporal Sequence Constrained Q-learning (TSCQ), an offline RL framework that integrates an encoder–decoder with recurrent networks to capture temporal dependencies. A variational autoencoder further constrains the policy to the offline dataset, which is collected via hardware-in-the-loop simulation, and a sequence-level prediction mechanism is introduced to ensure temporal consistency across action trajectories, thereby mitigating extrapolation error while preserving data fidelity. Experimental results demonstrate that TSCQ significantly outperforms gain scheduling, Model Predictive Control (MPC), and Batch-Constrained Q-learning (BCQ), reducing the RMSE of pitch angle by up to 53.3% and vertical velocity RMSE by approximately 33%. These findings underscore the potential of data-driven, safety-aware offline RL paradigms to enable robust and generalizable control strategies for tilt-wing UAVs.; https://www.mdpi.com/2226-4310/12/5/435; tilt-wing UAV; VTOL UAV; mode transition control; offline reinforcement learning |
| spellingShingle | Shiji Jin Wenjie Zhao Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle Aerospace tilt-wing UAV VTOL UAV mode transition control offline reinforcement learning |
| title | Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle |
| title_full | Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle |
| title_fullStr | Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle |
| title_full_unstemmed | Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle |
| title_short | Temporal-Sequence Offline Reinforcement Learning for Transition Control of a Novel Tilt-Wing Unmanned Aerial Vehicle |
| title_sort | temporal sequence offline reinforcement learning for transition control of a novel tilt wing unmanned aerial vehicle |
| topic | tilt-wing UAV; VTOL UAV; mode transition control; offline reinforcement learning |
| url | https://www.mdpi.com/2226-4310/12/5/435 |
| work_keys_str_mv | AT shijijin temporalsequenceofflinereinforcementlearningfortransitioncontrolofanoveltiltwingunmannedaerialvehicle AT wenjiezhao temporalsequenceofflinereinforcementlearningfortransitioncontrolofanoveltiltwingunmannedaerialvehicle |