Model-Based Offline Reinforcement Learning for AUV Path-Following Under Unknown Ocean Currents with Limited Data

Bibliographic Details
Main Authors: Xinmao Li, Lingbo Geng, Kaizhou Liu, Yifeng Zhao
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Drones
Online Access: https://www.mdpi.com/2504-446X/9/3/201
Description
Summary: Minimizing experimental data while maintaining good AUV path-following performance is essential to reduce controller design costs and ensure AUV safety, particularly in complex and dynamic underwater environments with unknown ocean currents. To address this, we propose a conservative offline model-based Q-learning (CMQL) algorithm that is robust to unknown disturbances and data-efficient. The CMQL-based controller is trained offline with dynamics and kinematics models constructed from limited AUV motion data and requires no additional fine-tuning for deployment. These models, constructed by improved conditional neural processes, enable accurate long-term motion state predictions within the data distribution. Additionally, the carefully designed state space, action space, reward function, and domain randomization ensure strong generalization and disturbance rejection without extra compensation. Simulation results demonstrate that CMQL achieves effective path-following under unknown ocean currents with a limited dataset of only 1000 data points. The method also achieves zero-shot transfer, demonstrating its generalization ability and its potential for real-world applications.
ISSN: 2504-446X