Leveraging learned monocular depth prediction for pose estimation and mapping on unmanned underwater vehicles

Bibliographic Details
Main Authors: Marco Job, David Botta, Victor Reijgwart, Luca Ebner, Andrej Studer, Roland Siegwart, Eleni Kelasidi
Format: Article
Language:English
Published: Frontiers Media S.A. 2025-06-01
Series:Frontiers in Robotics and AI
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/frobt.2025.1609765/full
collection DOAJ
description This paper presents a general framework that integrates visual and acoustic sensor data to enhance localization and mapping in complex, highly dynamic underwater environments, with a particular focus on fish farming. The pipeline enables net-relative pose estimation for Unmanned Underwater Vehicles (UUVs) and depth prediction within net pens solely from visual data by combining deep learning-based monocular depth prediction with sparse depth priors derived from a classical Fast Fourier Transform (FFT)-based method. We further introduce a method to estimate a UUV’s global pose by fusing these net-relative estimates with acoustic measurements, and demonstrate how the predicted depth images can be integrated into the wavemap mapping framework to generate detailed 3D maps in real-time. Extensive evaluations on datasets collected in industrial-scale fish farms confirm that the presented framework can be used to accurately estimate a UUV’s net-relative and global position in real-time, and provide 3D maps suitable for autonomous navigation and inspection.
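The description states that the pipeline combines deep learning-based monocular depth prediction (which is typically only correct up to scale) with sparse metric depth priors. A minimal sketch of one common way to perform such an alignment, a least-squares scale-and-shift fit of the relative depth map against the sparse priors (the function name, NumPy formulation, and prior format are illustrative assumptions, not the paper's exact FFT-based method):

```python
import numpy as np

def align_depth_to_priors(rel_depth, prior_uv, prior_depth):
    """Fit scale s and shift t so that s * rel_depth + t matches the
    sparse metric depth priors in a least-squares sense, and return
    the resulting metric depth map.

    rel_depth   : (H, W) relative (up-to-scale) depth map
    prior_uv    : (N, 2) integer pixel coordinates (u, v) of the priors
    prior_depth : (N,)   metric depth values at those pixels
    """
    # Sample the relative depth at the prior pixel locations (v = row, u = col).
    d = rel_depth[prior_uv[:, 1], prior_uv[:, 0]]
    # Design matrix [d, 1] for the linear model  s * d + t  ≈  prior_depth.
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, prior_depth, rcond=None)
    return s * rel_depth + t
```

With two or more well-spread priors the fit is determined; in practice a robust variant (e.g. RANSAC over the priors) would guard against outliers in the sparse measurements.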
format Article
id doaj-art-d9fd88c966dc41bc80da9e243b3ddbdf
institution OA Journals
issn 2296-9144
language English
publishDate 2025-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Robotics and AI
spelling doaj-art-d9fd88c966dc41bc80da9e243b3ddbdf
doi 10.3389/frobt.2025.1609765
Author affiliations:
Marco Job: Autonomous Systems Lab, Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland; Department of Mechanical and Industrial Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
David Botta: Autonomous Systems Lab, Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland
Victor Reijgwart: Autonomous Systems Lab, Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland
Luca Ebner: Tethys Robotics, Zurich, Switzerland
Andrej Studer: Tethys Robotics, Zurich, Switzerland
Roland Siegwart: Autonomous Systems Lab, Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland
Eleni Kelasidi: Autonomous Systems Lab, Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland; Department of Mechanical and Industrial Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway; Aquaculture Robotics and Automation Group, SINTEF Ocean, Trondheim, Norway
title Leveraging learned monocular depth prediction for pose estimation and mapping on unmanned underwater vehicles
topic localization
mapping
UUVs
depth prediction
aquaculture
url https://www.frontiersin.org/articles/10.3389/frobt.2025.1609765/full