Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms


Bibliographic Details
Main Authors: Li Wei, Meng Ding, Shuai Li
Format: Article
Language:English
Published: MDPI AG 2025-04-01
Series:Applied Sciences
Subjects: monocular vision, depth estimation, mobile platform, multiscale feature fusion, attention mechanism
Online Access:https://www.mdpi.com/2076-3417/15/8/4267
collection DOAJ
description The depth estimation of forward-looking scenes is one of the fundamental tasks for an Intelligent Mobile Platform to perceive its surrounding environment. In response to this requirement, this paper proposes a self-supervised monocular depth estimation method that can be utilized across various mobile platforms, including unmanned aerial vehicles (UAVs) and autonomous ground vehicles (AGVs). Building on the foundational framework of <i>Monodepth2</i>, we introduce an intermediate module between the encoder and decoder of the depth estimation network to facilitate multiscale fusion of feature maps. Additionally, we integrate the channel attention mechanism ECANet into the depth estimation network to enhance the significance of important channels. Consequently, the proposed method addresses the issue of losing critical features, which can lead to diminished accuracy and robustness. The experiments presented in this paper are conducted on two datasets: <i>KITTI</i>, a publicly available dataset collected from real-world environments used to evaluate depth estimation performance for AGV platforms, and <i>AirSim</i>, a custom dataset generated using simulation software to assess depth estimation performance for UAV platforms. The experimental results demonstrate that the proposed method can overcome the adverse effects of varying working conditions and accurately perceive detailed depth information in specific regions, such as object edges and targets of different scales. Furthermore, the depth predicted by the proposed method is quantitatively compared with the ground truth depth, and a variety of evaluation metrics confirm that our method exhibits superior inference capability and robustness.
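The ECANet component named in the description is the published Efficient Channel Attention design: squeeze each channel to a scalar by global average pooling, model local cross-channel interaction with a small 1D convolution, and gate the channels with a sigmoid. A minimal NumPy sketch of that idea follows; the uniform kernel weights are an illustrative stand-in for learned parameters, and this is not the authors' implementation.

```python
import numpy as np

def eca_attention(feat, k=3):
    """ECA-style channel attention over a feature map of shape (C, H, W).

    Illustrative sketch: the 1D conv kernel below is a uniform stand-in
    for the learned weights used in the actual ECANet module.
    """
    C, H, W = feat.shape
    # Squeeze: global average pooling yields one descriptor per channel.
    desc = feat.mean(axis=(1, 2))                      # shape (C,)
    # Local cross-channel interaction: 1D convolution of size k over channels.
    pad = k // 2
    padded = np.pad(desc, pad, mode="edge")
    kernel = np.full(k, 1.0 / k)                       # stand-in for learned weights
    conv = np.array([np.dot(padded[i:i + k], kernel) for i in range(C)])
    # Excite: a sigmoid gate rescales each channel of the feature map.
    gate = 1.0 / (1.0 + np.exp(-conv))                 # shape (C,)
    return feat * gate[:, None, None]
```

Because the gate is computed per channel from a single pooled scalar, the module adds only k learnable weights, which is why it is attractive for lightweight depth networks.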
id doaj-art-d60bc6aa1a534390a5d585a8729faef6
issn 2076-3417
doi 10.3390/app15084267
citation Applied Sciences, Vol. 15, Issue 8, Article 4267 (2025-04-01)
affiliation Li Wei: School of Information Engineering, Nanhang Jincheng College, Nanjing 211156, China
affiliation Meng Ding: College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
affiliation Shuai Li: College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
topic monocular vision
depth estimation
mobile platform
multiscale feature fusion
attention mechanism