Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms
The depth estimation of forward-looking scenes is one of the fundamental tasks for an Intelligent Mobile Platform to perceive its surrounding environment. In response to this requirement, this paper proposes a self-supervised monocular depth estimation method that can be utilized across various mobile platforms, including unmanned aerial vehicles (UAVs) and autonomous ground vehicles (AGVs).
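The abstract's mention of the ECANet channel attention mechanism can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: real ECA learns its 1D cross-channel convolution kernel during training, whereas the uniform kernel and the `eca_attention` name below are placeholder assumptions for illustration only.

```python
import numpy as np

def eca_attention(x, k=3):
    """Sketch of Efficient Channel Attention (ECA) on a (C, H, W) feature map."""
    # Squeeze: global average pooling over spatial dims -> one descriptor per channel
    y = x.mean(axis=(1, 2))                              # shape (C,)
    # Local cross-channel interaction: 1D "same" convolution over the channel axis.
    # ECA learns this kernel of size k; a uniform averaging kernel stands in here.
    pad = k // 2
    yp = np.pad(y, pad, mode="edge")
    w = np.array([yp[i:i + k].mean() for i in range(x.shape[0])])
    # Excitation: sigmoid gate, then rescale each input channel by its weight
    a = 1.0 / (1.0 + np.exp(-w))
    return x * a[:, None, None]
```

Because the gate is a sigmoid, every channel is scaled by a factor in (0, 1), so the map's shape is preserved while more informative channels are suppressed less.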
| Main Authors: | Li Wei, Meng Ding, Shuai Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Applied Sciences |
| Subjects: | monocular vision; depth estimation; mobile platform; multiscale feature fusion; attention mechanism |
| Online Access: | https://www.mdpi.com/2076-3417/15/8/4267 |
| _version_ | 1849712354207465472 |
|---|---|
| author | Li Wei; Meng Ding; Shuai Li |
| author_facet | Li Wei; Meng Ding; Shuai Li |
| author_sort | Li Wei |
| collection | DOAJ |
| description | The depth estimation of forward-looking scenes is one of the fundamental tasks for an Intelligent Mobile Platform to perceive its surrounding environment. In response to this requirement, this paper proposes a self-supervised monocular depth estimation method that can be utilized across various mobile platforms, including unmanned aerial vehicles (UAVs) and autonomous ground vehicles (AGVs). Building on the foundational framework of <i>Monodepth2</i>, we introduce an intermediate module between the encoder and decoder of the depth estimation network to facilitate multiscale fusion of feature maps. Additionally, we integrate the channel attention mechanism ECANet into the depth estimation network to enhance the significance of important channels. Consequently, the proposed method addresses the issue of losing critical features, which can lead to diminished accuracy and robustness. The experiments presented in this paper are conducted on two datasets: <i>KITTI</i>, a publicly available dataset collected from real-world environments used to evaluate depth estimation performance for AGV platforms, and <i>AirSim</i>, a custom dataset generated using simulation software to assess depth estimation performance for UAV platforms. The experimental results demonstrate that the proposed method can overcome the adverse effects of varying working conditions and accurately perceive detailed depth information in specific regions, such as object edges and targets of different scales. Furthermore, the depth predicted by the proposed method is quantitatively compared with the ground truth depth, and a variety of evaluation metrics confirm that our method exhibits superior inference capability and robustness. |
| format | Article |
| id | doaj-art-d60bc6aa1a534390a5d585a8729faef6 |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | Record doaj-art-d60bc6aa1a534390a5d585a8729faef6 (indexed 2025-08-20T03:14:17Z, eng). MDPI AG; Applied Sciences; ISSN 2076-3417; published 2025-04-01; vol. 15, no. 8, art. 4267; DOI 10.3390/app15084267. Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms. Li Wei (School of Information Engineering, Nanhang Jincheng College, Nanjing 211156, China); Meng Ding (College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China); Shuai Li (College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China). Online access: https://www.mdpi.com/2076-3417/15/8/4267. Keywords: monocular vision; depth estimation; mobile platform; multiscale feature fusion; attention mechanism |
| spellingShingle | Li Wei; Meng Ding; Shuai Li; Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms; Applied Sciences; monocular vision; depth estimation; mobile platform; multiscale feature fusion; attention mechanism |
| title | Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms |
| title_full | Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms |
| title_fullStr | Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms |
| title_full_unstemmed | Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms |
| title_short | Monocular Vision-Based Depth Estimation of Forward-Looking Scenes for Mobile Platforms |
| title_sort | monocular vision based depth estimation of forward looking scenes for mobile platforms |
| topic | monocular vision; depth estimation; mobile platform; multiscale feature fusion; attention mechanism |
| url | https://www.mdpi.com/2076-3417/15/8/4267 |
| work_keys_str_mv | AT liwei monocularvisionbaseddepthestimationofforwardlookingscenesformobileplatforms AT mengding monocularvisionbaseddepthestimationofforwardlookingscenesformobileplatforms AT shuaili monocularvisionbaseddepthestimationofforwardlookingscenesformobileplatforms |
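The abstract also describes an intermediate module between the encoder and decoder that performs multiscale fusion of feature maps. The paper's exact design is not given in this record; as a generic illustration only (the shapes, the `nearest_upsample` helper, and the `fuse_multiscale` name are assumptions), coarser encoder maps can be upsampled to the finest resolution and concatenated along the channel axis:

```python
import numpy as np

def nearest_upsample(x, factor):
    # (C, H, W) -> (C, H*factor, W*factor) by repeating each pixel
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_multiscale(features):
    # features: encoder maps ordered fine -> coarse, each shaped (C_i, H_i, W_i),
    # with spatial size halving at every level. Coarser maps are upsampled to
    # the finest grid and stacked on the channel axis for the decoder.
    finest_h = features[0].shape[1]
    ups = [nearest_upsample(f, finest_h // f.shape[1]) for f in features]
    return np.concatenate(ups, axis=0)
```

A real network would typically follow the concatenation with learned convolutions to mix the channels; this sketch only shows the resolution alignment step.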