Positional Tracking Study of Greenhouse Mobile Robot Based on Improved Monodepth2
This paper presents a self-supervised monocular position tracking model tailored for greenhouse environments. These environments pose unique challenges: mutual crop shading and homogeneous color textures complicate feature extraction, resulting in blurred depth map boundaries and low-precision position estimation.
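The abstract quantifies accuracy as root-mean-square error (RMSE) of position (metres) and attitude (radians) against ground truth. As a minimal sketch of how such a trajectory RMSE is typically computed — the per-frame errors below are made-up illustrative values, not data from the paper:

```python
import math

def rmse(estimates, ground_truth):
    """Root-mean-square error between paired per-frame values
    (e.g. estimated vs. ground-truth position along one axis)."""
    assert len(estimates) == len(ground_truth)
    squared_errors = [(e - g) ** 2 for e, g in zip(estimates, ground_truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy four-frame trajectory, position along one axis in metres
est = [0.0, 1.02, 2.05, 2.98]
gt = [0.0, 1.00, 2.00, 3.00]
print(round(rmse(est, gt), 4))  # → 0.0287
```

The same function applies unchanged to attitude sequences in radians; the paper's reported reductions (e.g. 0.038 m, 0.012 rad) are differences between such RMSE values for two model variants.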
Saved in:
| Main Authors: | Yaheng Cai, Yingli Cao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Pose estimation; encoder-decoder; greenhouse; monocular vision; deep learning; mobile robots |
| Online Access: | https://ieeexplore.ieee.org/document/11029014/ |
| _version_ | 1849473428944322560 |
|---|---|
| author | Yaheng Cai; Yingli Cao |
| collection | DOAJ |
| description | This paper presents a self-supervised monocular position tracking model tailored for greenhouse environments. These environments pose unique challenges: mutual crop shading and homogeneous color textures complicate feature extraction, resulting in blurred depth map boundaries and low-precision position estimation. Building upon the Monodepth2 baseline, the model incorporates three key enhancements: replacing the original backbone with ResNext50 to improve global information acquisition; integrating a hybrid convolution module (HC) into the encoder to expand the receptive field and capture multi-scale contextual features; and introducing a coordinate attention mechanism (CA) in the decoder to enhance discriminative feature extraction. Experiments conducted on a wheeled robot platform in a strawberry greenhouse demonstrate significant improvements: compared to the original backbone, the proposed model reduces position and attitude RMSE by 0.038 m and 0.012 rad, respectively. When compared to a baseline without HC, relative RMSE decreases by 0.048 m and 0.017 rad, while the CA-augmented version achieves RMSE reductions of 0.059 m and 0.034 rad compared to the CA-free variant. These results surpass existing monocular tracking methods, offering a technical foundation for vision system designs in greenhouse mobile robotics. |
| format | Article |
| id | doaj-art-e4a9c7a3cd11450cab5166d2a19fb079 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-e4a9c7a3cd11450cab5166d2a19fb079 (updated 2025-08-20T03:24:08Z). IEEE Access, vol. 13, pp. 106690-106702, 2025. DOI: 10.1109/ACCESS.2025.3578135. IEEE document 11029014. Yaheng Cai (https://orcid.org/0009-0006-5286-1402) and Yingli Cao (https://orcid.org/0000-0002-6655-1302), College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang, China. |
| title | Positional Tracking Study of Greenhouse Mobile Robot Based on Improved Monodepth2 |
| topic | Pose estimation; encoder-decoder; greenhouse; monocular vision; deep learning; mobile robots |
| url | https://ieeexplore.ieee.org/document/11029014/ |