Improving Monocular Depth Estimation Through Knowledge Distillation: Better Visual Quality and Efficiency

This paper introduces a novel knowledge distillation (KD) framework for monocular depth estimation (MDE), incorporating dynamic weight adaptation to address critical challenges. The proposed approach effectively mitigates visual limitations, including blurred object boundaries and discontinuous artifacts, while preserving quantitative performance. Beyond addressing these visual challenges, the innovative KD framework reduces the complexity of depth information, thereby significantly enhancing computational efficiency. To validate the effectiveness of the proposed framework, extensive comparative evaluations were performed using state-of-the-art models, including AdaBins, LocalBins, BinsFormer, PixelFormer, and ZoeDepth. These evaluations were conducted on benchmark datasets, including NYU Depth V2 and SUN RGB-D for indoor environments and KITTI for outdoor scenarios, to ensure a rigorous and comprehensive assessment of robustness and generalization capabilities. The results demonstrate that the proposed KD framework outperforms existing methods in visual quality across all datasets while achieving notable computational benefits, including a 15.45% reduction in Floating Point Operations (FLOPs) for the LocalBins model and 7.72% for the ZoeDepth model.
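
The abstract describes a KD objective with dynamic weight adaptation but, as a record summary, gives no implementation detail. The PyTorch sketch below only illustrates the general idea of blending ground-truth supervision with a teacher signal under a time-varying weight; the scale-invariant log term, the L1 distillation term, the linear weight schedule, and the names silog_loss and distillation_loss are assumptions made for illustration, not the method proposed in the paper.

# Illustrative sketch of a dynamically weighted distillation loss for
# monocular depth estimation. The weighting rule is a hypothetical linear
# schedule, not the adaptation scheme proposed in the paper.
import torch
import torch.nn.functional as F


def silog_loss(pred, target, eps=1e-6, lam=0.85):
    # Scale-invariant log loss, a common supervised objective in MDE.
    d = torch.log(pred.clamp(min=eps)) - torch.log(target.clamp(min=eps))
    return torch.sqrt((d ** 2).mean() - lam * d.mean() ** 2)


def distillation_loss(student_depth, teacher_depth, gt_depth, step, max_steps):
    # Hypothetical dynamic weight: teacher influence decays linearly over training.
    alpha = max(0.0, 1.0 - step / max_steps)
    gt_term = silog_loss(student_depth, gt_depth)       # supervision from ground truth
    kd_term = F.l1_loss(student_depth, teacher_depth)   # guidance from the teacher
    return (1.0 - alpha) * gt_term + alpha * kd_term


# Toy usage with random tensors standing in for student/teacher depth maps.
if __name__ == "__main__":
    student = torch.rand(2, 1, 60, 80) * 10 + 0.1
    teacher = torch.rand(2, 1, 60, 80) * 10 + 0.1
    gt = torch.rand(2, 1, 60, 80) * 10 + 0.1
    print(distillation_loss(student, teacher, gt, step=500, max_steps=2500).item())

Shifting weight from the teacher toward the ground truth as training progresses is one simple way to realize a "dynamic" balance; the paper's actual adaptation rule may differ.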

Bibliographic Details
Main Authors: Chang Yeop Lee, Dong Ju Kim, Young Joo Suh, Do Kyung Hwang
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Deep learning; Image segmentation; Computational complexity; Knowledge transfer
Online Access: https://ieeexplore.ieee.org/document/10818481/
Collection: DOAJ
Record ID: doaj-art-e9695410d9af4c0998801f8fbafcdb3b
Institution: OA Journals
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3523912
Published in: IEEE Access, vol. 13, pp. 2763-2782, 2025
IEEE Document ID: 10818481
Author Affiliations:
Chang Yeop Lee (ORCID: 0009-0003-0607-9901), Institute of Artificial Intelligence, Pohang University of Science and Technology, Pohang-si, South Korea
Dong Ju Kim (ORCID: 0009-0009-6950-4200), Institute of Artificial Intelligence, Pohang University of Science and Technology, Pohang-si, South Korea
Young Joo Suh (ORCID: 0000-0001-7208-1709), Institute of Artificial Intelligence, Pohang University of Science and Technology, Pohang-si, South Korea
Do Kyung Hwang (ORCID: 0000-0003-4271-5672), Korea Institute of Robotics and Technology Convergence, Pohang-si, South Korea