SCA-CVENet: A Spatial-Channel-Attention Collaborative Cost Volume Enhancement Network for High-Quality Depth Reconstruction

Bibliographic Details
Main Authors: Mao Tian, Xudong Zhao, Xiong Lv
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10820522/
Description
Summary: Accurate and complete depth map prediction from a set of overlapping multi-view stereo images has received extensive attention in photogrammetry and computer vision. Despite considerable progress in multi-view depth map reconstruction in recent years, the accuracy and completeness of reconstructed depth maps in difficult-to-match areas still need improvement. To address this issue, we propose SCA-CVENet, an efficient spatial-channel-attention collaborative cost volume enhancement network for high-quality depth map reconstruction. The main contributions of SCA-CVENet are two modules: a channel attention weight (CAW)-based cost volume enhancement module, which filters redundant channel information from the feature volumes to strengthen the feature representation of the cost volumes, and a spatial attention weight (SAW)-based cost volume enhancement module, which significantly improves the robustness of the cost volumes in textureless and repetitive-texture areas. Together, these modules yield more accurate and complete depth map reconstruction results. First, a weight-sharing hierarchical feature extraction subnetwork extracts robust multi-scale feature maps. Second, a SAW generation subnetwork, which incorporates a group-wise correlation (GWC)-based cost volume adaptive fusion module and a 3D hourglass module, produces an accurate SAW. Finally, a coarse-to-fine cascade depth prediction subnetwork incorporates the CAW- and SAW-based cost volume enhancement modules into a cascade network architecture to predict a high-quality depth map. SCA-CVENet was validated on three publicly available benchmark datasets (DTU, BlendedMVS, and Tanks & Temples), and the experimental results show that it achieves superior accuracy and completeness on all three.
ISSN: 2169-3536
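
Illustrative note: the summary mentions a group-wise correlation (GWC)-based cost volume. As a rough sketch of the general GWC idea only (not the authors' implementation, whose details are not given in this record), the feature channels of a reference pixel and a matched source pixel are split into groups, and each group contributes one similarity value, the mean of the channel-wise products. The function name `groupwise_correlation` and the toy feature vectors below are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def groupwise_correlation(
    ref: Sequence[float], src: Sequence[float], groups: int
) -> List[float]:
    """Return one similarity value per channel group.

    `ref` and `src` are the feature vectors of a single pixel in the
    reference view and of its match in a source view for one depth
    hypothesis. This is a generic GWC sketch, not the paper's code.
    """
    if len(ref) != len(src):
        raise ValueError("feature vectors must have the same number of channels")
    if groups <= 0 or len(ref) % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")

    channels_per_group = len(ref) // groups
    similarities = []
    for g in range(groups):
        start = g * channels_per_group
        end = start + channels_per_group
        # Inner product over the channels of this group, averaged by group size.
        dot = sum(r * s for r, s in zip(ref[start:end], src[start:end]))
        similarities.append(dot / channels_per_group)
    return similarities


# Toy example (assumed values): 4 channels split into 2 groups.
ref_feat = [1.0, 2.0, 3.0, 4.0]
src_feat = [1.0, 0.0, 2.0, 2.0]
similarity = groupwise_correlation(ref_feat, src_feat, groups=2)
print(similarity)  # [0.5, 7.0]
```

Repeating this computation for every pixel and every depth hypothesis would give a cost volume with one channel per group, which is much more compact than keeping all feature channels while retaining more information than a single full correlation value.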