HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention

Bibliographic Details
Main Authors: Chenglin Dai, Qingling Chang, Tian Qiu, Xinglin Liu, Yan Cui
Format: Article
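Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Binocular stereo matching; feature map; channel-wise independencies; channel attention; cost volume
Online Access: https://ieeexplore.ieee.org/document/9870784/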
author Chenglin Dai
Qingling Chang
Tian Qiu
Xinglin Liu
Yan Cui
collection DOAJ
description Binocular stereo matching is a computer vision task that typically estimates disparity and depth from a cost volume constructed from the left and right feature maps; it is widely applied in 3D reconstruction, autonomous driving, and robot navigation. Although recent studies show that convolutional neural networks and attention mechanisms can bring substantial progress to this field, it remains difficult to meet the demands of high-precision applications. We find that existing methods tend to ignore the intermediate feature maps at other scales, pay little attention to the relationship between the left and right feature maps, and use only one type of cost volume to train the model. In this article, we focus on solving these three problems. First, we present the Multi-scale Feature Extraction and Fusion Module (MFEFM), which obtains informative feature maps by fusing the feature maps of all scales. Second, we design the Effective Channel Attention Module (ECAM) to better capture and exploit the channel-wise independencies. Finally, we adopt the Hybrid Cost Volume Computation Module (HCVCM) to construct and aggregate the cost volume. With these components, we build an end-to-end stereo matching network named HCVNet. Compared with other state-of-the-art models, it achieves 0.714 EPE on the SceneFlow dataset, reducing the error of PSMNet (1.09 EPE) by 37.6%.
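The EPE figures quoted in the description can be made concrete with a short sketch. It assumes the common definition of end-point error as the mean absolute difference between predicted and ground-truth disparity, in pixels; the record itself does not define the metric, and the function names and sample disparities here are hypothetical. The last lines only redo the arithmetic on the two EPE values given in the record.

```python
def end_point_error(pred, gt):
    """Mean absolute disparity error in pixels (assumed EPE definition)."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the `baseline` error."""
    return (baseline - improved) / baseline


# Hypothetical disparities for three pixels, for illustration only.
sample_epe = end_point_error([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])

# EPE values taken from the record: HCVNet 0.714, PSMNet 1.09.
hcvnet_epe = 0.714
psmnet_epe = 1.09
absolute_gap = psmnet_epe - hcvnet_epe
relative_gap = relative_reduction(psmnet_epe, hcvnet_epe)

print(f"sample EPE: {sample_epe:.3f} px")
print(f"absolute gap: {absolute_gap:.3f} px")
print(f"relative reduction: {relative_gap:.1%}")
```

Under this definition the gap between the two models is 0.376 px in absolute terms and about 34.5% relative to PSMNet, so the 37.6% figure in the description appears to correspond to the absolute difference rather than to a ratio computed this way.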
format Article
id doaj-art-b3c8bd4588174f6db73d45c53cd35d2e
institution OA Journals
issn 2169-3536
language English
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-b3c8bd4588174f6db73d45c53cd35d2e
2025-08-20T01:59:16Z
eng
IEEE
IEEE Access
2169-3536
2022-01-01
vol. 10, pp. 93062-93073
10.1109/ACCESS.2022.3203175
9870784
HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention
Chenglin Dai (0)
Qingling Chang (1)
https://orcid.org/0000-0002-1937-1165
Tian Qiu (2)
Xinglin Liu (3)
https://orcid.org/0000-0001-7503-4211
Yan Cui (4)
Affiliation (0-4): Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China
https://ieeexplore.ieee.org/document/9870784/
Binocular stereo matching
feature map
channel-wise independencies
channel attention
cost volume
title HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention
topic Binocular stereo matching
feature map
channel-wise independencies
channel attention
cost volume
url https://ieeexplore.ieee.org/document/9870784/