HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention
Binocular stereo matching, a computer vision task typically using cost volume constructed from the left and right feature maps to estimate disparity and depth, is widely applied in 3D reconstruction, autonomous driving and robotics navigation. Though recent study brings an awareness of the convoluti...
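
As a rough illustration of the cost volume mentioned above: for each candidate disparity d, the left features at column x are compared with the right features at column x - d. The sketch below builds two common variants on a single toy scanline, a correlation volume and a concatenation volume. Treating a "hybrid" volume as a pairing of these two is an assumption made here, since the abstract does not detail HCVCM; the function names and toy data are hypothetical and not taken from the paper.

```python
def correlation_volume(left, right, max_disp):
    """corr[d][x] = mean over channels of left[c][x] * right[c][x - d]."""
    channels, width = len(left), len(left[0])
    volume = [[0.0] * width for _ in range(max_disp)]
    for d in range(max_disp):
        for x in range(d, width):  # columns with x - d < 0 stay zero
            volume[d][x] = sum(
                left[c][x] * right[c][x - d] for c in range(channels)
            ) / channels
    return volume


def concatenation_volume(left, right, max_disp):
    """concat[d][x] = left features at x followed by right features at x - d."""
    channels, width = len(left), len(left[0])
    volume = [[[0.0] * (2 * channels) for _ in range(width)]
              for _ in range(max_disp)]
    for d in range(max_disp):
        for x in range(d, width):
            volume[d][x] = (
                [left[c][x] for c in range(channels)]
                + [right[c][x - d] for c in range(channels)]
            )
    return volume


# Toy scanline with 2 channels and 6 columns (illustrative data only).
# The left features are the right features shifted by 2 pixels.
right = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 2.0, 0.0, 0.0, 0.0, 0.0]]
left = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 2.0, 0.0, 0.0]]

MAX_DISP = 4
corr = correlation_volume(left, right, MAX_DISP)
concat = concatenation_volume(left, right, MAX_DISP)

# Winner-take-all disparity at column 3: the d with the highest correlation.
best_d = max(range(MAX_DISP), key=lambda d: corr[d][3])
print(best_d)
```

A trained network would aggregate such volumes with 3D convolutions and regress disparity rather than take a hard argmax; the winner-take-all step here only shows that the correlation peaks at the true shift.
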
| Main Authors: | Chenglin Dai, Qingling Chang, Tian Qiu, Xinglin Liu, Yan Cui |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2022-01-01 |
| Series: | IEEE Access |
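| Subjects: | Binocular stereo matching; feature map; channel-wise independencies; channel attention; cost volume |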
| Online Access: | https://ieeexplore.ieee.org/document/9870784/ |
| _version_ | 1850246099251494912 |
|---|---|
| author | Chenglin Dai; Qingling Chang; Tian Qiu; Xinglin Liu; Yan Cui |
| author_facet | Chenglin Dai; Qingling Chang; Tian Qiu; Xinglin Liu; Yan Cui |
| author_sort | Chenglin Dai |
| collection | DOAJ |
| description | Binocular stereo matching, a computer vision task that typically uses a cost volume constructed from the left and right feature maps to estimate disparity and depth, is widely applied in 3D reconstruction, autonomous driving and robot navigation. Although recent studies show that convolutional neural networks and attention algorithms can make great progress in this field, it is still difficult to satisfy the demands of high-precision applications. Existing methods tend to ignore the intermediate feature maps at other scales, pay little attention to the relationship between the left and right feature maps, and use only one type of cost volume to train the model. In this article, we focus on solving these three problems. First, we present the Multi-scale Feature Extraction and Fusion Module (MFEFM), which obtains informative feature maps by fusing the feature maps of all scales. Then we design the Effective Channel Attention Module (ECAM) to better capture and utilize the channel-wise independencies. Finally, we adopt the Hybrid Cost Volume Computation Module (HCVCM) to construct and aggregate the cost volume. With these solutions, we build an end-to-end stereo matching network named HCVNet. Compared with other state-of-the-art models, it achieves 0.714 EPE on the SceneFlow dataset, reducing the error of PSMNet (1.09 EPE) by 37.6%. |
| format | Article |
| id | doaj-art-b3c8bd4588174f6db73d45c53cd35d2e |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2022-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-b3c8bd4588174f6db73d45c53cd35d2e; 2025-08-20T01:59:16Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; vol. 10, pp. 93062-93073; 10.1109/ACCESS.2022.3203175; 9870784; HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention; Chenglin Dai (0); Qingling Chang (1, https://orcid.org/0000-0002-1937-1165); Tian Qiu (2); Xinglin Liu (3, https://orcid.org/0000-0001-7503-4211); Yan Cui (4); all authors: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China; https://ieeexplore.ieee.org/document/9870784/; Binocular stereo matching; feature map; channel-wise independencies; channel attention; cost volume |
| spellingShingle | Chenglin Dai Qingling Chang Tian Qiu Xinglin Liu Yan Cui HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention IEEE Access Binocular stereo matching feature map channel-wise independencies channel attention cost volume |
| title | HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention |
| title_full | HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention |
| title_fullStr | HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention |
| title_full_unstemmed | HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention |
| title_short | HCVNet: Binocular Stereo Matching via Hybrid Cost Volume Computation Module With Attention |
| title_sort | hcvnet binocular stereo matching via hybrid cost volume computation module with attention |
| topic | Binocular stereo matching; feature map; channel-wise independencies; channel attention; cost volume |
| url | https://ieeexplore.ieee.org/document/9870784/ |
| work_keys_str_mv | AT chenglindai hcvnetbinocularstereomatchingviahybridcostvolumecomputationmodulewithattention AT qinglingchang hcvnetbinocularstereomatchingviahybridcostvolumecomputationmodulewithattention AT tianqiu hcvnetbinocularstereomatchingviahybridcostvolumecomputationmodulewithattention AT xinglinliu hcvnetbinocularstereomatchingviahybridcostvolumecomputationmodulewithattention AT yancui hcvnetbinocularstereomatchingviahybridcostvolumecomputationmodulewithattention |
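
The EPE figures in the description can be read as a mean absolute disparity error in pixels. The sketch below shows that metric and a relative-reduction helper under stated assumptions: the validity mask (ground truth strictly between 0 and 192) follows a common SceneFlow evaluation convention rather than anything stated in this record, and the function names and sample values are hypothetical. Computed as (baseline - improved) / baseline, the two reported EPE values give roughly 34.5%, while their absolute gap is 0.376 px; the 37.6% in the description is quoted as published and may rest on a different convention.

```python
def end_point_error(pred, gt, max_disp=192):
    """Mean absolute disparity error over pixels with 0 < gt < max_disp.

    The mask is an assumed convention, not taken from the record.
    """
    errors = [abs(p - g) for p, g in zip(pred, gt) if 0 < g < max_disp]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return sum(errors) / len(errors)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the `baseline` error."""
    return (baseline - improved) / baseline


# Illustrative disparities for four pixels; gt == 0.0 marks an invalid pixel.
pred = [10.5, 20.0, 31.0, 5.0]
gt = [10.0, 21.0, 30.0, 0.0]
epe = end_point_error(pred, gt)

# EPE values as reported in the description: PSMNet 1.09, HCVNet 0.714.
reduction = relative_reduction(1.09, 0.714)
print(f"EPE = {epe:.4f} px, relative reduction = {reduction:.1%}")
```
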