Improved stereo matching network based on dense multi-scale feature guided cost aggregation

Bibliographic Details
Main Authors: ZHANG Bo, ZHANG Meiling, LI Xue, ZHU Lei
Format: Article
Language: Chinese (zho)
Published: Editorial Office of Journal of XPU 2024-02-01
Series: Xi'an Gongcheng Daxue xuebao
Online Access: http://journal.xpu.edu.cn/en/#/digest?ArticleID=1442
Description
Summary: To further improve the disparity prediction accuracy of stereo matching algorithms in ill-posed regions such as repetitive textures, textureless areas, and edges, an improved dense multi-scale feature guided aggregation network (DGNet) based on PSMNet is proposed. First, a dense multi-scale feature extraction module is designed based on the dense atrous spatial pyramid pooling structure. This module extracts region-level features at different scales using atrous convolutions with different dilation rates and fuses them through dense connections, so that the network can capture contextual information. Second, the initial cost volume is obtained by concatenating the left feature maps with their corresponding right feature maps at each disparity level. Then, a dense multi-scale feature guided cost aggregation module is proposed, which adaptively fuses the cost volume with the dense multi-scale features while aggregating the cost volume, so that the subsequent decoding layers can decode more accurate, high-resolution geometry information under the guidance of multi-scale context information. Finally, the globally optimized high-resolution cost volume is fed into the regression module to obtain the disparity map. Experimental results show that the proposed algorithm reduces the mismatch rate on the KITTI 2015 and KITTI 2012 datasets to 1.76% and 1.24%, respectively, and the endpoint error on the SceneFlow dataset to 0.56 px. Compared with existing stereo matching algorithms such as GwcNet and CPOP-Net, the proposed algorithm performs well in ill-posed regions.
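The cost-volume and regression steps named in the summary can be sketched as follows. This is a minimal, dependency-free illustration on a single scanline, not the authors' implementation. It assumes the regression module follows the soft-argmin scheme used by PSMNet, and that the mismatch rate counts pixels whose disparity error exceeds a fixed threshold (3 px here, an assumed default). All function names are hypothetical.

```python
import math


def concat_cost_volume(left, right, max_disp):
    """Build a concatenation cost volume for one scanline.

    left, right: lists of per-pixel feature vectors (W entries of length C).
    Returns volume[d][x] = left[x] + right[x - d], zero-filled where the
    shifted right pixel would fall outside the image (x - d < 0).
    """
    width = len(left)
    channels = len(left[0])
    volume = []
    for d in range(max_disp):
        row = []
        for x in range(width):
            if x >= d:
                row.append(list(left[x]) + list(right[x - d]))
            else:
                row.append([0.0] * (2 * channels))
        volume.append(row)
    return volume


def soft_argmin(costs):
    """Regress a sub-pixel disparity from per-disparity matching costs.

    A softmax over the negated costs gives one probability per disparity
    level; the prediction is the probability-weighted mean of the levels.
    """
    lowest = min(costs)  # shift costs for numerical stability
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


def endpoint_error(pred, gt):
    """Mean absolute disparity error in pixels."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)


def mismatch_rate(pred, gt, threshold=3.0):
    """Fraction of pixels whose error exceeds `threshold` (assumed 3 px)."""
    bad = sum(1 for p, g in zip(pred, gt) if abs(p - g) > threshold)
    return bad / len(gt)
```

In a full network the aggregated volume would hold one scalar cost per disparity level and pixel, and `soft_argmin` would be applied to each pixel's cost vector. The official KITTI benchmarks define their own outlier criteria, so the threshold rule above is only an approximation for reading the reported percentages.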
ISSN: 1674-649X