Improved stereo matching network based on dense multi-scale feature guided cost aggregation

Bibliographic Details
Main Authors: ZHANG Bo, ZHANG Meiling, LI Xue, ZHU Lei
Format: Article
Language: zho
Published: Editorial Office of Journal of XPU 2024-02-01
Series: Xi'an Gongcheng Daxue xuebao
Subjects: binocular vision; stereo matching; dense multi-scale features; adaptive fusion
Online Access: http://journal.xpu.edu.cn/en/#/digest?ArticleID=1442
collection DOAJ
description To improve the disparity prediction accuracy of stereo matching in ill-posed regions such as repeated textures, textureless areas, and edges, an improved dense multi-scale feature guided aggregation network (DGNet) based on PSMNet is proposed. First, a dense multi-scale feature extraction module is designed on the basis of the dense atrous spatial pyramid pooling structure. The module extracts region-level features at different scales using atrous convolutions with different dilation rates and fuses them through dense connections, so that the network captures contextual information. Second, the initial cost volume is built by concatenating the left feature maps with the corresponding right feature maps at each disparity level. Third, a dense multi-scale feature guided cost aggregation module adaptively fuses the cost volume with the dense multi-scale features during aggregation, so that the subsequent decoding layers, guided by multi-scale context, can decode more accurate, high-resolution geometric information. Finally, the globally optimized high-resolution cost volume is fed into the regression module to obtain the disparity map. Experiments show that the proposed algorithm reduces the mismatching rate to 1.76% on KITTI 2015 and 1.24% on KITTI 2012, and the endpoint error to 0.56 px on SceneFlow. Compared with existing stereo matching algorithms such as GWCNet and CPOP-Net, the proposed algorithm performs well in ill-posed regions.
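The cost-volume construction, regression step, and evaluation metrics named in the description can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it works on a single scanline, it assumes the regression module is a PSMNet-style soft-argmin (the record does not say), and it assumes a KITTI 2015-style outlier rule (error above 3 px and above 5% of the true disparity) for the mismatching rate. All function names are hypothetical.

```python
import math


def concat_cost_volume(left, right, max_disp):
    """Concatenation cost volume for one scanline.

    left, right: feature rows shaped [W][C].
    Returns volume[d][x] = left[x] ++ right[x - d] (2C values),
    zero-filled where x - d falls outside the image.
    """
    width, channels = len(left), len(left[0])
    volume = []
    for d in range(max_disp):
        row = []
        for x in range(width):
            if x - d >= 0:
                row.append(list(left[x]) + list(right[x - d]))
            else:
                row.append([0.0] * (2 * channels))
        volume.append(row)
    return volume


def soft_argmin(costs):
    """Expected disparity under softmax(-cost) (assumed regression step)."""
    m = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - m)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


def endpoint_error(pred, truth):
    """Mean absolute disparity error in pixels."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mismatch_rate(pred, truth, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of pixels whose error exceeds both thresholds (assumed rule)."""
    bad = sum(
        1
        for p, t in zip(pred, truth)
        if abs(p - t) > abs_thresh and abs(p - t) > rel_thresh * t
    )
    return bad / len(truth)


# Toy scanline: 3 pixels, 1 feature channel, 2 candidate disparities.
left_row = [[1.0], [2.0], [3.0]]
right_row = [[2.0], [3.0], [4.0]]
volume = concat_cost_volume(left_row, right_row, max_disp=2)
disparity = soft_argmin([5.0, 0.0, 5.0])  # lowest cost at d = 1
```

In the network described above, the aggregated 3D volume would be reduced to one cost per disparity before the regression step; the sketch skips aggregation and feeds `soft_argmin` a hand-written cost vector.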
id doaj-art-9498e1fa2daa48c7945b8096408908b1
institution DOAJ
issn 1674-649X
doi 10.13338/j.issn.1674-649x.2024.01.016
volume 38
issue 1
pages 121-130
affiliation School of Electronics and Information, Xi’an Polytechnic University, Xi’an 710048, China (ZHANG Bo, ZHANG Meiling, LI Xue, ZHU Lei)
topic binocular vision
stereo matching
dense multi-scale features
adaptive fusion