Deep Contextual Structure and Semantic Feature Enhancement Stereo Network
Depth estimation is one of the fundamental tasks of computer vision. Stereo matching is the most critical step in obtaining accurate depth information through stereo vision. At present, thin-structure regions, depth-discontinuity regions, and large textureless regions are still difficult issues...
| Main Authors: | Guowei An, Yaonan Wang, Kai Zeng, Qing Zhu, Xiaofang Yuan, Yang Mo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Stereo matching; deep learning; neural network; feature extraction; aggregation |
| Online Access: | https://ieeexplore.ieee.org/document/10556539/ |
| _version_ | 1850258298414039040 |
|---|---|
| author | Guowei An, Yaonan Wang, Kai Zeng, Qing Zhu, Xiaofang Yuan, Yang Mo |
| author_facet | Guowei An, Yaonan Wang, Kai Zeng, Qing Zhu, Xiaofang Yuan, Yang Mo |
| author_sort | Guowei An |
| collection | DOAJ |
| description | Depth estimation is one of the fundamental tasks of computer vision, and stereo matching is the most critical step in obtaining accurate depth information through stereo vision. Thin-structure regions, depth-discontinuity regions, and large textureless regions remain difficult for stereo matching. To address blur in thin-structure regions and dilation in depth-discontinuity regions, a contextual structure enhancing module is proposed to strengthen the feature extraction network's ability to extract local contextual features. To reduce matching ambiguity in large textureless regions, a semantic feature enhancing module is proposed to strengthen the cost aggregation network's ability to aggregate semantic features. Extensive experimental results show that the proposed stereo network performs well in thin-structure, depth-discontinuity, and large textureless regions and achieves excellent performance on the Scene Flow, KITTI 2012, KITTI 2015, and Middlebury datasets. |
| format | Article |
| id | doaj-art-3a6d81f314444ea68d76d2b41a28dcc2 |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-3a6d81f314444ea68d76d2b41a28dcc2; 2025-08-20T01:56:13Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12; pp. 181205-181216; DOI 10.1109/ACCESS.2024.3413957; 10556539; Deep Contextual Structure and Semantic Feature Enhancement Stereo Network; Guowei An (https://orcid.org/0000-0001-6490-4277), Yaonan Wang (https://orcid.org/0000-0002-0519-6458), Kai Zeng (https://orcid.org/0000-0002-2745-1253), Qing Zhu (https://orcid.org/0000-0001-7785-6374), Xiaofang Yuan (https://orcid.org/0000-0001-7280-7207), Yang Mo (https://orcid.org/0000-0001-6734-8691); all authors: College of Electrical and Information Engineering, Hunan University, Changsha, China; https://ieeexplore.ieee.org/document/10556539/; Stereo matching; deep learning; neural network; feature extraction; aggregation |
| spellingShingle | Guowei An Yaonan Wang Kai Zeng Qing Zhu Xiaofang Yuan Yang Mo Deep Contextual Structure and Semantic Feature Enhancement Stereo Network IEEE Access Stereo matching deep learning neural network feature extraction aggregation |
| title | Deep Contextual Structure and Semantic Feature Enhancement Stereo Network |
| title_full | Deep Contextual Structure and Semantic Feature Enhancement Stereo Network |
| title_fullStr | Deep Contextual Structure and Semantic Feature Enhancement Stereo Network |
| title_full_unstemmed | Deep Contextual Structure and Semantic Feature Enhancement Stereo Network |
| title_short | Deep Contextual Structure and Semantic Feature Enhancement Stereo Network |
| title_sort | deep contextual structure and semantic feature enhancement stereo network |
| topic | Stereo matching; deep learning; neural network; feature extraction; aggregation |
| url | https://ieeexplore.ieee.org/document/10556539/ |
| work_keys_str_mv | AT guoweian deepcontextualstructureandsemanticfeatureenhancementstereonetwork AT yaonanwang deepcontextualstructureandsemanticfeatureenhancementstereonetwork AT kaizeng deepcontextualstructureandsemanticfeatureenhancementstereonetwork AT qingzhu deepcontextualstructureandsemanticfeatureenhancementstereonetwork AT xiaofangyuan deepcontextualstructureandsemanticfeatureenhancementstereonetwork AT yangmo deepcontextualstructureandsemanticfeatureenhancementstereonetwork |
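
The description above says that stereo matching is the step that yields depth, and that large textureless regions cause matching ambiguity. The following is a minimal sketch of those two ideas only; it is not the network proposed in the article. It uses a plain absolute-difference cost on a single scanline, and every name and value in it (`scanline_costs`, `disparity_to_depth`, the toy intensities, `focal_px=720.0`, `baseline_m=0.5`) is a hypothetical choice made for illustration.

```python
import numpy as np


def scanline_costs(left, right, x, max_disp):
    """Absolute-difference cost of matching left[x] against right[x - d], d = 0..max_disp."""
    return np.array([abs(left[x] - right[x - d]) for d in range(max_disp + 1)])


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / disparity


true_disp = 3          # ground-truth shift between the two toy scanlines
x, max_disp = 8, 4     # pixel to match and the largest disparity to test

# Textured scanline: all intensities are distinct, so one candidate matches exactly.
textured_right = np.array([10, 50, 20, 80, 40, 90, 30, 70, 60, 100], dtype=float)
textured_left = np.roll(textured_right, true_disp)  # left[x] == right[x - true_disp]
textured_costs = scanline_costs(textured_left, textured_right, x, max_disp)
best_disp = int(np.argmin(textured_costs))

# Textureless scanline: constant intensity, so every candidate has the same cost.
flat_right = np.full(10, 128.0)
flat_left = np.roll(flat_right, true_disp)
flat_costs = scanline_costs(flat_left, flat_right, x, max_disp)
num_ties = int(np.sum(flat_costs == flat_costs.min()))

# Assumed calibration values (hypothetical): 720 px focal length, 0.5 m baseline.
depth_m = disparity_to_depth(best_disp, focal_px=720.0, baseline_m=0.5)
```

On the textured scanline the cost has a single minimum at the true disparity, which then converts to a depth. On the flat scanline all candidate disparities tie, so a per-pixel cost alone cannot pick one. That tie is the kind of ambiguity the abstract says the semantic feature enhancing module is meant to reduce during cost aggregation.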