Semantic Scene Completion in Autonomous Driving: A Two-Stream Multi-Vehicle Collaboration Approach

Vehicle-to-vehicle communication enables capturing sensor information from diverse perspectives, greatly aiding semantic scene completion in autonomous driving. However, feature misalignment between the ego vehicle and cooperative vehicles leads to ambiguity problems, affecting accuracy and s...

Bibliographic Details
Main Authors: Junxuan Li, Yuanfang Zhang, Jiayi Han, Peng Han, Kaiqing Luo
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Sensors
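Subjects: semantic scene completion; neighborhood attention transformer; multi-vehicle collaborative perception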
Online Access: https://www.mdpi.com/1424-8220/24/23/7702
author Junxuan Li
Yuanfang Zhang
Jiayi Han
Peng Han
Kaiqing Luo
collection DOAJ
description Vehicle-to-vehicle communication enables capturing sensor information from diverse perspectives, greatly aiding semantic scene completion in autonomous driving. However, feature misalignment between the ego vehicle and cooperative vehicles introduces ambiguity, degrading accuracy and semantic information. In this paper, we propose a Two-Stream Multi-Vehicle collaboration approach (TSMV), which divides the features of collaborative vehicles into two streams and regresses them interactively. To overcome the problems caused by feature misalignment, the Neighborhood Self-Cross Attention Transformer (NSCAT) module is designed to enable the ego vehicle to query the most similar local features from collaborative vehicles through cross-attention, rather than assuming spatial-temporal synchronization. A 3D occupancy map is finally generated from the aggregated features of the collaborative vehicles. Experimental results on both the V2VSSC and SemanticOPV2V datasets demonstrate that TSMV outperforms state-of-the-art collaborative semantic scene completion techniques.
format Article
id doaj-art-c080562c346d4796b49274bb0870f660
institution OA Journals
issn 1424-8220
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-c080562c346d4796b49274bb0870f660; 2025-08-20T01:55:41Z; eng; MDPI AG; Sensors; 1424-8220; 2024-12-01; 24; 23; 7702; 10.3390/s24237702
Semantic Scene Completion in Autonomous Driving: A Two-Stream Multi-Vehicle Collaboration Approach
Junxuan Li (0): Guangdong Provincial Engineering Research Center for Optoelectronic Instrument, School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan 528225, China
Yuanfang Zhang (1): School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China
Jiayi Han (2): Inspur Group, Ji’nan 250000, China
Peng Han (3): Guangdong Provincial Engineering Research Center for Optoelectronic Instrument, School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan 528225, China
Kaiqing Luo (4): Guangdong Provincial Engineering Research Center for Optoelectronic Instrument, School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan 528225, China
https://www.mdpi.com/1424-8220/24/23/7702
semantic scene completion; neighborhood attention transformer; multi-vehicle collaborative perception
title Semantic Scene Completion in Autonomous Driving: A Two-Stream Multi-Vehicle Collaboration Approach
topic semantic scene completion
neighborhood attention transformer
multi-vehicle collaborative perception
url https://www.mdpi.com/1424-8220/24/23/7702