GPT-4 enhanced multimodal grounding for autonomous driving: Leveraging cross-modal attention with large language models
In the field of autonomous vehicles (AVs), accurately discerning commander intent and executing linguistic commands within a visual context presents a significant challenge. This paper introduces a sophisticated encoder-decoder framework, developed to address visual grounding in AVs. Our Context-Awa...
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2024-12-01 |
Series: | Communications in Transportation Research |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2772424723000276 |