Vit-Traj: A Spatial–Temporal Coupling Vehicle Trajectory Prediction Model Based on Vision Transformer
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Systems |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2079-8954/13/3/147 |
| Summary: | Accurately predicting the future trajectories of road users around autonomous vehicles is crucial for path planning and collision avoidance. In recent years, data-driven vehicle trajectory prediction has become a significant research focus, and various spatial–temporal neural network models have been proposed. However, some existing spatial–temporal models treat time and space separately, neglecting their inherent coupling. To address this issue, this paper proposes Vit-Traj, an end-to-end spatial–temporal feature fusion model based on the Vision Transformer (ViT) that couples stereoscopic features across diverse spatial regions and time periods. Specifically, Vit-Traj extracts spatial–temporal features through 2D convolution and uses ViT and SENet to complete feature fusion. Experimental results on the NGSIM and HighD datasets indicate that the proposed model outperforms state-of-the-art models: at a prediction horizon of 5 s, the root mean squared error (RMSE) is 2.72 m on NGSIM and 0.86 m on HighD. Furthermore, ablation experiments evaluate the contribution of each module, affirming the efficacy of ViT in modeling spatial–temporal data. |
|---|---|
| ISSN: | 2079-8954 |
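
For readers unfamiliar with the metric quoted in the summary, the following is a minimal sketch of how an RMSE at a single prediction horizon is commonly computed from predicted and ground-truth positions. The function name `rmse_at_horizon` and the sample coordinates are hypothetical and chosen only for illustration; the record does not state the exact evaluation protocol used in the article, so this should not be read as the authors' implementation.

```python
import math


def rmse_at_horizon(predicted, actual):
    """Root mean squared 2-D position error at one prediction time step.

    predicted, actual: lists of (x, y) positions in metres, one pair per vehicle.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared Euclidean distance between each predicted and true position.
    squared_errors = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical positions at a 5 s horizon (illustrative values, not from the article).
predicted_5s = [(3.0, 4.0), (0.0, 0.0)]
actual_5s = [(0.0, 0.0), (0.0, 0.0)]
rmse_5s = rmse_at_horizon(predicted_5s, actual_5s)
```

Under this definition, a reported value such as 2.72 m would mean that the root of the mean squared displacement between predicted and true positions, taken over all evaluated vehicles at that horizon, is 2.72 m.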