Enhanced urban driving scene segmentation using modified UNet with residual convolutions and attention guided skip connections
Abstract Autonomous vehicles heavily rely on precise scene understanding to ensure safe navigation. These vehicles house an array of sophisticated sensors and advanced technologies, like computer vision and artificial intelligence, to navigate complex and unpredictable real-world driving scenarios....
| Main Authors: | Siddhant Arora, Ahaan Banerjee, Nitish Katal |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | Discover Artificial Intelligence |
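| Subjects: | Attention gate; Autonomous driving; Residual learning; Semantic segmentation; UNet; Convolutional neural networks (CNNs) |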
| Online Access: | https://doi.org/10.1007/s44163-025-00455-x |
| _version_ | 1849332375409917952 |
|---|---|
| author | Siddhant Arora, Ahaan Banerjee, Nitish Katal |
| author_sort | Siddhant Arora |
| collection | DOAJ |
| description | Abstract Autonomous vehicles heavily rely on precise scene understanding to ensure safe navigation. These vehicles house an array of sophisticated sensors and advanced technologies, like computer vision and artificial intelligence, to navigate complex and unpredictable real-world driving scenarios. Semantic segmentation is a primary method that enables autonomous vehicles (AVs) to perceive and understand their environment. Because driving scenes are dynamic, with unpredictable movements of other vehicles, pedestrians, cyclists, and animals, these vehicles must observe their environment in real time, which in turn demands highly precise semantic segmentation of the scenes. This work proposes ResAttUNet, an efficient UNet-inspired architecture that modifies the classical UNet by adding an attention mechanism to the skip connections and residual connections to each encoder and decoder block, allowing a deeper model. The work evaluates the combination of residual connections and attention gates for segmentation: the residual connections enable deeper models, and the attention gates in the skip connections let the model prioritize critical information. The evaluation was carried out on the CamVid dataset, where the proposed ResAttUNet outperformed existing models such as FCN, PSPNet, and SegFast-Mobile, with higher accuracy and intersection over union (IoU). ResAttUNet surpasses existing state-of-the-art models, achieving a pixel-level accuracy of 98.78% and a mean IoU of 0.5321. |
| format | Article |
| id | doaj-art-85bf441b16554a868eab5d0c7d766209 |
| institution | Kabale University |
| issn | 2731-0809 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Springer |
| record_format | Article |
| series | Discover Artificial Intelligence |
| spelling | doaj-art-85bf441b16554a868eab5d0c7d766209; 2025-08-20T03:46:12Z; eng; Springer; Discover Artificial Intelligence; 2731-0809; 2025-08-01; 51125; 10.1007/s44163-025-00455-x; Enhanced urban driving scene segmentation using modified UNet with residual convolutions and attention guided skip connections; Siddhant Arora 0; Ahaan Banerjee 1; Nitish Katal 2; School of Computer Science and Engineering, Vellore Institute of Technology; School of Computer Science and Engineering, Vellore Institute of Technology; School of Electronics Engineering, Vellore Institute of Technology; Abstract Autonomous vehicles heavily rely on precise scene understanding to ensure safe navigation. These vehicles house an array of sophisticated sensors and advanced technologies, like computer vision and artificial intelligence, to navigate complex and unpredictable real-world driving scenarios. Semantic segmentation is a primary method that enables autonomous vehicles (AVs) to perceive and understand their environment. Because driving scenes are dynamic, with unpredictable movements of other vehicles, pedestrians, cyclists, and animals, these vehicles must observe their environment in real time, which in turn demands highly precise semantic segmentation of the scenes. This work proposes ResAttUNet, an efficient UNet-inspired architecture that modifies the classical UNet by adding an attention mechanism to the skip connections and residual connections to each encoder and decoder block, allowing a deeper model. The work evaluates the combination of residual connections and attention gates for segmentation: the residual connections enable deeper models, and the attention gates in the skip connections let the model prioritize critical information. The evaluation was carried out on the CamVid dataset, where the proposed ResAttUNet outperformed existing models such as FCN, PSPNet, and SegFast-Mobile, with higher accuracy and intersection over union (IoU). ResAttUNet surpasses existing state-of-the-art models, achieving a pixel-level accuracy of 98.78% and a mean IoU of 0.5321.; https://doi.org/10.1007/s44163-025-00455-x; Attention gate; Autonomous driving; Residual learning; Semantic segmentation; UNet; Convolutional neural networks (CNNs) |
| spellingShingle | Siddhant Arora; Ahaan Banerjee; Nitish Katal; Enhanced urban driving scene segmentation using modified UNet with residual convolutions and attention guided skip connections; Discover Artificial Intelligence; Attention gate; Autonomous driving; Residual learning; Semantic segmentation; UNet; Convolutional neural networks (CNNs) |
| title | Enhanced urban driving scene segmentation using modified UNet with residual convolutions and attention guided skip connections |
| title_sort | enhanced urban driving scene segmentation using modified unet with residual convolutions and attention guided skip connections |
| topic | Attention gate; Autonomous driving; Residual learning; Semantic segmentation; UNet; Convolutional neural networks (CNNs) |
| url | https://doi.org/10.1007/s44163-025-00455-x |
| work_keys_str_mv | AT siddhantarora enhancedurbandrivingscenesegmentationusingmodifiedunetwithresidualconvolutionsandattentionguidedskipconnections AT ahaanbanerjee enhancedurbandrivingscenesegmentationusingmodifiedunetwithresidualconvolutionsandattentionguidedskipconnections AT nitishkatal enhancedurbandrivingscenesegmentationusingmodifiedunetwithresidualconvolutionsandattentionguidedskipconnections |
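
The abstract in this record reports two metrics, pixel-level accuracy and mean intersection over union, and the two can differ widely on the same predictions. The sketch below shows how both are commonly computed from a confusion matrix. It is an illustration only, not the authors' evaluation code: the function names, the three-class toy labels, and the choice to skip classes absent from both ground truth and prediction are all assumptions made here.

```python
# Illustrative sketch (not the paper's code): pixel accuracy and mean IoU
# computed from a confusion matrix over flattened per-pixel class labels.

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of all pixels that were assigned the correct class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Assumption: classes that appear in neither labels nor predictions
    are skipped rather than counted as zero.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Hypothetical toy data: 10 pixels, 3 classes, class 0 dominates.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 2]  # the single class-1 pixel is missed

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("pixel accuracy:", pixel_accuracy(cm))
print("mean IoU:", mean_iou(cm))
```

In this toy case only one pixel in ten is wrong, so pixel accuracy stays high, while the missed rare class contributes an IoU of zero and pulls the mean down. The same effect is one plausible reason a segmentation model can report very high pixel accuracy alongside a much lower mean IoU on a dataset with imbalanced classes.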