A GAN-Based Framework with Dynamic Adaptive Attention for Multi-Class Image Segmentation in Autonomous Driving
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8162 |
| Summary: | Image segmentation is a foundation of autonomous driving systems, enabling vehicles to perceive and navigate their surroundings. It provides essential context for decision-making by partitioning an image into meaningful regions such as roads, vehicles, pedestrians, and traffic signs. Precise segmentation supports safe navigation, collision avoidance, and compliance with traffic rules, all of which are critical for the seamless operation of self-driving cars. Recent deep learning-based segmentation models have demonstrated impressive performance in structured environments, yet they often fall short under the complex and unpredictable conditions encountered in autonomous driving. This study proposes an Adaptive Ensemble Attention (AEA) mechanism within a Generative Adversarial Network (GAN) architecture to handle dynamic and complex driving conditions. The AEA adaptively integrates self-, spatial-, and channel-attention features, adjusting the contribution of each according to the input and its contextual relevance. The discriminator network of the GAN evaluates the segmentation mask produced by the generator, distinguishing real from fake masks by examining a concatenated pair of the original image and its mask. Adversarial training pushes the generator, via feedback from the discriminator, to produce masks that align with the ground truth while remaining realistic, and this exchange of information between generator and discriminator improves segmentation quality. To evaluate the proposed method, average IoU was computed on three widely used datasets, BDD100K, Cityscapes, and KITTI, yielding 89.46%, 89.02%, and 88.13%, respectively. These results underline the model's effectiveness and consistency. Overall, it achieved an accuracy of 98.94% and an AUC of 98.4%, a clear improvement over state-of-the-art (SOTA) models. |
| ISSN: | 2076-3417 |
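
The summary describes the AEA block and the conditional discriminator only at a high level. Below is a minimal PyTorch sketch of how such components might be wired together: three conventional attention branches fused by an input-dependent gate, and a PatchGAN-style discriminator that scores the concatenated (image, mask) pair. All class names, layer sizes, and the gating head are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of an Adaptive Ensemble Attention block and a conditional
# discriminator, assuming a PyTorch implementation (not the authors' code).
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed variant)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # per-channel weights
        return x * w


class SpatialAttention(nn.Module):
    """Spatial attention over pooled channel statistics (assumed variant)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))


class SelfAttention(nn.Module):
    """Non-local self-attention over spatial positions (assumed variant)."""
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)
        self.k = nn.Conv2d(channels, channels // 8, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (B, HW, C//8)
        k = self.k(x).flatten(2)                   # (B, C//8, HW)
        v = self.v(x).flatten(2)                   # (B, C, HW)
        attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + out


class AdaptiveEnsembleAttention(nn.Module):
    """Fuses self-, spatial-, and channel-attention with input-dependent gates."""
    def __init__(self, channels: int):
        super().__init__()
        self.self_attn = SelfAttention(channels)
        self.spatial_attn = SpatialAttention()
        self.channel_attn = ChannelAttention(channels)
        self.gate = nn.Linear(channels, 3)  # one weight per attention branch

    def forward(self, x):
        branches = torch.stack(
            [self.self_attn(x), self.spatial_attn(x), self.channel_attn(x)], dim=1
        )                                                   # (B, 3, C, H, W)
        weights = torch.softmax(self.gate(x.mean(dim=(2, 3))), dim=-1)  # (B, 3)
        return (branches * weights.view(-1, 3, 1, 1, 1)).sum(dim=1)


class MaskDiscriminator(nn.Module):
    """PatchGAN-style discriminator on the concatenated (image, mask) pair."""
    def __init__(self, image_channels: int = 3, mask_channels: int = 1):
        super().__init__()
        c = image_channels + mask_channels
        self.net = nn.Sequential(
            nn.Conv2d(c, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, padding=1),               # real/fake score map
        )

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))
```

In this sketch the gating head turns the abstract's "adaptively changes the amount of each contribution" into a softmax over three learned per-branch weights computed from globally pooled features; the actual paper may use a different fusion rule.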