MAS-YOLOv11: An Improved Underwater Object Detection Algorithm Based on YOLOv11
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Sensors |
| Online Access: | https://www.mdpi.com/1424-8220/25/11/3433 |
| Summary: | To address the challenges of underwater target detection, including complex background interference, light attenuation, severe occlusion and overlap between targets, and wide variation in object scale, we propose MAS-YOLOv11, a model that integrates three key enhancements. First, we introduce the C2PSA_MSDA module, which integrates multi-scale dilated attention (MSDA) into the backbone's C2PSA module, enhancing multi-scale feature representation via dilated convolutions and cross-scale attention. Second, an adaptive spatial feature fusion detection head (ASFFHead) replaces the original head; using learnable spatial weighting parameters, it adaptively fuses features across scales, markedly improving the robustness of multi-scale object detection. Third, we introduce a Slide Loss function with dynamic sample weighting to strengthen hard-sample learning: by mapping loss weights nonlinearly to detection confidence, it improves overall detection accuracy. Experiments show significant gains on the DUO dataset: recall improves by 3.7%, the F1-score by 3%, and mAP@50 and mAP@50-95 reach 77.4% and 55.1%, increases of 3.5% and 3.3% over the baseline. The model also achieves an mAP@50 of 76% on the RUOD dataset, corroborating its cross-domain generalization capability. |
| ISSN: | 1424-8220 |
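The abstract's third contribution, Slide Loss, re-weights training samples nonlinearly by detection confidence so that borderline (hard) samples contribute more to the loss. As a minimal sketch, the piecewise-exponential weighting below follows the original Slide Loss formulation (from YOLO-FaceV2), with `mu` as the confidence threshold; the exact form used in MAS-YOLOv11 may differ:

```python
import math

def slide_weight(x, mu):
    """Slide-style sample weight as a function of confidence/IoU x.

    Easy samples (x well below the threshold mu) keep weight 1.0;
    samples just below mu get a fixed boosted weight exp(1 - mu);
    samples above mu are weighted by a decaying exponential exp(1 - x),
    so borderline cases near mu receive the largest emphasis.
    """
    if x <= mu - 0.1:
        return 1.0
    elif x < mu:
        return math.exp(1.0 - mu)
    else:
        return math.exp(1.0 - x)

# With mu = 0.5, a sample near the threshold (0.45) is weighted most,
# while easy (0.2) and confident (0.9) samples are weighted less.
weights = [round(slide_weight(x, 0.5), 3) for x in (0.2, 0.45, 0.9)]
```

In practice such a weight would multiply a per-sample classification loss term, boosting gradient contribution from hard examples without any extra learned parameters.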