TMBO-AOD: Transparent Mask Background Optimization for Accurate Object Detection in Large-Scale Remote-Sensing Images
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Remote Sensing |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2072-4292/17/10/1762 |
| Summary: | Recent advancements in deep-learning and computer vision technologies, coupled with the availability of large-scale remote-sensing image datasets, have accelerated the progress of remote-sensing object detection. However, large-scale remote-sensing images typically feature extensive and complex backgrounds with small and sparsely distributed objects, which pose significant challenges to detection performance. To address this, we propose a novel framework for accurate object detection, termed transparent mask background optimization for accurate object detection (TMBO-AOD), which incorporates a clear focus module and an adaptive filtering framework. The clear focus module constructs an empirical background pool using a Gaussian distribution and introduces transparent masks to prepare for subsequent optimization stages. The adaptive filtering framework can be applied to anchor-based or anchor-free models. It dynamically adjusts the number of candidates generated based on background flags, thereby optimizing the label assignment process. This approach not only alleviates the imbalance between positive and negative samples but also enhances the efficiency of candidate generation. Furthermore, we introduce a novel separated loss function that strengthens both foreground and background consistencies. Specifically, it focuses the model’s attention on foreground objects while enabling it to learn the consistency of background features, thus improving its ability to distinguish objects from the background. We employ YOLOv8 combined with our proposed optimizations to evaluate our model on multiple datasets, demonstrating improvements in both accuracy and efficiency. Additionally, we validate the effectiveness of our adaptive filtering framework in both anchor-based and anchor-free methods. When implemented with YOLOv5 (anchor-based), the framework reduces the candidate generation time by 48.36%, while the YOLOv8 (anchor-free) implementation achieves a 46.81% reduction, in both cases while maintaining detection accuracy. |
| ISSN: | 2072-4292 |
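
The summary states that the adaptive filtering framework adjusts the number of generated candidates based on background flags, but it does not give the mechanism. The following is only a minimal sketch of one way such filtering could look for an anchor-free grid, not the authors' implementation. The function name `candidate_points`, the per-cell boolean flag layout, and the `stride` parameter are all assumptions made for illustration.

```python
import numpy as np


def candidate_points(background_flags: np.ndarray, stride: int = 8) -> np.ndarray:
    """Hypothetical helper: return an (N, 2) array of (x, y) cell centres,
    in image pixels, for grid cells that are NOT flagged as background.

    background_flags: 2-D boolean array, True where a cell is assumed background.
    stride: assumed size of one grid cell in pixels.
    """
    flags = np.asarray(background_flags, dtype=bool)
    # Row (y) and column (x) indices of the cells that may contain objects.
    ys, xs = np.nonzero(~flags)
    # Convert grid indices to pixel coordinates of each cell centre.
    return np.stack([(xs + 0.5) * stride, (ys + 0.5) * stride], axis=1)


# A 4x4 grid where only two cells are not flagged as background.
flags = np.ones((4, 4), dtype=bool)
flags[1, 2] = False
flags[3, 0] = False
points = candidate_points(flags, stride=8)
```

In this sketch a 4x4 grid yields 2 candidates instead of 16, which illustrates how skipping flagged cells could reduce both candidate generation work and the number of negative samples passed to label assignment.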