Ensemble Strategy With Multi-Step Hard Sample Mining for Improved UXO Localisation and Classification

Bibliographic Details
Main Authors: Marian Craioveanu, Grigore Stamatescu, Dan Popescu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11078286/
Description
Summary: Ensembling multiple models to achieve better overall performance than any individual model is an approach that has yielded promising results across multiple tasks in recent years. This article presents a novel strategy based on iterative fine-tuning on hard-to-detect instances. Specifically, instances from the training dataset that the previous models missed (false negative predictions) are oversampled to obtain expert models for the different types and difficulty levels found in the proprietary real-world UneXploded Ordnance (UXO) dataset involved. Together with the study and integration of robust voting schemes for output distribution, this opens new opportunities for the localisation and classification of UXO. The purpose of the approach is to reduce false negatives, which could lead to dangerous situations for Explosive Ordnance Disposal personnel as well as for the civilian population. Starting from the new, small, state-of-the-art Real-Time Detection Transformer (RT-DETR) detection architecture, with 20 million parameters for each expert model, the proposed ensemble strategy reduces the risks associated with misclassification: the ensemble reaches an mAP of 52.4% and a Recall of 86.2%, surpassing the baseline model by 4.5% mAP and 3.9% Recall. Compared to a reference model with more parameters (76 million), the ensemble differs by 0.1% in mAP and achieves 11.4% higher Recall, while having 21% fewer parameters. Because models designed to save human lives should be accessible and easy to deploy, computational costs are also discussed in terms of frames per second and number of parameters, measured by deploying the models and the ensembles in an Nvidia T4-based online environment.
ISSN: 2169-3536
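
The oversampling step mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data layout, the function names (`iou`, `missed_objects`, `oversample_hard_images`), the IoU threshold of 0.5, and the repetition factor are all assumptions chosen for illustration. The sketch marks a ground-truth object as a false negative when no prediction with the same label overlaps it sufficiently, then repeats the images containing such objects in the list used for the next fine-tuning round.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def missed_objects(ground_truth, predictions, iou_threshold=0.5):
    """Return the ground-truth (label, box) pairs that no prediction matches.

    Simplification (assumption): a ground-truth object counts as detected if
    any prediction has the same label and IoU >= iou_threshold; predictions
    are not matched one-to-one.
    """
    missed = []
    for gt_label, gt_box in ground_truth:
        detected = any(
            p_label == gt_label and iou(gt_box, p_box) >= iou_threshold
            for p_label, p_box in predictions
        )
        if not detected:
            missed.append((gt_label, gt_box))
    return missed


def oversample_hard_images(dataset, predictions_by_image, factor=3,
                           iou_threshold=0.5):
    """Build a training list in which images with false negatives repeat.

    dataset:              {image_id: [(label, box), ...]} ground truth
    predictions_by_image: {image_id: [(label, box), ...]} previous model output
    factor:               how many times a hard image appears (hypothetical)
    """
    training_list = []
    for image_id, ground_truth in dataset.items():
        predictions = predictions_by_image.get(image_id, [])
        is_hard = bool(missed_objects(ground_truth, predictions, iou_threshold))
        training_list.extend([image_id] * (factor if is_hard else 1))
    return training_list
```

In an iterative setting, the returned list would be used to fine-tune the next expert model, whose own misses would feed the following round; how many rounds to run and how to combine the resulting experts by voting are left open here.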