Training marine species object detectors with synthetic images and unsupervised domain adaptation

Bibliographic Details
Main Authors: Heather Doig, Oscar Pizarro, Stefan Williams
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Marine Science
Online Access: https://www.frontiersin.org/articles/10.3389/fmars.2025.1581778/full
Description
Summary: Visual surveys by autonomous underwater vehicles (AUVs) and other underwater platforms provide a valuable method for analysing and understanding the benthic environment. Scientists can measure the presence and abundance of benthic species by manually annotating survey images with online annotation software or other tools. Neural network object detectors can reduce the effort involved in this process by locating and classifying species of interest in the images. However, accurate object detectors often rely on large numbers of annotated training images, which are not currently available for many marine applications. To address this issue, we propose a novel pipeline for generating large amounts of synthetic annotated training data for a species of interest using 3D modelling and rendering software. The detector is trained with synthetic images and annotations along with real unlabelled images to improve performance through domain adaptation. Our method is demonstrated on a sea urchin detector trained only with synthetic data, achieving a performance slightly lower than an equivalent detector trained with manually labelled real images (AP50 of 84.3 vs 92.3). Using realistic synthetic data for species or objects with few or no annotations is a promising approach to reducing the manual effort required to analyse imaging survey data.
ISSN: 2296-7745
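
Note on the AP50 figures in the summary: AP50 is average precision with a detection counted as correct when its box overlaps a ground-truth box with intersection over union (IoU) of at least 0.5. The Python sketch below illustrates that idea for a single image and a single class. It is not the authors' evaluation code: the names `iou` and `ap50` are hypothetical, the greedy matching and all-point interpolation are simplifying assumptions, and the result is in the range 0 to 1 rather than the percentage scale used in the summary.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(detections: List[Tuple[float, Box]],
         ground_truth: List[Box],
         iou_thresh: float = 0.5) -> float:
    """Average precision at a fixed IoU threshold (illustrative, single class).

    `detections` holds (confidence, box) pairs; `ground_truth` holds boxes.
    """
    if not ground_truth:
        return 0.0

    matched = [False] * len(ground_truth)
    hits = []  # 1 for a true positive, 0 for a false positive, by descending score

    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth):
            if matched[j]:
                continue  # each ground-truth box may be claimed only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thresh:
            matched[best_j] = True
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each detection in the ranked list.
    precisions, recalls = [], []
    true_pos = 0
    for rank, hit in enumerate(hits, start=1):
        true_pos += hit
        precisions.append(true_pos / rank)
        recalls.append(true_pos / len(ground_truth))

    # Make precision non-increasing from left to right (the usual envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    area, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        area += (r - prev_recall) * p
        prev_recall = r
    return area
```

In practice, figures like those quoted in the summary are normally aggregated over a whole test set with an established evaluation toolkit; the sketch only makes the IoU 0.5 criterion concrete.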