Automated note annotation after bioacoustic classification: Unsupervised clustering of extracted acoustic features improves detection of a cryptic owl

Bibliographic Details
Main Authors: Callan Alexander, Robert Clemens, Paul Roe, Susan Fuller
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Ecological Informatics
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125002316
Description
Summary: Passive acoustic monitoring and machine learning are increasingly being used to survey threatened species. When automated detection models are applied to large novel datasets, false-positive detections are likely even for high-performing models, and arbitrary thresholds may result in missed detections. Manual validation of outputs is time-consuming, and additional fine-scale annotation of individual notes is impractical for large datasets and difficult to automate when using passive field recordings. This research presents an acoustic monitoring pipeline that employs a multi-stage hybrid approach: initial detection using a convolutional neural network classifier, followed by segmentation and iterative unsupervised clustering of extracted acoustic features using UMAP and HDBSCAN to remove label noise. We applied the pipeline to a large acoustic dataset comprising 2764 h of environmental recordings and tested the utility of the approach on territorial calls of Australia's largest owl: the threatened Powerful Owl (Ninox strenua). The pipeline reduced the large acoustic dataset to 10,116 annotations, of which 9399 (93 %) were correctly annotated individual notes of the target species. The clustering process also eliminated 88 % of false-positive detections while retaining 95 % of true positives (F1 = 0.94). The approach is highly scalable, can be applied to very large acoustic datasets, and can rapidly collect note-level annotations from noisy field recordings. The acoustic features derived from this methodology identified population differences in our test dataset and enable further exploration of song structure, geographic variation, and vocal individuality. The clustering process also facilitates a semi-supervised learning approach, allowing rapid selection of uncertain examples for model improvement.
The pipeline helps to address two key challenges in bioacoustic monitoring: the need for manual validation of automated detections and the difficulty of obtaining accurate note-level annotations in noisy field recordings. Adaptation of these methods to other species and vocalisations may facilitate improved detection and investigation of vocal characteristics across different populations or regions.
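The noise-removal idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the article names UMAP and HDBSCAN, whereas this example swaps in a plain DBSCAN written with the standard library so that it runs without extra packages. The two-dimensional points stand in for an already-reduced feature embedding, and the data, `eps`, `min_pts`, and all function and variable names are hypothetical. The last line only checks the annotation counts quoted in the summary.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, so keep expanding
    return labels


# Hypothetical, standardised 2-D features for 40 detected segments:
# 36 tightly grouped "target notes" and 4 isolated false positives.
notes = [(0.05 * i, 0.05 * j) for i in range(6) for j in range(6)]
false_positives = [(2.0, 2.0), (-1.5, 3.0), (3.0, -2.0), (-2.5, -2.5)]
segments = notes + false_positives

labels = dbscan(segments, eps=0.2, min_pts=5)

# Keep only segments assigned to a cluster; noise-labelled segments are dropped.
kept = [p for p, lab in zip(segments, labels) if lab != -1]

# Share of final annotations that were correct, using the counts in the summary.
reported_precision = 9399 / 10116
```

In this toy case the 36 grouped points form a single cluster and the four isolated points are labelled as noise and discarded, while 9399 / 10116 rounds to the 93 % quoted in the summary. Real recordings would need feature extraction, dimensionality reduction, and parameter tuning that this sketch does not attempt.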
ISSN: 1574-9541