DuSAFNet: A Multi-Path Feature Fusion and Spectral–Temporal Attention-Based Model for Bird Audio Classification

Bibliographic Details
Main Authors:	Zhengyang Lu, Huan Li, Min Liu, Yibin Lin, Yao Qin, Xuanyu Wu, Nanbo Xu, Haibo Pu
Format:	Article
Language:	English
Published:	MDPI AG 2025-07-01
Series:	Animals
Subjects:	bird audio classification; spectral–temporal attention; multi-path feature fusion; ArcMarginProduct; passive acoustic monitoring; real-time conservation
Online Access:	https://www.mdpi.com/2076-2615/15/15/2228

Description
Summary:	This research presents DuSAFNet, a lightweight deep neural network for fine-grained bird audio classification. DuSAFNet combines dual-path feature fusion, spectral–temporal attention, and a multi-band ArcMarginProduct classifier to enhance inter-class separability and capture both local and global spectro–temporal cues. Unlike single-feature approaches, DuSAFNet captures both local spectral textures and long-range temporal dependencies in Mel-spectrogram inputs and explicitly enhances inter-class separability across low, mid, and high frequency bands. On a curated dataset of 17,653 three-second recordings spanning 18 species, DuSAFNet achieves 96.88% accuracy and a 96.83% F1 score using only 6.77 M parameters and 2.275 GFLOPs. Cross-dataset evaluation on Birdsdata yields 93.74% accuracy, demonstrating robust generalization to new recording conditions. Its lightweight design and high performance make DuSAFNet well-suited for edge-device deployment and real-time alerts for rare or threatened species. This work lays the foundation for scalable, automated acoustic monitoring to inform biodiversity assessments and conservation planning.
ISSN:	2076-2615
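
Note on the ArcMarginProduct classifier: the summary names an ArcMarginProduct classifier as the component that improves inter-class separability. The following is a minimal NumPy sketch of the general additive angular margin idea behind that kind of layer, not the authors' implementation. The function name arc_margin_logits, the scale and margin defaults, and the single-head layout are assumptions made for illustration; the record does not describe how the multi-band variant is arranged or which hyperparameters the paper uses.

```python
import numpy as np


def arc_margin_logits(features, weights, labels, scale=30.0, margin=0.5):
    """Additive angular margin logits (illustrative sketch only).

    features: (batch, dim) embeddings
    weights:  (classes, dim) class prototype vectors
    labels:   (batch,) integer class indices
    scale, margin: hypothetical values, not taken from the paper
    """
    # L2-normalize so the dot product equals the cosine of the angle
    # between each embedding and each class prototype.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cosine = np.clip(f @ w.T, -1.0, 1.0)

    # Add the margin to the angle of the true class only, which lowers
    # that logit and forces training to pull same-class embeddings closer.
    rows = np.arange(len(labels))
    theta_true = np.arccos(cosine[rows, labels])
    logits = cosine.copy()
    logits[rows, labels] = np.cos(theta_true + margin)

    # This sketch omits the fallback that practical implementations
    # often apply when theta + margin exceeds pi.
    return scale * logits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(4, 8))      # 4 clips, 8-dimensional embeddings
    protos = rng.normal(size=(18, 8))  # 18 classes, matching the species count in the summary
    y = np.array([0, 5, 9, 17])
    print(arc_margin_logits(emb, protos, y).shape)
```

The returned logits would normally be passed to a softmax cross-entropy loss during training. At inference time the margin is usually dropped and the plain scaled cosine is used.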


Similar Items