DuSAFNet: A Multi-Path Feature Fusion and Spectral–Temporal Attention-Based Model for Bird Audio Classification
This research presents DuSAFNet, a lightweight deep neural network for fine-grained bird audio classification. DuSAFNet combines dual-path feature fusion, spectral–temporal attention, and a multi-band ArcMarginProduct classifier to enhance inter-class separability and capture both local and global s...
Saved in:
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
MDPI AG
2025-07-01
|
| Series: | Animals |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-2615/15/15/2228 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | This research presents DuSAFNet, a lightweight deep neural network for fine-grained bird audio classification. DuSAFNet combines dual-path feature fusion, spectral–temporal attention, and a multi-band ArcMarginProduct classifier to enhance inter-class separability and capture both local and global spectro–temporal cues. Unlike single-feature approaches, DuSAFNet captures both local spectral textures and long-range temporal dependencies in Mel-spectrogram inputs and explicitly enhances inter-class separability across low, mid, and high frequency bands. On a curated dataset of 17,653 three-second recordings spanning 18 species, DuSAFNet achieves 96.88% accuracy and a 96.83% F1 score using only 6.77 M parameters and 2.275 GFLOPs. Cross-dataset evaluation on Birdsdata yields 93.74% accuracy, demonstrating robust generalization to new recording conditions. Its lightweight design and high performance make DuSAFNet well-suited for edge-device deployment and real-time alerts for rare or threatened species. This work lays the foundation for scalable, automated acoustic monitoring to inform biodiversity assessments and conservation planning. |
|---|---|
| ISSN: | 2076-2615 |